LLM Cost Calculator
Compare API costs across GPT-4, Claude, DeepSeek, Gemini, and more. Find the most cost-effective AI model for your use case.
Usage Parameters
Configure your expected token usage to compare costs
Slider ranges (min to max): 1K to 1M, 100 to 500K, 1 to 10K

Cheapest Option

Gemini 2.5 Flash

$67.50/month

Premium Model

Claude Opus 4.5

$15,750/month

Gemini 2.5 Flash

Google

Cheapest
Ultra Fast

Speed optimized, huge context

Input: $0.075/M tokens
Output: $0.3/M tokens
Context: 1000K tokens
Per Request: $0.022
Daily: $2.25
Monthly: $67.50
Yearly: $821
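The per-request, daily, monthly, and yearly figures on this card follow from simple arithmetic. The sketch below is one way to reproduce them in Python. The usage values (100K input tokens and 50K output tokens per request, 100 requests per day) are not stated on the page; they are inferred because they match the card's numbers. The 30-day month, the 365-day year, and the function name `request_cost` are likewise assumptions made for illustration.

```python
def request_cost(input_price_per_m, output_price_per_m, input_tokens, output_tokens):
    """Cost in dollars of one request, with prices quoted per million tokens."""
    total = input_tokens * input_price_per_m + output_tokens * output_price_per_m
    return total / 1_000_000


# Assumed usage, inferred from the card's figures rather than stated on the page.
INPUT_TOKENS = 100_000
OUTPUT_TOKENS = 50_000
REQUESTS_PER_DAY = 100

# Gemini 2.5 Flash prices as listed on the card: $0.075 input, $0.30 output per M tokens.
per_request = request_cost(0.075, 0.30, INPUT_TOKENS, OUTPUT_TOKENS)
daily = per_request * REQUESTS_PER_DAY
monthly = daily * 30   # assumes a 30-day month
yearly = daily * 365   # assumes a 365-day year

print(f"Per request: ${per_request:.4f}")
print(f"Daily:       ${daily:.2f}")
print(f"Monthly:     ${monthly:.2f}")
print(f"Yearly:      ${yearly:.2f}")
```

Under these assumptions the daily and monthly results match the card ($2.25 and $67.50), and the yearly result is about $821.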
GPT-5 Nano

OpenAI

Cheapest

Ultra-low cost for simple tasks

Input: $0.05/M tokens
Output: $0.4/M tokens
Cached Input: $0.005/M tokens
Context: 64K tokens
Per Request: $0.025
Daily: $2.50
Monthly: $75.00
Yearly: $913
DeepSeek V3.2

DeepSeek

Best Value

#1 open-source, exceptional coding

Input: $0.28/M tokens
Output: $0.42/M tokens
Cached Input: $0.028/M tokens
Context: 64K tokens
Per Request: $0.049
Daily: $4.90
Monthly: $147
Yearly: $1,789
Mistral Medium 3

Mistral

Balanced

Great speed/cost balance

Input: $0.4/M tokens
Output: $1.2/M tokens
Context: 128K tokens
Per Request: $0.100
Daily: $10.00
Monthly: $300
Yearly: $3,650
GPT-5 Mini

OpenAI

Budget

Affordable & fast for everyday tasks

Input: $0.25/M tokens
Output: $2/M tokens
Cached Input: $0.025/M tokens
Context: 128K tokens
Per Request: $0.125
Daily: $12.50
Monthly: $375
Yearly: $4,563
DeepSeek-R1

DeepSeek

Reasoning

Reasoning model rivaling o1

Input: $0.55/M tokens
Output: $2.19/M tokens
Cached Input: $0.14/M tokens
Context: 64K tokens
Per Request: $0.165
Daily: $16.45
Monthly: $494
Yearly: $6,004
Claude Haiku 4

Anthropic

Fast

Fastest Claude for quick responses

Input: $0.8/M tokens
Output: $4/M tokens
Context: 200K tokens
Per Request: $0.280
Daily: $28.00
Monthly: $840
Yearly: $10,220
Mistral Large 3

Mistral

Flagship

675B params, top-tier performance

Input: $2/M tokens
Output: $6/M tokens
Context: 128K tokens
Per Request: $0.500
Daily: $50.00
Monthly: $1,500
Yearly: $18,250
GPT-5

OpenAI

Flagship

Latest flagship model, 400K context

Input: $1.25/M tokens
Output: $10/M tokens
Cached Input: $0.125/M tokens
Context: 400K tokens
Per Request: $0.625
Daily: $62.50
Monthly: $1,875
Yearly: $22,813
Gemini 2.5 Pro

Google

1M Context

Best for complex reasoning & code

Input: $1.25/M tokens
Output: $10/M tokens
Context: 1000K tokens
Per Request: $0.625
Daily: $62.50
Monthly: $1,875
Yearly: $22,813
GPT-4o

OpenAI

Previous gen multimodal model

Input: $2.5/M tokens
Output: $10/M tokens
Cached Input: $1.25/M tokens
Context: 128K tokens
Per Request: $0.750
Daily: $75.00
Monthly: $2,250
Yearly: $27,375
Gemini 3 Pro (Preview)

Google

Preview

Latest multimodal, 2M context

Input: $2/M tokens
Output: $12/M tokens
Context: 2000K tokens
Per Request: $0.800
Daily: $80.00
Monthly: $2,400
Yearly: $29,200
Claude Sonnet 4.5

Anthropic

Popular

Best for coding and agentic tasks

Input: $3/M tokens
Output: $15/M tokens
Context: 200K tokens
Per Request: $1.05
Daily: $105
Monthly: $3,150
Yearly: $38,325
Claude Opus 4.5

Anthropic

Most Capable

Most intelligent Claude, infinite chats

Input: $15/M tokens
Output: $75/M tokens
Context: 200K tokens
Per Request: $5.25
Daily: $525
Monthly: $15,750
Yearly: $191,625
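
To compare models side by side, each card's input and output prices can be run through the same arithmetic and the results sorted by monthly cost. The sketch below does this for five of the models listed above. It assumes 100K input tokens and 50K output tokens per request, 100 requests per day, and a 30-day month; these values are inferred from the displayed totals and are not stated on the page. The names `PRICES` and `monthly_cost` are illustrative.

```python
# (input $/M tokens, output $/M tokens), copied from the cards above.
PRICES = {
    "Claude Opus 4.5": (15.0, 75.0),
    "GPT-5": (1.25, 10.0),
    "Gemini 2.5 Flash": (0.075, 0.30),
    "Claude Sonnet 4.5": (3.0, 15.0),
    "DeepSeek V3.2": (0.28, 0.42),
}


def monthly_cost(prices, input_tokens=100_000, output_tokens=50_000,
                 requests_per_day=100, days=30):
    """Monthly cost in dollars; the default usage values are assumptions."""
    in_price, out_price = prices
    per_request = (input_tokens * in_price + output_tokens * out_price) / 1_000_000
    return per_request * requests_per_day * days


# Sort model names from cheapest to most expensive under the assumed usage.
ranked = sorted(PRICES, key=lambda name: monthly_cost(PRICES[name]))

for name in ranked:
    print(f"{name}: ${monthly_cost(PRICES[name]):,.2f}/month")
```

Changing the default arguments shows how the ranking shifts with usage: a workload that is heavy on input tokens and light on output tokens favors models with low input prices, and vice versa.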
About LLM Pricing

💡 Tips to Reduce Costs

  • Use prompt caching for repeated contexts
  • Choose smaller models for simple tasks
  • Optimize prompts to reduce token count
  • Batch requests when possible
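
As a rough illustration of the prompt caching tip, the sketch below estimates per-request cost when part of the input is billed at the cached rate. It uses the GPT-5 prices listed on this page ($1.25 input, $0.125 cached input, $10 output per M tokens). The `cached_fraction` parameter, the function name, and the 100K/50K token counts are assumptions for illustration; real providers apply their own caching rules (for example minimum prefix lengths or cache-write charges), which this simplified model ignores.

```python
def request_cost_with_cache(input_tokens, output_tokens,
                            input_price, cached_price, output_price,
                            cached_fraction):
    """Dollar cost of one request when a fraction of input tokens hits the cache.

    Prices are per million tokens. cached_fraction is a hypothetical knob
    between 0.0 (no cache hits) and 1.0 (all input tokens cached).
    """
    cached = input_tokens * cached_fraction
    fresh = input_tokens - cached
    total = fresh * input_price + cached * cached_price + output_tokens * output_price
    return total / 1_000_000


# Assumed workload: 100K input tokens and 50K output tokens per request.
no_cache = request_cost_with_cache(100_000, 50_000, 1.25, 0.125, 10.0, cached_fraction=0.0)
with_cache = request_cost_with_cache(100_000, 50_000, 1.25, 0.125, 10.0, cached_fraction=0.8)

savings = 1 - with_cache / no_cache
print(f"No cache:   ${no_cache:.3f} per request")
print(f"80% cached: ${with_cache:.3f} per request")
print(f"Savings:    {savings:.1%}")
```

In this example the output tokens account for most of the cost, so caching trims the total only modestly. Workloads with long repeated prompts and short outputs would see a larger relative saving.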

📊 Understanding Tokens

  • ~4 characters ≈ 1 token (English)
  • ~750 words ≈ 1,000 tokens
  • Output tokens usually cost more than input tokens
  • The context window limits the total tokens (input plus output) per request
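
The two rules of thumb above can be turned into a quick estimator. This is only a sketch: the function names are illustrative, the ratios apply to English text, and actual counts depend on each provider's tokenizer, so the results are approximations rather than billing-accurate numbers.

```python
def estimate_tokens_from_chars(text):
    """Estimate tokens using the ~4 characters per token heuristic."""
    return round(len(text) / 4)


def estimate_tokens_from_words(text):
    """Estimate tokens using the ~750 words per 1,000 tokens heuristic."""
    word_count = len(text.split())
    return round(word_count * 1000 / 750)


sample = "Compare API costs across providers before committing to a model."
print("By characters:", estimate_tokens_from_chars(sample))
print("By words:     ", estimate_tokens_from_words(sample))
```

The two heuristics will usually disagree slightly on the same text; for budgeting, taking the larger of the two gives a more conservative estimate.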

⚠️ Disclaimer: Prices are updated periodically and may not reflect the latest rates. Always check the official provider documentation for current pricing.