Free • No sign-up • 12 models • Caching included

How much does that prompt actually cost?

Compare per-request and at-scale costs across every major AI model in 2026 — including the prompt-caching discounts most calculators ignore. Math runs in your browser.

Your prompt
0 input tokens · 260 output tokens
Volume & caching
Requests per day: 1,000 / day

Monthly: 30,000 requests

Prompt-caching hit rate: 0%

How often your input prefix is reused (RAG context, system prompts, agent loops). Only applies to models with caching support.

Cost across 12 models

Verified May 28, 2026. At 1,000 requests/day, Gemini 3.5 Flash costs $2.340/mo vs $585.00/mo for Claude Opus 4.7 — that's a 250× difference.

| Model | Provider · context | Notes | Tier | Per request | 1,000 / day, monthly | Cache saves |
|---|---|---|---|---|---|---|
| Gemini 3.5 Flash | Google · 1M ctx | Cache: 75% off | Budget / Fast | <$0.001 ($0 in + <$0.001 out) | $2.340 | set cache rate |
| GPT-5 mini | OpenAI · 200k ctx | Cache: 50% off | Budget / Fast | <$0.001 ($0 in + <$0.001 out) | $4.680 | set cache rate |
| Llama 3.3 70B (Groq) | Groq · 128k ctx | Fastest inference; no caching tier | Budget / Fast | <$0.001 ($0 in + <$0.001 out) | $6.162 | no cache tier |
| DeepSeek V3 | DeepSeek · 128k ctx | Off-peak cached rate; standard cache is $0.10 | Budget / Fast | <$0.001 ($0 in + <$0.001 out) | $8.580 | set cache rate |
| Claude Haiku 4.5 | Anthropic · 200k ctx | Cache: 90% off | Budget / Fast | $0.0010 ($0 in + $0.0010 out) | $31.200 | set cache rate |
| Gemini 3.5 Pro | Google · 2.0M ctx | Cache: 75% off implicit, 4-hour minimum | Frontier | $0.0013 ($0 in + $0.0013 out) | $39.000 | set cache rate |
| Mistral Large 2 | Mistral · 128k ctx | EU-hosted, no caching tier | Mid-tier | $0.0016 ($0 in + $0.0016 out) | $46.800 | no cache tier |
| GPT-4o | OpenAI · 128k ctx | Cache: 50% off | Mid-tier | $0.0026 ($0 in + $0.0026 out) | $78.000 | set cache rate |
| GPT-5 | OpenAI · 400k ctx | Cache: 50% off after first 1024-token prefix is reused | Frontier | $0.0039 ($0 in + $0.0039 out) | $117.00 | set cache rate |
| Grok 4 | xAI · 256k ctx | No caching tier as of May 2026 | Frontier | $0.0039 ($0 in + $0.0039 out) | $117.00 | no cache tier |
| Claude Sonnet 4.6 | Anthropic · 200k ctx | Cache: 90% off, write costs 25% extra on first call | Mid-tier | $0.0039 ($0 in + $0.0039 out) | $117.00 | set cache rate |
| Claude Opus 4.7 | Anthropic · 500k ctx | Cache: 90% off, write costs 25% extra on first call | Frontier | $0.0195 ($0 in + $0.0195 out) | $585.00 | set cache rate |

Estimates based on published API pricing as of May 28, 2026. Tokens approximated client-side (~80% accurate). Actual bills depend on exact tokenization, request volume discounts, and provider-specific rules. Re-check pricing with the provider before signing budgets.

Honest math, not vendor sales pitches

01

Token estimation

We count characters and divide by 4, a heuristic that is roughly 80% accurate. For exact numbers, paste actual token counts from a tokenizer such as tiktoken (for example, with its cl100k_base encoding).
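As a rough illustration of the characters-divided-by-4 heuristic, here is a minimal Python sketch. The function name `estimate_tokens` and the choice to round up are assumptions made for this example; the calculator's own rounding rule is not stated.

```python
import math


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Approximate a token count as character count / 4.

    Rounding up is an assumption of this sketch, not a documented
    behavior of the calculator.
    """
    if not text:
        return 0
    return math.ceil(len(text) / chars_per_token)


# A 40-character prompt is estimated at 10 tokens.
print(estimate_tokens("How much does that prompt actually cost?"))
```

The estimate ignores language, whitespace, and code-heavy text, all of which can shift real token counts, so treat it as a budgeting aid rather than a billing figure.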

02

Caching discount

Anthropic offers 90% off cached input. OpenAI offers 50%. Google offers 75%. Slide the cache-hit rate up to see what your real bill looks like for repeat-context workflows.
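One way to picture how a cache-hit rate could change input cost is the sketch below. It assumes the discount applies uniformly to the cached share of input tokens and ignores cache-write surcharges; the function name and the $3.00 per million input price are hypothetical placeholders, not figures from this page.

```python
# Cached-input discounts quoted above, as fractions off the input price.
CACHE_DISCOUNT = {"Anthropic": 0.90, "OpenAI": 0.50, "Google": 0.75}


def blended_input_cost(
    input_tokens: int,
    price_per_million_usd: float,
    hit_rate: float,
    discount: float,
) -> float:
    """Input cost for one request when `hit_rate` of requests hit the cache.

    Assumption: on a hit, all input tokens are billed at (1 - discount)
    of the normal price; on a miss, at full price.
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    full_price = input_tokens / 1_000_000 * price_per_million_usd
    return full_price * (1.0 - hit_rate * discount)


# Hypothetical example: 10,000 input tokens at an assumed $3.00 per million.
for provider, discount in CACHE_DISCOUNT.items():
    cost = blended_input_cost(10_000, 3.00, hit_rate=0.8, discount=discount)
    print(f"{provider}: ${cost:.4f} per request")
```

At a 0% hit rate the function returns the undiscounted price, which matches the default state of the calculator.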

03

At-scale math

The slider scales from 1 request to 1M requests per day. We multiply per-request cost by requests per day and then by 30 days for a monthly view, the number you actually need to pitch to finance.
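The monthly arithmetic can be checked by hand. This short Python sketch uses the per-request and monthly figures shown in the table above; the function name `monthly_cost` is an assumption of the example.

```python
DAYS_PER_MONTH = 30  # the page's fixed 30-day month


def monthly_cost(per_request_usd: float, requests_per_day: int) -> float:
    """Monthly cost = per-request cost x requests per day x 30 days."""
    return per_request_usd * requests_per_day * DAYS_PER_MONTH


# Claude Opus 4.7 row: $0.0195 per request at 1,000 requests/day.
opus_monthly = monthly_cost(0.0195, 1_000)
print(f"Opus monthly: ${opus_monthly:,.2f}")

# Ratio of the two monthly figures quoted in the summary line.
print(f"Opus vs Flash: {585.00 / 2.340:.0f}x")
```

With these inputs the result lines up with the table's $585.00 monthly figure and the 250× gap quoted in the summary.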