Free • No sign-up • 12 models • Caching included

How much does that prompt actually cost?

Compare per-request and at-scale costs across every major AI model in 2026 — including the prompt-caching discounts most calculators ignore. Math runs in your browser.

Your prompt

Expected output length

words

0 input tokens·260 output tokens

Volume & caching

Requests per day1,000 / day

Monthly: 30,000 requests

Prompt-caching hit rate0%

How often your input prefix is reused (RAG context, system prompts, agent loops). Only applies to models with caching support.

Cost across 12 models

Verified May 28, 2026. At 1,000 requests/day, Gemini 3.5 Flash costs $2.340/mo vs $585.00/mo for Claude Opus 4.7 — that's a 250× difference.

Frontier only

Model

Tier

Per request

1,000 / day, monthly

Cache saves

Gemini 3.5 FlashGoogle · 1M ctxCache: 75% off

Budget / Fast

<$0.001$0 in + <$0.001 out

$2.340

set cache rate

GPT-5 miniOpenAI · 200k ctxCache: 50% off

Budget / Fast

<$0.001$0 in + <$0.001 out

$4.680

set cache rate

Llama 3.3 70B (Groq)Groq · 128k ctxFastest inference; no caching tier

Budget / Fast

<$0.001$0 in + <$0.001 out

$6.162

no cache tier

DeepSeek V3DeepSeek · 128k ctxOff-peak cached rate; standard cache is $0.10

Budget / Fast

<$0.001$0 in + <$0.001 out

$8.580

set cache rate

Claude Haiku 4.5Anthropic · 200k ctxCache: 90% off

Budget / Fast

$0.0010$0 in + $0.0010 out

$31.200

set cache rate

Gemini 3.5 ProGoogle · 2.0M ctxCache: 75% off implicit, 4-hour minimum

Frontier

$0.0013$0 in + $0.0013 out

$39.000

set cache rate

Mistral Large 2Mistral · 128k ctxEU-hosted, no caching tier

Mid-tier

$0.0016$0 in + $0.0016 out

$46.800

no cache tier

GPT-4oOpenAI · 128k ctxCache: 50% off

Mid-tier

$0.0026$0 in + $0.0026 out

$78.000

set cache rate

GPT-5OpenAI · 400k ctxCache: 50% off after first 1024-token prefix is reused

Frontier

$0.0039$0 in + $0.0039 out

$117.00

set cache rate

Grok 4xAI · 256k ctxNo caching tier as of May 2026

Frontier

$0.0039$0 in + $0.0039 out

$117.00

no cache tier

Claude Sonnet 4.6Anthropic · 200k ctxCache: 90% off, write costs 25% extra on first call

Mid-tier

$0.0039$0 in + $0.0039 out

$117.00

set cache rate

Claude Opus 4.7Anthropic · 500k ctxCache: 90% off, write costs 25% extra on first call

Frontier

$0.0195$0 in + $0.0195 out

$585.00

set cache rate

Estimates based on published API pricing as of May 28, 2026. Tokens approximated client-side (~80% accurate). Actual bills depend on exact tokenization, request volume discounts, and provider-specific rules. Re-check pricing with the provider before signing budgets.

How it works

Honest math, not vendor sales pitches

Token estimation

We count characters and divide by 4 — a ~80% accurate heuristic. For exact numbers, paste actual token counts from a tokenizer like tiktoken or cl100k.

Caching discount

Anthropic offers 90% off cached input. OpenAI offers 50%. Google offers 75%. Slide the cache-hit rate up to see what your real bill looks like for repeat-context workflows.

At-scale math

The slider scales from 1 request to 1M requests per day. We multiply per-request cost by 30 days for a monthly view — the number you actually need to pitch to finance.

How much does that prompt actually cost?

Cost across 12 models

Honest math, not vendor sales pitches

Token estimation

Caching discount

At-scale math

Cheaper bills start with shorter prompts.

Is that text AI-written?