Token Burn Comparator
Estimate daily & monthly cost by model based on your average prompt/output length and volume. Prices are placeholders—edit inline.
| Model | In $/1k tok | Out $/1k tok | Daily $ | Monthly $ |
|---|---|---|---|---|
Edit prices inline. Numbers are yours, not gospel.
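If you'd rather check the table's math in code, here's a minimal sketch of how the Daily $ and Monthly $ columns are derived. The volumes and rates below are placeholders, and the flat 30-day month is an assumption, not something the tool guarantees.

```python
def daily_cost(requests_per_day: int, avg_in_tokens: float, avg_out_tokens: float,
               in_price_per_1k: float, out_price_per_1k: float) -> float:
    """Dollars per day: request volume times per-request token cost."""
    per_request = ((avg_in_tokens / 1000) * in_price_per_1k
                   + (avg_out_tokens / 1000) * out_price_per_1k)
    return requests_per_day * per_request

# Placeholder numbers -- edit to match your traffic and your vendor's rate card.
daily = daily_cost(requests_per_day=2_000, avg_in_tokens=800, avg_out_tokens=400,
                   in_price_per_1k=0.50, out_price_per_1k=1.50)
print(f"daily ${daily:.2f}, monthly ${daily * 30:.2f}")  # assumes a 30-day month
```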
What this does
Compares models by $ per successful pass, not “tokens per minute” cosplay. You’ll see cost, latency, and retry pain side-by-side so you can pick speed, spend, or balanced on purpose.
Why it matters
Vendors price tokens. You pay for passes—including retries, tool calls, and oops-do-it-again. Token math flatters the wrong model.
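To put a number on that: if a model passes a task with probability p and each attempt costs c, independent retries until success mean 1/p expected attempts, so the effective price is c/p per pass. A minimal sketch with made-up rates:

```python
def cost_per_pass(cost_per_attempt: float, pass_rate: float) -> float:
    """Expected $ per successful pass, assuming independent retries until success."""
    return cost_per_attempt / pass_rate

# Made-up example: the "cheap" model loses once retries are priced in.
print(cost_per_pass(0.010, 0.95))  # ~$0.0105: pricier per attempt, rarely retries
print(cost_per_pass(0.004, 0.30))  # ~$0.0133: cheap tokens, expensive passes
```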
How to run it
- Pick 3–5 real tasks (not prompts you’d never ship).
- Set a clear pass condition for each (one line: what counts).
- Lock inputs: same prompt, same temperature, and tools/RAG either on or off for all models.
- Run the set; the tool tracks $/pass, p95 latency, and retry count (a harness sketch follows this list).
- Re-run one task to check drift. If scores swing wildly, that’s the score.
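If you're scripting this yourself instead of using the tool, a minimal harness could look like the sketch below. `call_model`, the task fields, and `MAX_RETRIES` are hypothetical stand-ins; wire in your real client and your one-line pass conditions.

```python
import time

MAX_RETRIES = 2  # attempts beyond the first before a task counts as failed

def call_model(model: str, prompt: str) -> tuple[str, float]:
    """Hypothetical stand-in: return (output, dollar cost) from your real client."""
    raise NotImplementedError

def run_suite(models, tasks):
    """tasks: dicts with 'name', a locked 'prompt', and a 'passes' predicate."""
    records = []
    for model in models:
        for task in tasks:  # same prompt, temperature, and tools for every model
            cost, attempts, passed = 0.0, 0, False
            start = time.monotonic()
            while attempts <= MAX_RETRIES and not passed:
                attempts += 1
                output, dollars = call_model(model, task["prompt"])
                cost += dollars
                passed = task["passes"](output)  # the one-line pass condition
            records.append({
                "model": model, "task": task["name"], "cost": cost,
                "latency": time.monotonic() - start,
                "attempts": attempts, "passed": passed,
            })
    return records
```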
Outputs
- Cost per pass (not per token)
- Time per pass (p95)
- Retries per pass (and where they happened)
- A recommendation: cheapest, fastest, or best balanced, plus the trade-off you're making
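Given the records from the harness sketch above, these numbers fall out in a few lines per model. One caveat on the sketch: `statistics.quantiles` needs at least two passing latencies, and a model needs at least one pass to score at all.

```python
import statistics
from collections import defaultdict

def summarize(records):
    """Per-model rollup: cost per pass, p95 latency, retries per pass."""
    by_model = defaultdict(list)
    for r in records:
        by_model[r["model"]].append(r)
    out = {}
    for model, rs in by_model.items():
        passes = [r for r in rs if r["passed"]]
        latencies = [r["latency"] for r in passes]
        out[model] = {
            "cost_per_pass": sum(r["cost"] for r in rs) / len(passes),  # failures still cost money
            "p95_latency": statistics.quantiles(latencies, n=20)[-1],  # 95th percentile
            "retries_per_pass": sum(r["attempts"] - 1 for r in rs) / len(passes),
        }
    return out
```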
Gotchas
- Caching can fake speed. Warm every model or turn caching off (see the warm-up sketch after this list).
- Tool use isn’t free—include it.
- Context window ≠ quality. Don’t reward bigger windows for doing nothing.
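One way to neutralize the caching gotcha, assuming the `run_suite` harness from earlier: do a throwaway warm-up pass per model first, then score a fresh run.

```python
for model in models:           # warm-up: prime any prompt/result caches per model
    run_suite([model], tasks)  # discard these records
records = run_suite(models, tasks)  # the run you actually score
```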
Good for
PMs, indie toolmakers, anyone sick of “$0.20 / 1M tokens” billboards.
Tip
Decide your bias before you run (speed vs spend). Otherwise you’ll cherry-pick.