Token Burn Comparator
Estimate daily & monthly cost by model based on your average prompt/output length and volume. Prices are placeholders—edit inline.
| Model | In $/1k tok | Out $/1k tok | Daily $ | Monthly $ |
|---|---|---|---|---|
Edit prices inline. Numbers are yours, not gospel.
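If you'd rather check the table's math in code, here's a minimal sketch of how the Daily $ and Monthly $ columns are derived. The volumes and rates below are placeholders, and the flat 30-day month is an assumption, not something the tool guarantees.

```python
def daily_cost(requests_per_day: int, avg_in_tokens: float, avg_out_tokens: float,
               in_price_per_1k: float, out_price_per_1k: float) -> float:
    """Dollars per day: request volume times per-request token cost."""
    per_request = ((avg_in_tokens / 1000) * in_price_per_1k
                   + (avg_out_tokens / 1000) * out_price_per_1k)
    return requests_per_day * per_request

# Placeholder numbers -- edit to match your traffic and your vendor's rate card.
daily = daily_cost(requests_per_day=2_000, avg_in_tokens=800, avg_out_tokens=400,
                   in_price_per_1k=0.50, out_price_per_1k=1.50)
print(f"daily ${daily:.2f}, monthly ${daily * 30:.2f}")  # assumes a 30-day month
```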
What this does
Compares models by $ per successful pass, not “tokens per minute” cosplay. You’ll see cost, latency, and retry pain side-by-side so you can pick speed, spend, or balanced on purpose.
Why it matters
Vendors price tokens. You pay for passes—including retries, tool calls, and oops-do-it-again. Token math flatters the wrong model.
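To put a number on that: if a model passes a task with probability p and each attempt costs c, independent retries until success mean 1/p expected attempts, so the effective price is c/p per pass. A minimal sketch with made-up rates:

```python
def cost_per_pass(cost_per_attempt: float, pass_rate: float) -> float:
    """Expected $ per successful pass, assuming independent retries until success."""
    return cost_per_attempt / pass_rate

# Made-up example: the "cheap" model loses once retries are priced in.
print(cost_per_pass(0.010, 0.95))  # ~$0.0105: pricier per attempt, rarely retries
print(cost_per_pass(0.004, 0.30))  # ~$0.0133: cheap tokens, expensive passes
```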
How to run it
- Pick 3–5 real tasks (not prompts you’d never ship).
- Set a clear pass condition for each (one line: what counts).
- Lock inputs: same prompt, same temperature, and tools/RAG either on or off for all models.
- Run the set; the tool tracks $/pass, p95 latency, and retry count (a harness sketch follows this list).
- Re-run one task to check drift. If scores swing wildly, that’s the score.
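If you're scripting this yourself instead of using the tool, a minimal harness could look like the sketch below. `call_model`, the task fields, and `MAX_RETRIES` are hypothetical stand-ins; wire in your real client and your one-line pass conditions.

```python
import time

MAX_RETRIES = 2  # attempts beyond the first before a task counts as failed

def call_model(model: str, prompt: str) -> tuple[str, float]:
    """Hypothetical stand-in: return (output, dollar cost) from your real client."""
    raise NotImplementedError

def run_suite(models, tasks):
    """tasks: dicts with 'name', a locked 'prompt', and a 'passes' predicate."""
    records = []
    for model in models:
        for task in tasks:  # same prompt, temperature, and tools for every model
            cost, attempts, passed = 0.0, 0, False
            start = time.monotonic()
            while attempts <= MAX_RETRIES and not passed:
                attempts += 1
                output, dollars = call_model(model, task["prompt"])
                cost += dollars
                passed = task["passes"](output)  # the one-line pass condition
            records.append({
                "model": model, "task": task["name"], "cost": cost,
                "latency": time.monotonic() - start,
                "attempts": attempts, "passed": passed,
            })
    return records
```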
Outputs
- Cost per pass (not per token)
- Time per pass (p95)
- Retries per pass (and where they happened)
- A recommendation: cheapest, fastest, or best balanced, plus the trade-off you're making
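Given the records from the harness sketch above, these numbers fall out in a few lines per model. One caveat on the sketch: `statistics.quantiles` needs at least two passing latencies, and a model needs at least one pass to score at all.

```python
import statistics
from collections import defaultdict

def summarize(records):
    """Per-model rollup: cost per pass, p95 latency, retries per pass."""
    by_model = defaultdict(list)
    for r in records:
        by_model[r["model"]].append(r)
    out = {}
    for model, rs in by_model.items():
        passes = [r for r in rs if r["passed"]]
        latencies = [r["latency"] for r in passes]
        out[model] = {
            "cost_per_pass": sum(r["cost"] for r in rs) / len(passes),  # failures still cost money
            "p95_latency": statistics.quantiles(latencies, n=20)[-1],  # 95th percentile
            "retries_per_pass": sum(r["attempts"] - 1 for r in rs) / len(passes),
        }
    return out
```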
Gotchas
- Caching can fake speed. Warm every model or turn caching off (see the warm-up sketch after this list).
- Tool use isn’t free—include it.
- Context window ≠ quality. Don’t reward bigger windows for doing nothing.
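One way to neutralize the caching gotcha, assuming the `run_suite` harness from earlier: do a throwaway warm-up pass per model first, then score a fresh run.

```python
for model in models:           # warm-up: prime any prompt/result caches per model
    run_suite([model], tasks)  # discard these records
records = run_suite(models, tasks)  # the run you actually score
```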
Good for
PMs, indie toolmakers, anyone sick of “$0.20 / 1M tokens” billboards.
Tip
Decide your bias before you run (speed vs spend). Otherwise you’ll cherry-pick.