Paste your prompt. See what's wasteful.
Automated analysis of redundancy, verbosity, and low-value tokens. Typical prompts have 30-50% reducible tokens.
📖 What this is / how to use
Paste a prompt and see exactly which tokens are wasteful — politeness padding, redundant instructions, over-stuffed few-shot examples — and what trimming them saves at your volume.
- Most prompts carry 20-50% redundant tokens that add cost without improving output
- Token reduction stacks multiplicatively with routing and caching
- Analysis runs locally in your browser — your prompt is never sent anywhere
- See the dollar impact at your request volume across every model
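
To make "stacks multiplicatively" concrete, here is a minimal sketch. The three percentages are hypothetical, and the model is simplified: it treats every reduction as applying to the same cost base, which will not hold exactly when token cuts affect only input tokens.

```python
def combined_cost_factor(*reductions: float) -> float:
    """Return the fraction of the original cost that remains after
    applying each fractional reduction in sequence."""
    factor = 1.0
    for r in reductions:
        if not 0.0 <= r <= 1.0:
            raise ValueError(f"reduction must be in [0, 1], got {r}")
        factor *= 1.0 - r
    return factor


# Hypothetical figures, for illustration only.
token_cut = 0.30    # 30% fewer tokens after trimming
routing_cut = 0.40  # assumed saving from routing to cheaper models
caching_cut = 0.50  # assumed saving from prompt caching

remaining = combined_cost_factor(token_cut, routing_cut, caching_cut)
print(f"remaining cost: {remaining:.0%}, total saving: {1 - remaining:.0%}")
```

The savings multiply rather than add: each technique reduces what the previous one left behind, so the combined saving stays below 100% however many techniques are stacked.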
The calculator's inputs and outputs, and how to use it for your AI workloads:
- Your prompt: The text to analyze
- Model: Model used to price savings
- Requests per day: Daily call volume for this prompt
- After optimization: Token count once cuts are applied
- Potential monthly savings: Dollars saved per month
- Token reduction: Share of input you can cut
- Findings: Ranked waste patterns
Cut padding, redundancy, and over-stuffed examples; 30-50% reduction is common
Consolidate instructions and prune few-shot down to what actually helps
Token cuts become real monthly and annual dollars at your volume
Run reductions as a step in your prompt build; MCP available for agents
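
A check in a prompt build step might look roughly like the sketch below. The phrase list and both function names are hypothetical and are not the calculator's actual rules. It flags only two of the patterns named above: politeness padding and verbatim repeated lines.

```python
import re

# Hypothetical padding phrases; the calculator's real rule set is not shown here.
PADDING_PATTERNS = [
    r"\bplease\b",
    r"\bthank you\b",
    r"\bkindly\b",
    r"\bi would like you to\b",
]


def find_padding(prompt: str) -> list[str]:
    """Return padding phrases found in the prompt, grouped in pattern order."""
    found = []
    for pattern in PADDING_PATTERNS:
        found.extend(m.group(0) for m in re.finditer(pattern, prompt, re.IGNORECASE))
    return found


def find_duplicate_lines(prompt: str) -> list[str]:
    """Return non-empty lines that occur more than once (case-insensitive)."""
    seen, dupes = set(), []
    for line in prompt.splitlines():
        key = line.strip().lower()
        if not key:
            continue
        if key in seen and key not in dupes:
            dupes.append(key)
        seen.add(key)
    return dupes


prompt = "Please summarize the text.\nAnswer in JSON.\nAnswer in JSON.\nThank you!"
print(find_padding(prompt))          # ['Please', 'Thank you']
print(find_duplicate_lines(prompt))  # ['answer in json.']
```

A build could fail or warn when either list is non-empty. Treat the matches as candidates for review, since a phrase that looks like padding can still carry meaning in some prompts.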
👇 Now try the calculator below with your own AI workloads
📊 How it works (diagram)
System prompt, user template, or any repeated AI input. Analysis runs locally.
What you save on each model if you apply all suggestions.
| Model | Current / day | Optimized / day | Monthly savings | Annual savings |
|---|---|---|---|---|
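
The table's columns can be reproduced with simple arithmetic. The sketch below assumes that only input tokens are priced, that a month is 30 days and a year 365, and it uses made-up model names and prices. The calculator's own conventions may differ.

```python
from dataclasses import dataclass

# Hypothetical models and input prices in USD per million input tokens.
PRICES = {"model-small": 0.50, "model-large": 3.00}


@dataclass
class SavingsRow:
    model: str
    current_per_day: float
    optimized_per_day: float
    monthly_savings: float
    annual_savings: float


def savings_rows(tokens_before: int, tokens_after: int,
                 requests_per_day: int, prices: dict = PRICES) -> list:
    """Build one row per model from prompt size, volume, and price."""
    rows = []
    for model, price in prices.items():
        current = tokens_before * requests_per_day * price / 1_000_000
        optimized = tokens_after * requests_per_day * price / 1_000_000
        daily_saving = current - optimized
        rows.append(SavingsRow(model, current, optimized,
                               daily_saving * 30, daily_saving * 365))
    return rows


# Example: a 1,200-token prompt trimmed to 780 tokens at 10,000 requests per day.
for row in savings_rows(1200, 780, 10_000):
    print(f"{row.model}: ${row.current_per_day:.2f} -> "
          f"${row.optimized_per_day:.2f} per day, "
          f"${row.monthly_savings:.2f}/month, ${row.annual_savings:.2f}/year")
```

The same 35% token cut is worth more in absolute dollars on a pricier model, which is why the table lists savings per model.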
- Apply the high-savings findings first: cut politeness padding, redundant instructions, and over-stuffed few-shot examples. A 30-50% reduction is typical and usually leaves output quality unchanged.
- Re-test quality after trimming: run your eval set on the leaner prompt before shipping, because aggressive cuts can drop accuracy on edge cases.
- Then cache what's left: a tight, stable prefix caches better, so prompt caching compounds the savings on top of the token cut.
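
The re-test step can be a small gate in front of deployment. This sketch assumes exact-match scoring and a tolerated accuracy drop of one percentage point. The function names, the threshold, and the stub answer functions are illustrative, standing in for real calls to your model with the original and trimmed prompts.

```python
from typing import Callable, Optional, Sequence, Tuple

Case = Tuple[str, str]  # (input, expected answer)
AnswerFn = Callable[[str], Optional[str]]


def accuracy(answer_fn: AnswerFn, cases: Sequence[Case]) -> float:
    """Share of eval cases where the answer matches exactly."""
    if not cases:
        raise ValueError("eval set is empty")
    correct = sum(1 for question, expected in cases
                  if answer_fn(question) == expected)
    return correct / len(cases)


def safe_to_ship(baseline_fn: AnswerFn, trimmed_fn: AnswerFn,
                 cases: Sequence[Case], max_drop: float = 0.01) -> bool:
    """True if the trimmed prompt loses at most max_drop accuracy."""
    drop = accuracy(baseline_fn, cases) - accuracy(trimmed_fn, cases)
    return drop <= max_drop


# Stubs in place of model calls; the trimmed prompt gets one case wrong.
cases = [("2+2", "4"), ("3+3", "6")]
baseline = {"2+2": "4", "3+3": "6"}.get
trimmed = {"2+2": "4", "3+3": "7"}.get

print(safe_to_ship(baseline, trimmed, cases))   # False
print(safe_to_ship(baseline, baseline, cases))  # True
```

Exact match is the simplest scorer. For free-form outputs you would swap in whatever metric your eval set already uses.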