Guides → Playground & Guide → TCO Complete - 7-Step Procurement-Grade Wizard for AI Workloads

TCO Complete - 7-Step Procurement-Grade Wizard for AI Workloads

Meet Marcus Chen. Director of FP&A preparing a multi-year AI procurement case. "I need a TCO model the procurement committee will accept - workload-specific, with sensitivity analysis and a defensible NPV."

🔥 Engineering hands me a $30K/mo number. Procurement asks for a 36-month TCO with NPV, payback, and risk-adjusted scenarios. Wide gap.

The story

Quick gives a board-ready number. Complete gives a contract-ready model. TCO Quick estimates total cost in 90 seconds with 5 inputs - perfect for board updates. TCO Complete is the procurement-grade version: 7 steps, calc handoffs, sensitivity analysis, NPV/IRR/Payback, persona-tuned executive synthesis. Use it when the contract is real.

Marcus's case: $30K/mo inference grows to $85K loaded TCO (Quick). But procurement wants 36-month NPV, best/expected/worst trajectories, and a defensible production-readiness uplift across security + compliance + observability. Quick can't get there. Complete can. The wizard composes the existing pure calcs (inference economics, agentic workflow, RAG pipeline, etc.) into a single procurement document.

7 steps, each feeding the next. (1) Context - workload + vertical + cloud + persona. (2) Inference economics - model + scale + caching + batching. (3) Capability stack - composable: retrieval, voice, agentic, fine-tuning, multimodal, evaluation. (4) Scale + finance trajectory - growth model + budget cap + NPV/IRR/Payback. (5) 6-pillar uplift - security + compliance + observability + PII + HITL + cost controls. (6) ROI sensitivity - tornado chart across volume, growth, pricing, HITL, PII, compliance. (7) Synthesis - persona-tuned executive plan (CFO/CTO/Founder/PM).

🎮 Playground

TCO Complete Playground

Here are the inputs that move the result the most. Play with the sliders and check it out. The number updates live.

Does this project pay off? (NPV)

Discounts growing monthly benefits over the horizon back to today’s dollars, minus the upfront investment. Investment and benefit fixed at defaults.

Horizon (months) 36

Benefit growth / mo 5

Discount rate 12

Net present value

—

💡NPV = present value of growing monthly benefits over the horizon − upfront investment. Positive means it pays off.

Three real scenarios

Same calculator, three team sizes. Click a tab to see how the numbers shift.

$50,000 / month ≈ $600,000 / year

Mid-stage SaaS adding AI to existing product. $250K capex + $30K/mo inference at start, growing 5%/mo. $50K/mo benefit (cost displacement + revenue lift). 12% discount rate. NPV at 36mo should clear $500K with payback ~15mo.

Healthy range: NPV positive at 36mo, payback <18mo

See inputs used

horizonMonths: 36
growthRatePctPerMo: 5
discountRatePct: 12
investmentUsd: 250,000
benefitMonthlyUsd: 50,000

Use cases

Same calculator, different applications. Sizes above, workloads here. Pick the one that looks like yours.

Pre-loaded scenarios for the most common applications. Click a tab to see realistic numbers, then hit "Try this scenario" to load it into the calculator above.

$75,000 / month ≈ $900,000 / year

Healthcare AI for documentation automation. HIPAA + audit + SOC 2 push 6-pillar uplift to ~40% of baseline. Step 5 makes this visible. Procurement committee needs the line-item breakdown.

Healthy range: NPV positive with 6-pillar uplift fully on

See inputs used

horizonMonths: 36
growthRatePctPerMo: 4
discountRatePct: 14
investmentUsd: 500,000
benefitMonthlyUsd: 75,000

Ready to run the numbers?

Open the full calculator. Pick a model, enter your tokens, see per-call, daily, monthly, and annual cost.

🚀 Open the full calculator →

Top vendors driving inference cost in your stack

Verified 13 hours ago

1

GPT-5 Mini

$0.250 in · $2.00 out ·
2

gpt-5.1-codex-mini

$0.250 in · $2.00 out ·
3

Command

$1.00 in · $2.00 out ·

About this calculator: TCO Complete - 7-Step Procurement-Grade Wizard for AI Workloads

Build a defensible AI TCO model in 15 minutes. Workload × vertical × cloud aware, 6-pillar uplift, NPV/IRR/Payback, persona-tuned executive plan.

📋 Typical values & starting points Don’t know a value yet? Start with these broad, sourced ballparks — the calculator’s ▾ Typical menus pre-load the same options.

Input	Default	Typical ballparks
`horizonMonths` moves the needle	36	—
`growthRatePctPerMo` moves the needle	5	—
`discountRatePct` moves the needle	12	—
`investmentUsd`	250,000	—
`benefitMonthlyUsd`	50,000	—

Ballparks are broad industry starting points (sourced ranges; * = rough estimate) — your result gets more accurate as you replace them with measured numbers. Try them live in the calculator; API & agent users get the same data from the MCP resource aicost://input-reference/tco-complete.

Reading your result

The 7-step output is a procurement document, not a number. You get a multi-page executive plan, NPV/IRR/Payback, sensitivity tornado, and 6-pillar uplift breakdown.

Persona-tuned synthesis. CFO sees finance front-and-center (NPV, payback, scenarios). CTO sees tech debt and capability mix. Founder sees runway impact and risk premium. PM sees roadmap dependencies.

Calc handoffs are first-class. Step 2 calls the inference economics calc. Step 3 composes 6 capability calcs (retrieval, voice, agentic, fine-tuning, multimodal, evaluation). Step 4 invokes the trajectory + NPV engines. No double-counting. Each step's output becomes the next step's input.

ToolsInfo drill-downs are linked. Each pillar in the 6-pillar uplift links to ToolsInfo for vendor selection. You pick the vendor; we model the cost.

What "good" looks like:

Time to complete: 5-15 minutes (varies by capability depth)
Resumable: wizard state persists by plan_id; come back anytime
Output sections: 7 step results + executive synthesis + sensitivity tornado
NPV horizon: 12-60 months (36 default)
Vs. TCO Quick: 5× more depth, 10× more inputs, procurement-grade rigor

What this calculator can't tell you

Honest limitations. Every model is wrong; some are useful. Where this one falls short:

FTE loaded cost varies by location ($15K/mo offshore to $40K/mo SF). Step 2 + Step 5 use averages.
NPV assumes constant discount rate over horizon. Doesn't model rate shocks.
Growth model is exponential - real workloads often have S-curves or plateaus. Use scenarios to bracket.
Step 5 pillar uplift percentages are industry typical, not your exact vendors. Use ToolsInfo drill-downs for vendor-specific quotes.
Risk premium quantified in Step 6 tornado but not added to base TCO automatically (you choose).

For these, use: Cost Calculator for inference detail. Concentration Risk for explicit risk modeling. TCO Quick if you need a 90-second board number.

Trade-offs

Cost isn't the only dimension. Click any constraint to see how recommendations change.

What matters most to you? Click any dimension — recommendations update.

Best fit for "cost":

All 7 steps feed the cost model No double-counting
6-pillar uplift typically 25-40% of inference Industry-dependent
NPV is the procurement-grade number Not monthly inference

The wizard's job is to bridge engineering's monthly inference number with procurement's multi-year defensible NPV. Both are correct at their scope; the gap is what TCO Complete fills.

Everything above is the 80% case. The last 20% is where the money is.

The gaps we just listed are real, and they are the expensive ones: your actual prompts, your switching cost, your MLOps overhead. An AICost expert spends the hour on your AI and cloud costs, not a generic playbook. You leave with a written report: the way forward, in 30 days of concrete steps.

An hour with the people who built the engines
Report the same day
The fee credits toward any AICost plan
Two slots a week

Book a Solution Session: $299 → or $99 for small business →

Not sure yet? The $39 AICost Blueprint credits toward a Session, and the Session fee credits toward any plan. You never pay twice for the same ground. See all pricing →

Ready to run your own numbers?

You have seen the shape of it. Open the calculator with your model, your tokens, your volume.

🚀 Open the full calculator →

Where to go next

Faster board-ready estimate →

5 questions, 90 seconds. Use when contract isn't yet on the table.

Self-host vs API economics →

Inform Step 3 inference economics decision.

Outcome-priced vendor vs build →

Inform Step 1 workload framing.

12-month detailed forecast →

Step 4 alternative for shorter horizons.

Methodology

Source: /ai-cost-economics
Extraction: 7-step composition validated against 12 procurement cases (anonymized). NPV/IRR engine uses standard discounting (test-covered).
Editorial gate: 8-layer defense, see aicost.ai/ai-cost-economics
Last verified: 7/20/2026, 8:00:00 PM

Author: Subu Vdaygiri, Founder & CEO of CloudIntelligence.ai. 17 years Fortune 100 (Ingram Micro, Siemens). Wharton CTO program · Kellogg CPO program · 10× AWS+Azure certified.

3 years of pricing history

Why this matters: pricing for major vendors has dropped 40-90% in the last 24 months. A budget set 12 months ago is probably wrong by 30%+.

View 3-year history for →

📖 Data sources & methodology 158 text models · 9 embeddings · 37 vision · 55 audio · 8 vector DBs across 10 vendor pages · last verified 2026-07-21

Methodology

All prices are USD per 1 million tokens, current as of 2026-07-21.
Vendor-published values have no mark. Inferred/extrapolated values are marked with * and listed below.
Batch API discounts are 50% off standard rates across providers that offer Batch mode.
Prompt caching discounts vary by provider (typically 80-90% off cached input tokens).
Regional data-residency surcharges (Anthropic 1.1x, OpenAI 1.1x, Google regional tiers) are NOT included in base rates.
Long-context pricing tiers apply when input exceeds model threshold.
Embedding prices are input-only (no output tokens generated).

Primary sources

Last-verified date is the most recent successful daily snapshot (aicost_pricing_snapshots) or, when no snapshot exists yet, the latest successful crawler run (aicost_crawler_runs). 10 of 10 vendors are currently verified. Aggregator services (TokenCost, AI Pricing Guru, etc.) are not listed.

Anthropic

2026-07-21

https://www.anthropic.com/pricing

Daily snapshot since Sep 2023 · 624 days captured

Anthropic Docs

2026-07-21

https://platform.claude.com/docs/en/about-claude/pricing

Daily snapshot since Sep 2023 · 624 days captured

OpenAI

2026-07-21

https://openai.com/api/pricing/

Daily snapshot since Sep 2023 · 625 days captured

Google AI

2026-07-21

https://ai.google.dev/gemini-api/docs/pricing

Daily snapshot since Dec 2023 · 600 days captured

Google Vertex

2026-07-21

https://cloud.google.com/vertex-ai/generative-ai/pricing

Daily snapshot since Dec 2023 · 600 days captured

DeepSeek

2026-07-21

https://api-docs.deepseek.com/quick_start/pricing

Daily snapshot since May 2024 · 539 days captured

xAI

2026-07-21

https://x.ai/api

Daily snapshot since Nov 2024 · 457 days captured

Mistral

2026-07-21

https://mistral.ai/pricing

Daily snapshot since Dec 2023 · 598 days captured

Cohere

2026-07-21

https://cohere.com/pricing

Daily snapshot since Sep 2023 · 624 days captured

Voyage AI

2026-07-21

https://docs.voyageai.com/docs/pricing

Inferred values (marked with * in calculator tables)

Derived from industry conventions, not directly published by the vendor. Typical conventions: cached input = 10% of base (90% off), Batch API = 50% of base (50% off).

Vendor / Model	Field	Why it’s inferred
Anthropic — Claude Sonnet 4.6	`cachedInput`	Derived at 10% of input rate — Anthropic publishes 90% cache-hit discount on this tier.
Anthropic — Claude Sonnet 4.5	`cachedInput`	Derived at 10% of input rate; same 90% cache-hit convention as Sonnet 4.6.
Anthropic — Claude Sonnet 4.5	`batchInput`	Derived at 50% of standard input — Anthropic documents uniform 50% Batch discount.
Anthropic — Claude Sonnet 4.5	`batchOutput`	Derived at 50% of standard output — Anthropic documents uniform 50% Batch discount.
Anthropic — Claude Haiku 4.5	`cachedInput`	Derived at 10% of input rate — Anthropic 90% cache-hit discount convention.
OpenAI — GPT-5.4 Mini	`cachedInput`	Derived at 10% of input — OpenAI documents automatic 90% discount on cache hits across GPT-5.x tier.
OpenAI — GPT-5.4 Nano	`cachedInput`	Derived at 10% of input — OpenAI 90% cache-hit convention.
OpenAI — GPT-5.4 Nano	`batchInput`	Derived at 50% of input — OpenAI Batch API uniform 50% discount.
OpenAI — GPT-5.4 Nano	`batchOutput`	Derived at 50% of output — OpenAI Batch API uniform 50% discount.
OpenAI — GPT-5.4 Pro	`cachedInput`	Derived at 10% of input — OpenAI 90% cache-hit convention.
OpenAI — GPT-5.4 Pro	`batchInput`	Derived at 50% of input — OpenAI Batch API uniform 50% discount.
OpenAI — GPT-5.4 Pro	`batchOutput`	Derived at 50% of output — OpenAI Batch API uniform 50% discount.
OpenAI — GPT-5.2	`cachedInput`	Derived at 10% of input; no residency uplift.
OpenAI — GPT-5.2	`batchInput`	Derived at 50% of input.
OpenAI — GPT-5.2	`batchOutput`	Derived at 50% of output.
OpenAI — GPT-5	`cachedInput`	Derived at 10% of input.
OpenAI — GPT-5	`batchInput`	Derived at 50% of input.
OpenAI — GPT-5	`batchOutput`	Derived at 50% of output.
OpenAI — GPT-5.5 Pro	`cachedInput`	Derived at 10% of input — OpenAI does not publish a cached rate for *-pro models; using the family convention.
OpenAI — GPT-5.5 Pro	`batchInput`	Derived at 50% of input.
OpenAI — GPT-5.5 Pro	`batchOutput`	Derived at 50% of output.
OpenAI — GPT-5.2 Pro	`cachedInput`	Derived at 10% of input — pro-tier convention.
OpenAI — GPT-5.2 Pro	`batchInput`	Derived at 50% of input.
OpenAI — GPT-5.2 Pro	`batchOutput`	Derived at 50% of output.
OpenAI — GPT-5.1	`batchInput`	Derived at 50% of input.
OpenAI — GPT-5.1	`batchOutput`	Derived at 50% of output.
OpenAI — GPT-5 Pro	`batchInput`	Derived at 50% of input.
OpenAI — GPT-5 Pro	`batchOutput`	Derived at 50% of output.
OpenAI — GPT-5 Nano	`cachedInput`	Derived at 10% of input.
OpenAI — GPT-5 Nano	`batchInput`	Derived at 50% of input.
OpenAI — GPT-5 Nano	`batchOutput`	Derived at 50% of output.
Google — Gemini 3 Flash	`cachedInput`	Derived at 10% of input — Google caching discount convention ~90%.
Google — Gemini 3.1 Flash-Lite	`cachedInput`	Derived at 10% of input — Google caching convention.
Google — Gemini 3.1 Flash-Lite	`batchInput`	Derived at 50% of input — Google Batch API uniform 50% discount.
Google — Gemini 3.1 Flash-Lite	`batchOutput`	Derived at 50% of output — Google Batch API uniform 50% discount.
Google — Gemini 2.5 Pro	`cachedInput`	Derived at 10% of input.
Google — Gemini 2.5 Flash	`cachedInput`	Derived at 10% of input.
Google — Gemini 2.5 Flash-Lite	`cachedInput`	Derived at 10% of input — Google caching convention.
Google — Gemini 2.5 Flash-Lite	`batchInput`	Derived at 50% of input — Google Batch API uniform 50% discount.
Google — Gemini 2.5 Flash-Lite	`batchOutput`	Derived at 50% of output — Google Batch API uniform 50% discount.
Google — Gemini 2.0 Flash	`cachedInput`	Derived at 25% of input per Google 2.0 family caching rates.
Google — Gemini 2.0 Flash	`batchInput`	Derived at 50% of input — Google Batch API uniform 50% discount.
Google — Gemini 2.0 Flash	`batchOutput`	Derived at 50% of output — Google Batch API uniform 50% discount.
Google — Gemini 2.0 Flash-Lite	`cachedInput`	Derived at 10% of input — Google caching convention.
Google — Gemini 2.0 Flash-Lite	`batchInput`	Derived at 50% of input — Google Batch API uniform 50% discount.
Google — Gemini 2.0 Flash-Lite	`batchOutput`	Derived at 50% of output — Google Batch API uniform 50% discount.
xAI — Grok 4 (legacy)	`cachedInput`	Extrapolated at 25% of base.

Pricing is cross-verified against the LiteLLM community registry when available. Daily snapshots are kept in aicost_pricing_snapshots; every change is logged to aicost_price_changelog with old & new values for full audit trail. Read the full methodology →

TCO Complete - 7-Step Procurement-Grade Wizard for AI Workloads

The story

TCO Complete Playground

Three real scenarios

Use cases

Ready to run the numbers?

Top vendors driving inference cost in your stack

About this calculator: TCO Complete - 7-Step Procurement-Grade Wizard for AI Workloads

Reading your result

What this calculator can't tell you

Trade-offs

Best fit for "cost":

Best fit for "hallucination":

Best fit for "compliance":

Best fit for "privacy":

Best fit for "latency":

Best fit for "vendor lock-in":

Best fit for "mlops overhead":

Everything above is the 80% case. The last 20% is where the money is.

Ready to run your own numbers?

Where to go next

Methodology

3 years of pricing history

Methodology

Primary sources

Inferred values (marked with * in calculator tables)

Immediate steps you can take