What does your RAG setup cost to build + run?
Indexing, re-indexing, query-side embeddings, vector storage. Compare 9 embedding models side-by-side.
📖 What this is / how to use
Compare 9 embedding models on indexing + query-time cost for your RAG corpus.
- Embedding choice affects not just cost but retrieval quality — see both at once
- Separate indexing cost (one-time) from query cost (recurring) — most people conflate them
- Compare OpenAI, Voyage, Cohere, Gemini, BGE, Nomic side-by-side
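The split between one-time indexing and recurring query cost can be sketched in a few lines. This is a minimal illustration, not the calculator's actual formula: the function name, the workload numbers, and the price per 1M tokens are all hypothetical placeholders, and real provider prices should be substituted before relying on the result.

```python
from dataclasses import dataclass


@dataclass
class EmbeddingCost:
    indexing_usd: float        # one-time: embed the whole corpus once
    monthly_query_usd: float   # recurring: embed each incoming query


def embedding_cost(corpus_tokens: int,
                   queries_per_month: int,
                   tokens_per_query: int,
                   usd_per_million_tokens: float) -> EmbeddingCost:
    """Split embedding spend into a one-time part and a recurring part."""
    indexing = corpus_tokens / 1_000_000 * usd_per_million_tokens
    monthly_query_tokens = queries_per_month * tokens_per_query
    monthly = monthly_query_tokens / 1_000_000 * usd_per_million_tokens
    return EmbeddingCost(indexing, monthly)


# Hypothetical workload and price, for illustration only.
cost = embedding_cost(
    corpus_tokens=500_000_000,
    queries_per_month=1_000_000,
    tokens_per_query=30,
    usd_per_million_tokens=0.10,
)
print(f"indexing (one-time): ${cost.indexing_usd:.2f}")
print(f"queries (per month): ${cost.monthly_query_usd:.2f}")
```

With these placeholder inputs, indexing comes to $50.00 once and queries to $3.00 per month, which shows why the two figures should be reported separately.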
📊 How it works (diagram)
What goes into the vector database.
Storage note: pricing for Pinecone, Weaviate, Qdrant, and pgvector varies widely, typically $0.025-$0.30 per GB per month. Our vector DB cost guide breaks down each option.
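A rough way to turn that per-GB range into a monthly figure is to estimate raw vector size first. The sketch below assumes uncompressed float32 vectors (4 bytes per value), decimal gigabytes, and no index or metadata overhead; the function names and the example corpus size are hypothetical, and real vector databases typically add overhead on top of this.

```python
def vector_storage_gb(num_vectors: int,
                      dimensions: int,
                      bytes_per_value: int = 4) -> float:
    """Raw vector size in decimal GB (assumes float32, no index overhead)."""
    return num_vectors * dimensions * bytes_per_value / 1e9


def monthly_storage_usd(storage_gb: float, usd_per_gb_month: float) -> float:
    """Monthly storage bill for a given per-GB price."""
    return storage_gb * usd_per_gb_month


# Illustrative corpus: 1M chunks embedded at 1536 dimensions.
gb = vector_storage_gb(num_vectors=1_000_000, dimensions=1536)

# Price bounds taken from the typical range quoted above.
low = monthly_storage_usd(gb, 0.025)
high = monthly_storage_usd(gb, 0.30)
print(f"{gb:.3f} GB -> ${low:.2f} to ${high:.2f} per month")
```

Halving the dimension count halves the raw storage estimate, which is why dimension size matters alongside the per-token price.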
Same corpus + queries, different embedding provider. Current selection highlighted.
| Model | Dimensions | Max input | Indexing cost | Monthly cost | Annual cost |
|---|---|---|---|---|---|
- 🧬 Plan embedding budget — Index is a one-time spike. Queries are recurring. See the split before picking a provider.
- 🔄 Justify reindex cadence — Monthly reindexing runs 12 times a year versus 4 for quarterly, so it triples the annual reindexing cost. Math decides the schedule.
- 🆚 Compare 8 embedding providers — OpenAI, Cohere, Voyage, Mistral side-by-side. Pricing varies 10x across providers.
- 🔌 Integrate with your AI agents — MCP available for agentic workflow integration. Cost-aware embedding pipelines.
- Index once, query forever — indexing is a near one-time cost; the recurring spend is query-side embeddings, so optimize there first when queries/month is high.
- Right-size the model — a cheaper or smaller-dimension model (e.g. OpenAI text-embedding-3-small, or text-embedding-3-large with truncated dimensions) often keeps recall close while cutting the price per 1M tokens, vector storage, or both.
- Batch the indexing job — if a re-index can wait hours, the Batch tier is ~50% off on providers that publish it.
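The effect of reindex cadence and a batch discount on annual spend can be sketched as follows. This is a simplified model under stated assumptions: every reindex re-embeds the full corpus, the discount applies only to indexing runs (queries are served in real time), and the function name and dollar inputs are hypothetical.

```python
def annual_embedding_cost(index_run_usd: float,
                          monthly_query_usd: float,
                          index_runs_per_year: int,
                          batch_discount: float = 0.0) -> float:
    """Annual spend = discounted full-corpus index runs + 12 months of queries."""
    if not 0.0 <= batch_discount < 1.0:
        raise ValueError("batch_discount must be in [0, 1)")
    indexing = index_runs_per_year * index_run_usd * (1.0 - batch_discount)
    queries = 12 * monthly_query_usd
    return indexing + queries


# Hypothetical inputs: $50 per full index run, $3 per month of queries.
quarterly = annual_embedding_cost(50.0, 3.0, index_runs_per_year=4)
monthly = annual_embedding_cost(50.0, 3.0, index_runs_per_year=12)
monthly_batched = annual_embedding_cost(50.0, 3.0, index_runs_per_year=12,
                                        batch_discount=0.5)

print(f"quarterly reindex:        ${quarterly:.2f}")
print(f"monthly reindex:          ${monthly:.2f}")
print(f"monthly reindex, batched: ${monthly_batched:.2f}")
```

With these placeholder numbers the three schedules come to $236.00, $636.00, and $336.00 per year. The query portion stays fixed at $36.00, so all of the difference comes from how often and how cheaply the corpus is re-embedded.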