Cost Engine
Summary
The Cost Engine resolves every telemetry event into a dollar amount using a per-model pricing registry, falls back to conservative rates for unrecognized models, and powers reconciliation against provider invoices.
Pricing Formula
All rates are standardized to USD per 1M tokens:
cost = (inputTokens / 1_000_000) * inputRate
+ (outputTokens / 1_000_000) * outputRateProvider Cost vs. Estimated Cost
| Metric | Source | When used |
|---|---|---|
| Provider Cost | PROVIDER_REGISTRY exact match | Model is recognized |
| Estimated Cost | FALLBACK_ECONOMICS (conservative) | Model unknown or no usage frame emitted (tokens estimated) |
Fallback Premium
PROVIDER_REGISTRY.Pricing registry — coverage
The registry currently prices models from:
- OpenAI, Anthropic, Google (Gemini / Vertex)
- Meta (Llama), Mistral, DeepSeek, xAI, Cohere
Note: Meta/Llama, Mistral, DeepSeek, xAI, and Cohere are priced but not auto-captured by the SDK today. Route them via the Edge Proxy or post events to /api/ingest.
Reconciliation
The normalization engine resolves provider-specific fields (prompt_tokens, input_token_count, usageMetadata.promptTokenCount) into a single deterministic schema before any comparison with an invoice.
Confidence
We are not publishing a fixed accuracy percentage at this time — see Accuracy for methodology and an explanation of why the previous figure was withdrawn after the streaming-usage capture fix.
Edge cases
- Rates are USD per 1M tokens — the canonical unit.
- A “Fallback” tag means the model was not in PROVIDER_REGISTRY.
- Events with estimated tokens carry an
est/~estflag in the ledger. - Vector-database costs are not yet in the normalization engine.
Next Steps
- Accuracy — methodology and variance sources
- Telemetry Architecture — including Streaming Usage
- Glossary — observed spend, burn rate, fallback premium