Cost Engine

Summary

The Cost Engine resolves every telemetry event into a dollar amount using a per-model pricing registry, falls back to conservative rates for unrecognized models, and powers reconciliation against provider invoices.

Pricing Formula

All rates are standardized to USD per 1M tokens:

Pricing calculation
cost = (inputTokens  / 1_000_000) * inputRate
     + (outputTokens / 1_000_000) * outputRate
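
As a minimal sketch of the formula above, the TypeScript below computes the cost of a single event. The `Rates` type, the `computeCost` name, and the example rates (3 and 15 USD per 1M tokens) are assumptions made for illustration; they are not values taken from the registry.

```typescript
// Hypothetical shape: both rates are in USD per 1M tokens, as in the formula.
interface Rates {
  inputRate: number;
  outputRate: number;
}

const TOKENS_PER_UNIT = 1_000_000;

// Applies cost = (inputTokens / 1M) * inputRate + (outputTokens / 1M) * outputRate.
function computeCost(inputTokens: number, outputTokens: number, rates: Rates): number {
  return (
    (inputTokens / TOKENS_PER_UNIT) * rates.inputRate +
    (outputTokens / TOKENS_PER_UNIT) * rates.outputRate
  );
}

// Illustrative rates only: 2,000 input tokens and 500 output tokens.
const exampleRates: Rates = { inputRate: 3, outputRate: 15 };
const exampleCost = computeCost(2_000, 500, exampleRates);
console.log(exampleCost.toFixed(4)); // 0.0135
```

With these assumed rates, the input side contributes 0.0060 USD and the output side 0.0075 USD.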

Provider Cost vs. Estimated Cost

| Metric | Source | When used |
|---|---|---|
| Provider Cost | PROVIDER_REGISTRY exact match | Model is recognized |
| Estimated Cost | FALLBACK_ECONOMICS (conservative) | Model unknown or no usage frame emitted (tokens estimated) |

Fallback Premium

Unknown models are priced at a deliberately high rate from FALLBACK_ECONOMICS. Overestimating the cost means guardrails fail safe. A “Fallback” tag in the ledger signals that the model is missing from PROVIDER_REGISTRY and should be added.
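
The lookup-then-fallback behavior described above could be sketched as follows. The names PROVIDER_REGISTRY and FALLBACK_ECONOMICS come from this page; their shapes, the `resolveRates` helper, the `isFallback` field, the model name, and every rate shown are placeholders assumed for illustration.

```typescript
interface ModelRates {
  inputRate: number;  // USD per 1M tokens
  outputRate: number; // USD per 1M tokens
}

// Assumed shape: exact model id -> rates. The entry and its rates are placeholders.
const PROVIDER_REGISTRY = new Map<string, ModelRates>([
  ["example-model", { inputRate: 1, outputRate: 2 }],
]);

// Assumed shape: one deliberately high rate pair. The values are placeholders.
const FALLBACK_ECONOMICS: ModelRates = { inputRate: 30, outputRate: 60 };

interface ResolvedRates {
  rates: ModelRates;
  isFallback: boolean; // hypothetical field standing in for the ledger's "Fallback" tag
}

// An exact registry match wins; any other model gets the conservative fallback rates.
function resolveRates(model: string): ResolvedRates {
  const known = PROVIDER_REGISTRY.get(model);
  return known !== undefined
    ? { rates: known, isFallback: false }
    : { rates: FALLBACK_ECONOMICS, isFallback: true };
}

console.log(resolveRates("example-model").isFallback);      // false
console.log(resolveRates("some-unlisted-model").isFallback); // true
```

Matching on the exact model id, rather than on a prefix, keeps an unlisted variant from silently inheriting another model's rates. Whether the real registry does the same is not stated here.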

Pricing registry — coverage

The registry currently prices models from:

  • OpenAI, Anthropic, Google (Gemini / Vertex)
  • Meta (Llama), Mistral, DeepSeek, xAI, Cohere

Note: Meta/Llama, Mistral, DeepSeek, xAI, and Cohere are priced but not auto-captured by the SDK today. Route them via the Edge Proxy or post events to /api/ingest.

Reconciliation

The normalization engine resolves provider-specific fields (prompt_tokens, input_token_count, usageMetadata.promptTokenCount) into a single deterministic schema before any comparison with an invoice.
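
A rough sketch of that normalization step, limited to the three input-token fields named above, might look like this. The `RawUsage` type and the `normalizeInputTokens` helper are hypothetical, and the real engine's schema and field precedence may differ.

```typescript
// Loosely typed raw usage payload; only the three fields named above are read.
interface RawUsage {
  prompt_tokens?: number;
  input_token_count?: number;
  usageMetadata?: { promptTokenCount?: number };
}

// Returns the input token count from the first field that is present, checked
// in a fixed order so the result is deterministic. Returns undefined when no
// field is present, leaving the caller to decide whether to estimate tokens.
function normalizeInputTokens(raw: RawUsage): number | undefined {
  const candidates = [
    raw.prompt_tokens,
    raw.input_token_count,
    raw.usageMetadata?.promptTokenCount,
  ];
  return candidates.find((value): value is number => typeof value === "number");
}

console.log(normalizeInputTokens({ prompt_tokens: 12 }));                        // 12
console.log(normalizeInputTokens({ usageMetadata: { promptTokenCount: 5 } }));  // 5
```

Output-token fields would need the same treatment; they are left out because this page does not name them.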

Confidence

We are not publishing a fixed accuracy percentage at this time. See Accuracy for the methodology and for why the previous figure was withdrawn after the streaming-usage capture fix.

Edge cases

  • Rates are USD per 1M tokens — the canonical unit.
  • A “Fallback” tag means the model was not in PROVIDER_REGISTRY.
  • Events with estimated tokens carry an est/~est flag in the ledger.
  • Vector-database costs are not yet in the normalization engine.

Next Steps