Accuracy

Summary

AtlasBurn's cost numbers are derived from provider-reported token counts and the pricing registry. We do not publish a fixed accuracy percentage today — the streaming-usage capture path changed materially in I3 (June 2026) and any pre-I3 figure would overstate accuracy. This page documents the methodology so you can reconcile against your own invoices.

Methodology over headline numbers

Earlier docs cited 94.2% against provider invoices. That number was measured before OpenAI streaming responses reported usage via stream_options.include_usage — pre-I3 streamed calls logged $0 tokens, so the original sample was biased. We removed the figure and are re-deriving it against current invoices.

How a cost is produced

The proxy or SDK observes the upstream response.
Token counts are taken from the provider's usage object. For OpenAI streams, the SDK and proxy inject stream_options.include_usage: true so the final SSE frame carries real counts.
If no usage is emitted (legacy stream, malformed response, or non-supported provider), tokens are estimated from request size and the event is flagged est/~est in the ledger.
The pricing registry maps (provider, model) to per-1M-token input/output rates. Unrecognized models fall back to FALLBACK_ECONOMICS (conservative rates).

Variance sources

Source	Impact	Mitigation
Estimated tokens (no usage frame)	Most material — event marked `est`	Use a client that supports `include_usage`, or route via the proxy
Context-cache / batch discounts	Discount visible only on the invoice	Reconcile monthly against the invoice
Provider-side rounding	Sub-cent variance per call	Accepted
Fallback pricing	Conservative overestimate on unknown models	Add the model to `PROVIDER_REGISTRY`

Reconciliation

Reconciliation aligns the Forensic Ledger with the provider invoice line items. Today this is manual via the dashboard; an automated Reconciliation API is planned.

What this means in practice

Treat AtlasBurn numbers as operationally accurate — good enough for guardrails, burn forecasting, and feature attribution. For financial reporting, reconcile against the upstream invoice.

Next Steps

Cost Engine — pricing formula and registry
Telemetry Architecture — including the Streaming Usage section
Glossary — observed spend, fallback premium, estimated tokens