Accuracy
Summary
AtlasBurn's cost numbers are derived from provider-reported token counts and the pricing registry. We do not publish a fixed accuracy percentage today — the streaming-usage capture path changed materially in I3 (June 2026) and any pre-I3 figure would overstate accuracy. This page documents the methodology so you can reconcile against your own invoices.
Methodology over headline numbers
Earlier docs cited
94.2% against provider invoices. That number was measured before OpenAI streaming responses reported usage via stream_options.include_usage — pre-I3 streamed calls logged $0 tokens, so the original sample was biased. We removed the figure and are re-deriving it against current invoices.How a cost is produced
- The proxy or SDK observes the upstream response.
- Token counts are taken from the provider's
usageobject. For OpenAI streams, the SDK and proxy injectstream_options.include_usage: trueso the final SSE frame carries real counts. - If no usage is emitted (legacy stream, malformed response, or non-supported provider), tokens are estimated from request size and the event is flagged
est/~estin the ledger. - The pricing registry maps
(provider, model)to per-1M-token input/output rates. Unrecognized models fall back toFALLBACK_ECONOMICS(conservative rates).
Variance sources
| Source | Impact | Mitigation |
|---|---|---|
| Estimated tokens (no usage frame) | Most material — event marked est | Use a client that supports include_usage, or route via the proxy |
| Context-cache / batch discounts | Discount visible only on the invoice | Reconcile monthly against the invoice |
| Provider-side rounding | Sub-cent variance per call | Accepted |
| Fallback pricing | Conservative overestimate on unknown models | Add the model to PROVIDER_REGISTRY |
Reconciliation
Reconciliation aligns the Forensic Ledger with the provider invoice line items. Today this is manual via the dashboard; an automated Reconciliation API is planned.
What this means in practice
Treat AtlasBurn numbers as operationally accurate — good enough for guardrails, burn forecasting, and feature attribution. For financial reporting, reconcile against the upstream invoice.
Next Steps
- Cost Engine — pricing formula and registry
- Telemetry Architecture — including the Streaming Usage section
- Glossary — observed spend, fallback premium, estimated tokens