Operational Troubleshooting
Summary
Operational guide for diagnosing live AtlasBurn issues — guardrail kills, stuck KV flags, missing telemetry, and edge-proxy errors. For SDK install and framework-specific setup questions, see SDK Integration or Proxy Troubleshooting.
First check: dashboard health
My traffic is being rejected with 429
The edge proxy is returning 429 because a SUSPENDED flag is present in Cloudflare KV for your org.
- Open the dashboard → Forensic Ledger and find the kill event. It will name the triggering layer (Financial Limits, Dumb Loop, Smart Loop, RPS Burst, Premium Concentration).
- Fix the root cause in your agent (the ledger shows the offending fingerprint and the full request that tripped detection).
- Click Resume to clear the KV flag instantly, or wait for the 2-hour TTL to auto-expire. See Auto-Recovery & Manual Override.
The kill flag won't clear after I clicked Resume
DELETE /api/guardrails/enforce is idempotent and propagates to all Cloudflare edges in under a second. If traffic is still being rejected:
- Check that the request really is hitting the proxy at
proxy.atlasburn.comand not a stale cachedbaseURL. - Confirm the
AtlasBurn-API-Keyheader is set. Requests without it are rejected as unauthenticated, not as guardrail kills. - Re-trigger Resume. The dashboard surfaces the KV response — a 404 means the flag is already gone and the issue is elsewhere.
No telemetry events are arriving
- Using the Edge Proxy? Confirm
baseURLpoints tohttps://proxy.atlasburn.com/<provider>/v1and theAtlasBurn-API-Keyheader is set on every request. - Using the SDK? Confirm
initAtlasBurnAutoruns at boot (see SDK Integration) and that the process actually has time to flush before exit. - Check the Forensic Ledger filter — events default to the last 1 hour.
Guardrail keeps tripping on legitimate traffic
Tune the layer thresholds from Settings → Guardrails. Common cases:
- Batch jobs — raise the Layer 4 RPS ceiling or schedule the job with a dedicated key whose limits match its burst profile.
- Identical prompts by design (eval suites, prompt-regression tests) — add the fingerprint to the Layer 2 allowlist.
- Premium-only workloads — disable Layer 5 for the relevant API key.
Latency increased after enabling the proxy
The edge proxy adds <5ms in steady state (KV read + forward). If you see more:
- Verify your client is keeping connections alive (
keep-aliveheaders). - Check the request is hitting the nearest Cloudflare PoP — not a custom region pinned far from your runtime.
- Inspect the response
cf-rayheader to confirm edge termination.
Self-hosted proxy isn't reading KV
Re-check wrangler.toml:
[[kv_namespaces]]
binding = "GUARDRAIL_FLAGS"
id = "<your-kv-namespace-id>"The binding name must be GUARDRAIL_FLAGS — the worker looks up that exact identifier. After editing, redeploy with npx wrangler deploy.
Still stuck?
Open a support ticket from the dashboard with the org id and a sample kill event id — we can replay the full guardrail evaluation against your event stream.