Security playbook
Leaked API key cost spike: how to detect and contain damage
Key leaks turn into spend incidents quickly. The first hour determines how much financial damage you contain.
Full guide: Bot attacks and LLM cost spikes: prevention playbook
Immediate containment
- Rotate the leaked provider key immediately.
- Disable the affected client credentials.
- Block suspicious source IPs or regions where possible.
- Capture the spend window and the most heavily abused endpoints.
How to confirm it is a key leak (fast signals)
- Spend rises without a corresponding product deploy or traffic campaign.
- Traffic originates from unfamiliar IP/region patterns.
- Unknown-user or anonymous traffic dominates expensive endpointTags.
- Request volume spikes while success rate looks normal (abuse can still succeed).
Diagnosis
- Compare normal tenant distribution versus incident distribution.
- Identify endpoints with sudden volume burst.
- Verify whether requests include known userId mappings.
- Measure cost increase before and after key rotation.
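Two of the diagnosis checks above can be sketched in Python. This is only an illustration: it assumes each request log record is a dict with hypothetical fields `ts` (epoch seconds), `userId` (None when no mapping is known), and `cost`. Your logging schema will almost certainly use different names.

```python
def unmapped_fraction(records):
    """Fraction of requests that carry no known userId mapping."""
    if not records:
        return 0.0
    unmapped = sum(1 for r in records if r.get("userId") is None)
    return unmapped / len(records)


def cost_rate_around(records, rotated_at, window_seconds=3600):
    """Return (cost per hour before rotation, cost per hour after rotation).

    Both windows have the same length and meet at `rotated_at`.
    """
    before = sum(
        r["cost"] for r in records
        if rotated_at - window_seconds <= r["ts"] < rotated_at
    )
    after = sum(
        r["cost"] for r in records
        if rotated_at <= r["ts"] < rotated_at + window_seconds
    )
    hours = window_seconds / 3600
    return before / hours, after / hours
```

A high unmapped fraction combined with a sharp drop in the cost rate after rotation is consistent with a leaked key. If the rate does not drop, the driver is probably something else.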
Contain the blast radius (do not rely on rotation alone)
- Move keys server-side only (never ship to browsers/mobile clients).
- Split keys by environment (prod/staging) and by app/service.
- Add per-endpointTag throttles on public or high-cost endpoints.
- Enable budget alerts and burn-rate checks for the incident window.
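A per-endpointTag throttle could look roughly like the token bucket below. This is a minimal in-process sketch, not a production limiter: the class name, the `limits` mapping, and the choice to let untagged endpoints through are all assumptions, and a multi-instance service would need shared state instead of a local dict.

```python
import time


class EndpointThrottle:
    """Token bucket per endpointTag: `rate` requests/second, bursts up to `capacity`."""

    def __init__(self, limits, clock=time.monotonic):
        # limits: {endpointTag: (rate_per_second, capacity)}; values are illustrative.
        self._limits = limits
        self._clock = clock
        self._state = {}  # endpointTag -> (tokens, last_refill_time)

    def allow(self, endpoint_tag):
        """Return True if the request may proceed, False if it should be rejected."""
        if endpoint_tag not in self._limits:
            return True  # assumption: endpoints without a configured limit pass through
        rate, capacity = self._limits[endpoint_tag]
        now = self._clock()
        tokens, last = self._state.get(endpoint_tag, (capacity, now))
        # Refill in proportion to elapsed time, never above capacity.
        tokens = min(capacity, tokens + (now - last) * rate)
        if tokens >= 1:
            self._state[endpoint_tag] = (tokens - 1, now)
            return True
        self._state[endpoint_tag] = (tokens, now)
        return False
```

During an incident you would set a low rate and capacity on the public or high-cost endpointTags and return a 429-style response whenever `allow` is False.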
Recovery hardening
- Move sensitive keys to server-side only paths.
- Add anomaly alerts for burst traffic and concentration drift.
- Enforce key rotation policy and secret scanning in CI.
Post-incident checklist (prevent repeats)
- Add secret scanning rules for repos and build artifacts.
- Audit logs: identify how the key leaked (client bundle, logs, paste, CI).
- Add rate-limit defaults for public endpoints.
- Document an on-call runbook and an escalation owner.
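To make the secret-scanning item concrete, here is a small sketch of a scanner for text from repos or build artifacts. The two regular expressions are illustrative assumptions only: real provider key formats vary, and a dedicated scanning tool will cover far more cases than this.

```python
import re

# Illustrative patterns only; adjust to the key formats your providers actually use.
PATTERNS = {
    # Hypothetical "sk-" prefixed key followed by 20+ key characters.
    "prefixed_key": re.compile(r"\bsk-[A-Za-z0-9_-]{20,}\b"),
    # Assignment of a long quoted value to a name like api_key, secret, or token.
    "assigned_secret": re.compile(
        r"\b(?:api[_-]?key|secret|token)\b\s*[:=]\s*['\"][^'\"]{16,}['\"]",
        re.IGNORECASE,
    ),
}


def scan_text(text, source="<memory>"):
    """Return (source, line_number, pattern_name) for every suspicious line."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append((source, lineno, name))
    return findings
```

Running a check like this in CI against both source files and the built client bundle addresses the two leak paths that are easiest to miss.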
Executive summary output
Document the root cause, the affected time window, the spend incurred and the spend contained, and the permanent controls that prevent recurrence.
What to alert on
- Request burst with low identity diversity.
- Token-per-request surge without feature traffic growth.
- Retry ratio increase without an upstream outage explanation.
- New high-cost endpointTag suddenly dominating spend.
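The first three signals above can be approximated with simple ratios over a recent window and a baseline window of equal length. The sketch below assumes hypothetical record fields `identity` (a userId, API key ID, or similar), `tokens`, and `is_retry`. All thresholds are placeholders to tune against your own traffic, and the code cannot tell whether growth is explained by a feature launch or an upstream outage, so that judgment stays with the responder.

```python
def identity_diversity(requests):
    """Distinct identities divided by request count (near 0 means few identities)."""
    if not requests:
        return 1.0
    return len({r["identity"] for r in requests}) / len(requests)


def tokens_per_request(requests):
    if not requests:
        return 0.0
    return sum(r["tokens"] for r in requests) / len(requests)


def retry_ratio(requests):
    if not requests:
        return 0.0
    return sum(1 for r in requests if r["is_retry"]) / len(requests)


def triggered_alerts(window, baseline, burst_factor=3.0,
                     diversity_floor=0.1, surge_factor=2.0, retry_floor=0.05):
    """Return the names of alert signals that fire for `window` versus `baseline`."""
    alerts = []
    is_burst = len(window) >= burst_factor * max(len(baseline), 1)
    if is_burst and identity_diversity(window) < diversity_floor:
        alerts.append("burst_low_identity_diversity")
    base_tpr = tokens_per_request(baseline)
    if base_tpr > 0 and tokens_per_request(window) > surge_factor * base_tpr:
        alerts.append("token_per_request_surge")
    win_retry = retry_ratio(window)
    if win_retry > retry_floor and win_retry > surge_factor * retry_ratio(baseline):
        alerts.append("retry_ratio_increase")
    return alerts
```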
Execution checklist
- Confirm the abuse signal: burst, key leak, prompt injection, or scraping.
- Rotate compromised keys and block abusive sources immediately.
- Apply per-endpoint rate limits and output caps to contain spend.
- Document dominant endpointTag, tenant/user concentration, and time window.
- Convert the incident into one permanent guardrail update.
FAQ
Is rotating the key enough?
No. Rotation cuts off the leaked key, but it does not prevent recurrence. You also need to remove the exposure vector (client-side keys, logs) and add rate limits and budget alerts so that you catch future misuse quickly.
How do we know which endpoint was abused?
Rank spend by endpointTag for the incident window, then compare to baseline. The abused endpointTag usually becomes the dominant driver quickly.
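As a rough sketch of that ranking, assuming each record is a dict with hypothetical fields `endpointTag` and `cost`, and that the baseline covers a window of the same length as the incident:

```python
from collections import Counter


def spend_by_endpoint(records):
    """Total cost per endpointTag."""
    totals = Counter()
    for r in records:
        totals[r["endpointTag"]] += r["cost"]
    return totals


def rank_drivers(incident, baseline):
    """Rank endpointTags by incident spend.

    Returns a list of (endpointTag, incident_spend, baseline_spend, incident_share),
    highest incident spend first.
    """
    inc = spend_by_endpoint(incident)
    base = spend_by_endpoint(baseline)
    total = sum(inc.values())
    rows = [
        (tag, spend, base.get(tag, 0.0), spend / total if total else 0.0)
        for tag, spend in inc.items()
    ]
    return sorted(rows, key=lambda row: row[1], reverse=True)
```

The top row is the candidate for the abused endpointTag. A large incident share paired with a small baseline spend is the pattern to look for.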
Should we block traffic during the incident?
For public or non-critical endpoints, yes: block or throttle immediately. For critical endpoints, use degraded mode and stricter identity checks while you isolate the dominant driver.