Ingest-to-dashboard freshness SLO: a practical operations playbook
Treat freshness as a release gate. When telemetry lag is unknown, root-cause analysis and budget decisions are delayed.
Full guide: LLM cost attribution: endpoint, prompt version, tenant, and user
Define freshness as a measurable contract
Freshness is the delay between the time an event is ingested and the time it first becomes visible in a dashboard summary.
A simple baseline is P95 <= 5 minutes for production traffic.
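The contract above can be written down as a small check. This is a minimal sketch, not Opsmeter's implementation: the function names, the nearest-rank percentile method, and the choice to express the threshold in seconds are all assumptions made for illustration.

```python
from datetime import datetime, timezone

# Baseline from this playbook: P95 <= 5 minutes, expressed in seconds.
SLO_P95_SECONDS = 5 * 60


def freshness_seconds(ingest_ts: datetime, first_visible_ts: datetime) -> float:
    """Delay between ingest and first dashboard visibility for one event."""
    delay = (first_visible_ts - ingest_ts).total_seconds()
    if delay < 0:
        # A negative delay usually points at a clock or timezone mismatch.
        raise ValueError("event visible before ingest; check clocks")
    return delay


def p95(samples):
    """Nearest-rank 95th percentile (hypothetical helper, integer math only)."""
    if not samples:
        raise ValueError("no freshness samples")
    ordered = sorted(samples)
    rank = (95 * len(ordered) + 99) // 100  # ceil(0.95 * n) without floats
    return ordered[rank - 1]


def meets_freshness_slo(samples) -> bool:
    """True when the P95 of the samples is within the baseline."""
    return p95(samples) <= SLO_P95_SECONDS
```

Nearest-rank is used here because it always returns an observed value, which makes a breach easy to trace back to a specific event. Other percentile definitions would give slightly different numbers on small samples.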
Synthetic validation workflow
- Send tagged synthetic requests every 5-10 minutes.
- Record ingest time and first dashboard visibility time.
- Compute P50/P95 freshness daily.
- Alert when the freshness SLO is breached.
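The workflow above could be sketched as follows. The `ProbeResult` record, the `daily_summary` function, and the nearest-rank percentile helper are hypothetical names introduced for this example. How probes are sent and how visibility is detected depends on your setup and is not shown.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timezone

SLO_P95_SECONDS = 5 * 60  # baseline from this playbook


@dataclass
class ProbeResult:
    """One tagged synthetic request (hypothetical record shape)."""
    probe_id: str
    ingest_time: datetime
    visible_time: datetime

    @property
    def freshness_seconds(self) -> float:
        return (self.visible_time - self.ingest_time).total_seconds()


def nearest_rank(values, pct: int) -> float:
    """Nearest-rank percentile for an integer pct between 1 and 100."""
    ordered = sorted(values)
    rank = max(1, (pct * len(ordered) + 99) // 100)
    return ordered[rank - 1]


def daily_summary(results):
    """Group probes by ingest date, then compute P50/P95 and a breach flag."""
    by_day = defaultdict(list)
    for result in results:
        by_day[result.ingest_time.date()].append(result.freshness_seconds)
    summary = {}
    for day, delays in sorted(by_day.items()):
        p95 = nearest_rank(delays, 95)
        summary[day] = {
            "p50": nearest_rank(delays, 50),
            "p95": p95,
            "breach": p95 > SLO_P95_SECONDS,
        }
    return summary
```

Any day whose `breach` flag is true would be handed to whatever alerting channel you already use.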
Typical failure modes
- Aggregation worker delays or restarts
- Backpressure after burst traffic periods
- Schema mismatch causing partial ingest drops
- Clock or timezone mismatch in comparison windows
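The last failure mode is the easiest to guard against in code. The sketch below assumes timestamps arrive as ISO 8601 strings with an explicit UTC offset. The `to_utc` and `delay_seconds` helpers are illustrative names, not part of any product API.

```python
from datetime import datetime, timezone


def to_utc(timestamp: str) -> datetime:
    """Parse an ISO 8601 timestamp and normalize it to UTC.

    Naive timestamps (no offset) are rejected rather than guessed at,
    because a silent local-time assumption skews every freshness sample.
    """
    parsed = datetime.fromisoformat(timestamp)
    if parsed.tzinfo is None:
        raise ValueError(f"timestamp has no UTC offset: {timestamp!r}")
    return parsed.astimezone(timezone.utc)


def delay_seconds(ingest: str, visible: str) -> float:
    """Freshness delay computed only after both sides are in UTC."""
    return (to_utc(visible) - to_utc(ingest)).total_seconds()
```

For example, an ingest time of `2024-05-01T12:00:00+02:00` and a visibility time of `2024-05-01T10:03:00+00:00` are three minutes apart, even though the wall-clock strings suggest otherwise.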
Operational response runbook
- Check health and diagnostics endpoints first.
- Confirm worker processes and recent error logs.
- Contain by pausing non-critical dashboards if needed.
- Recover, then document the SLO breach timeline.
What to send (payload example)
{
  "externalRequestId": "req_01HZXB6MQZ2WQ9D2KCF9M4V2QY",
  "provider": "provider_id",
  "model": "model_id",
  "endpointTag": "checkout.ai_summary",
  "promptVersion": "summary_v3",
  "userId": "tenant_acme_hash",
  "inputTokens": 540,
  "outputTokens": 180,
  "latencyMs": 892,
  "status": "success",
  "dataMode": "real",
  "environment": "prod"
}
Common mistakes
- Missing endpointTag or using inconsistent naming across teams.
- Not tagging promptVersion, so deploys cannot be linked to spend changes.
- Sending raw user identifiers instead of hashed identifiers, which exposes personal data.
- Mixing demo or test dataMode records into production operational reviews.
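A pre-send check can catch most of these mistakes before they reach the pipeline. The sketch below uses the field names from the payload example. The naming pattern, the "@" heuristic for raw identifiers, and the salted SHA-256 hashing are assumptions chosen for illustration, not documented requirements.

```python
import hashlib
import re
from typing import List

# Assumed team convention: lowercase, dot-separated, e.g. "checkout.ai_summary".
ENDPOINT_TAG_PATTERN = re.compile(r"^[a-z0-9_]+(\.[a-z0-9_]+)+$")


def hash_user_id(raw_id: str, salt: str) -> str:
    """One possible way to avoid sending raw identifiers: salted SHA-256."""
    return hashlib.sha256((salt + raw_id).encode("utf-8")).hexdigest()


def find_payload_problems(payload: dict) -> List[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    tag = payload.get("endpointTag")
    if not tag:
        problems.append("missing endpointTag")
    elif not ENDPOINT_TAG_PATTERN.match(tag):
        problems.append("endpointTag does not follow the naming convention")
    if not payload.get("promptVersion"):
        problems.append("missing promptVersion")
    # Heuristic only: an "@" suggests an email address was sent unhashed.
    if "@" in str(payload.get("userId", "")):
        problems.append("userId looks like a raw identifier")
    if payload.get("environment") == "prod" and payload.get("dataMode") != "real":
        problems.append("non-real dataMode in a prod payload")
    return problems
```

Running the check in the client, before the request is sent, keeps bad records out of operational reviews instead of filtering them afterwards.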
How to verify in Opsmeter Dashboard
- Use Overview to confirm the spike window and budget posture.
- Use Top Endpoints to find feature-level cost concentration.
- Use Top Users to find tenant-level cost concentration.
- Use Prompt Versions to validate deploy-linked cost drift.
Evaluation resources
For security and procurement reviews, use our trust summary before final tool selection.