Ship telemetry fast, keep attribution stable, and move from ingest to governance without changing your app network path.
Start with direct ingest, validate one stable payload shape, then harden budgets, retries, and incident workflows.
Implementation rhythm
Use the quickstart for initial payloads, then the operations docs to harden production workflows.
No SDK required today.
Direct ingest API is production-ready and supports all core workflows. SDK wrappers are optional convenience layers. Current package links and provider support live in the package section below.
Goal: find what is driving your AI bill within 60-120 seconds of signup.
After sign-in, rotate a new key in Settings → API keys. Keys are shown only once. Then continue with a ready payload and send your first ingest call.
Post LLM request metadata to the ingest endpoint after each model call. Include externalRequestId on every request for idempotency. Treat it like a unique, per-request key you generate on your side.
Use the Dashboard to monitor spend, latency, and budget posture. Basic budget alerts are available on Starter plans and above; advanced alerts on Pro and above. Ingest responses include explicit status fields such as reason, telemetryPaused, providerCallsContinue, and isPlanLimitReached, plus legacy budget flags (budgetWarning, budgetExceeded) for backward compatibility. See Limits & budgets and the n8n integration for branching examples.
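The status flags above can drive branching in your own code. A sketch, assuming a JSON response body carrying those flags (the exact response shape and the return labels are assumptions):

```javascript
// Classify an ingest response body using the documented status fields.
// telemetryPaused means only telemetry is paused; provider calls continue.
function classifyIngestResponse(body) {
  if (body.telemetryPaused) {
    return body.isPlanLimitReached ? "paused_plan_limit" : "paused_other";
  }
  if (body.budgetExceeded) return "budget_exceeded"; // legacy flag
  if (body.budgetWarning) return "budget_warning";   // legacy flag
  return "ok";
}
```

Checking `telemetryPaused` before the legacy flags keeps new integrations on the explicit fields while still honoring older responses.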
First-ingest checklist: API key ready, payload sent, response ok=true, row visible in Dashboard.
After first traffic, open Settings → First-ingest checklist to validate endpointTag, promptVersion, identity, unknown models, usage fields, and data-mode alignment.
Must-have fields: externalRequestId, provider/model, token usage source. If plan limit is reached, ingest returns 402 with reason=plan_limit_reached and telemetryPaused=true (telemetry pause only).
Pro plans and above include a Tracking quality score for ongoing telemetry health.
Need copy-paste snippets? Open Integration examples (docs) or use the GitHub examples repo and send the sample payload first.
Plan limits apply to telemetry ingest only. If the request limit is reached, telemetry is paused and provider calls continue.
| Plan | Requests / mo | Alerts | Export | Filters / KPI |
|---|---|---|---|---|
| Free | 10k | No alert delivery | None | Basic only |
| Starter | 100k | Basic email alerts | CSV | No advanced filters / No prompt KPI |
| Pro | 500k | Email + webhook | CSV + JSON | Advanced filters + prompt KPI |
| Team | 2M | Email + webhook | CSV + JSON | Advanced + multi-workspace + RBAC |
| Enterprise | Custom | Custom policy | Custom | Custom governance controls |
Feature highlights by plan: Starter adds Investigate Spike, Alerts Inbox, and savings opportunities. Pro adds Prompt Impact compare, webhook delivery, and JSON export. Team adds feature analysis, tenant profitability, custom date-range board pack export, and weekly executive report.
Move from hello-world telemetry to production-grade instrumentation with clear guardrails.
Generate one ID per LLM call and reuse it on retries. Store it in request context (middleware/local variable) so the same ID flows through every retry.
```javascript
// ctx travels with the request (middleware/local variable);
// existingId is an ID set by an upstream layer, if any.
const ctx = { externalRequestId: existingId ?? null };
const externalRequestId = ctx.externalRequestId ?? crypto.randomUUID();
ctx.externalRequestId = externalRequestId;

for (let attempt = 0; attempt < 3; attempt++) {
  try {
    return await llmCall(); // the same externalRequestId applies to every attempt
  } catch (err) {
    if (attempt === 2) throw err;
  }
}
// telemetry uses the same externalRequestId from ctx
```

| Do | Don't |
|---|---|
| Generate once per LLM call, reuse on retry. | Create a new ID on every retry. |
| Pass the same ID through all telemetry fields. | Use timestamps alone as the ID. |
| Store it in request context for downstream access. | Recompute the ID in each layer. |
Tag = product feature, not endpoint path. Example tags: checkout.ai_summary, support.reply, invoice.extract.
Treat prompt versions like deploy labels: summarizer_v3, chat-v5.2. Rule: new deploy = new version.
userId is optional. If omitted, requests group into unknown. Never send PII; hash identifiers if you need stable user grouping.
Capture a timestamp before the LLM call and compute latency after it finishes, then send it alongside the usage fields.

```javascript
const start = Date.now();
const result = await llmCall();
const latencyMs = Date.now() - start; // send with the usage fields
```

On 429, show/log: "Telemetry throttled, retry after Xs." A 402 means telemetry pause only; LLM calls continue. Do not retry telemetry immediately: pause ingestion for a few minutes (for example 10-15), show a UI banner, and surface an upgrade CTA.
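The 429/402 rules can be encoded in a small status handler. The status codes and pause guidance come from the docs; the pause bookkeeping, function name, and return shape are assumptions:

```javascript
const TELEMETRY_PAUSE_MS = 10 * 60 * 1000; // e.g. 10 minutes after a 402
let pausedUntil = 0;

function handleIngestStatus(status, retryAfterSec, now = Date.now()) {
  if (status === 429) {
    // Throttled: retry this telemetry after the server-indicated delay.
    return { action: "retry", delayMs: (retryAfterSec ?? 1) * 1000 };
  }
  if (status === 402) {
    // Plan limit: pause ingestion only; LLM calls keep flowing.
    // Surface a UI banner and upgrade CTA here.
    pausedUntil = now + TELEMETRY_PAUSE_MS;
    return { action: "pause", until: pausedUntil };
  }
  return { action: "ok" };
}
```

Callers should check `pausedUntil` before enqueueing telemetry so a paused window drops or buffers events instead of hammering the endpoint.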
For high throughput, queue and batch telemetry. Opsmeter.io ingestion can be async and should not sit on the request path.
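A minimal in-memory batcher sketch for the queue-and-batch advice above. It assumes a `sendBatch` function that POSTs an array of events to the ingest endpoint; class and option names are illustrative, not part of any SDK:

```javascript
// Buffers telemetry events and flushes them off the request path,
// either when the buffer fills or after a short idle timeout.
class TelemetryBatcher {
  constructor(sendBatch, { maxSize = 50, flushMs = 2000 } = {}) {
    this.sendBatch = sendBatch; // assumption: POSTs an event array to ingest
    this.maxSize = maxSize;
    this.flushMs = flushMs;
    this.buffer = [];
    this.timer = null;
  }

  enqueue(event) {
    this.buffer.push(event);
    if (this.buffer.length >= this.maxSize) return this.flush();
    if (!this.timer) this.timer = setTimeout(() => this.flush(), this.flushMs);
  }

  flush() {
    if (this.timer) { clearTimeout(this.timer); this.timer = null; }
    if (this.buffer.length === 0) return;
    const batch = this.buffer.splice(0);
    // Fire and forget: a failed send never blocks the request path.
    Promise.resolve(this.sendBatch(batch)).catch(() => {});
  }
}
```

In production you would also flush on process shutdown and decide whether failed batches are dropped or re-queued; the sketch simply drops them.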
SDK wrappers are optional. Node and Python packages are available today. The .NET package is not published yet.
Planned: Opsmeter.io .NET SDK with automatic telemetry capture (not yet published).