Opsmeter
AI Cost & Inference Control

LLM cost per support ticket: pricing and margin guide

Support teams need cost per resolved ticket, not only request totals. This guide maps spend to resolution outcomes.

Full guide: Cost attribution by use-case: templates for real apps

Metrics that matter

  • cost per resolved ticket
  • cost per escalation avoided
  • cost per tenant for high-volume support accounts
  • promptVersion impact on token efficiency

Pick one outcome definition (so numbers are comparable)

Support workflows have multiple “wins”: first response, deflection, resolution, or agent time saved. If you do not define the outcome, cost-per-ticket metrics will drift with workflow changes.

Choose one primary outcome per support flow and keep it stable across releases so weekly trends stay meaningful.

  • Resolved ticket: closed without human escalation (primary).
  • Deflected ticket: the user solved the issue through a self-serve answer.
  • Escalation avoided: the bot routed the ticket correctly and reduced agent workload.
  • Time saved: minutes saved per handled conversation (optional).

Support endpoint taxonomy (examples that scale)

  • support.reply (user-facing answer)
  • support.thread_summary (long-thread summarization)
  • support.ticket_draft (handoff draft for agents)
  • support.intent_route (routing and triage)
  • support.rag_answer (knowledge-base RAG answer)

Execution model

  1. Tag support endpoints separately from internal tooling.
  2. Track traffic from unidentified users and map it to a tenant when possible.
  3. Set budget thresholds before peak support windows.
  4. Review weekly export with support and finance leads.

How to compute cost per resolved ticket (simple, reliable)

  • ticketCost = sum(cost of all requests linked to the ticket workflow within the reporting window)
  • resolvedTicketCount = number of tickets resolved in the same window
  • costPerResolvedTicket = ticketCost / resolvedTicketCount
  • Include retries and fallback attempts in ticketCost so the result reflects effective cost per outcome.
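
The formula above can be sketched in a few lines of Python. This is a minimal illustration, not an Opsmeter API: the RequestRecord type and its field names are hypothetical, and it assumes every request row (including retries and fallbacks) already carries a ticket id and a cost.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class RequestRecord:
    # Hypothetical record shape; one row per LLM request, retries included.
    ticket_id: str
    cost_usd: float


def cost_per_resolved_ticket(
    requests: Iterable[RequestRecord],
    resolved_ticket_ids: Iterable[str],
) -> Optional[float]:
    """Return ticketCost / resolvedTicketCount for one reporting window."""
    # ticketCost: every request in the window counts, resolved or not,
    # so retries and unresolved attempts raise the effective cost.
    ticket_cost = sum(r.cost_usd for r in requests)
    resolved_count = len(set(resolved_ticket_ids))
    if resolved_count == 0:
        return None  # no resolved tickets, so the ratio is undefined
    return ticket_cost / resolved_count


window = [
    RequestRecord("T1", 0.02),
    RequestRecord("T1", 0.01),  # retry on the same ticket
    RequestRecord("T2", 0.03),
    RequestRecord("T3", 0.04),  # ticket that was escalated, not resolved
]
result = cost_per_resolved_ticket(window, ["T1", "T2"])
```

Returning None for a window with no resolved tickets avoids a division by zero and keeps an empty window from being reported as zero cost.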

Pricing levers that protect support margins

  • Route low-value tickets to cheaper models or templated answers first.
  • Cap output tokens for auto-replies and long-thread summaries.
  • Use per-tenant budgets for high-volume customers.
  • Track cost per resolved ticket by promptVersion after releases.
  • Create an upgrade path for heavy usage instead of silently absorbing cost.
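
As a rough sketch of the first two levers, the routing rule below sends low-value intents to a template or a cheaper model and attaches an output-token cap to each route. The intent names, model labels, and token caps are all placeholder assumptions; substitute the values from your own taxonomy and provider.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    model: str              # placeholder label, not a real model name
    max_output_tokens: int  # cap applied to the generated reply


# Hypothetical tiers and caps, chosen only for illustration.
TEMPLATE = Route("template", 0)
CHEAP = Route("small-model", 300)
PREMIUM = Route("large-model", 800)

# Hypothetical set of intents treated as low value.
LOW_VALUE_INTENTS = {"password_reset", "order_status"}


def choose_route(intent: str, has_template: bool) -> Route:
    """Try the cheapest option that can plausibly handle the ticket first."""
    if intent in LOW_VALUE_INTENTS:
        return TEMPLATE if has_template else CHEAP
    return PREMIUM
```

Keeping the cap on the route itself means a model change and its output limit are reviewed together.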

Alerts and guardrails (prevent support from becoming a cost sink)

  1. Alert on cost/request drift after promptVersion deploys (support.reply, support.thread_summary).
  2. Alert on tokens/hour bursts for public support channels (abuse and scraping).
  3. Cap thread history size and summarize older turns before reinjection.
  4. Add per-tenant budgets for high-volume accounts and document the escalation owner.
  5. Review p95/p99 token outliers weekly (long threads often dominate spend).
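
One way to express the first guardrail is a relative drift check on mean cost/request between the previous and the newly deployed promptVersion. The function names and the 20% threshold below are assumptions for illustration, not product defaults.

```python
from statistics import mean
from typing import Sequence


def cost_drift(baseline_costs: Sequence[float], candidate_costs: Sequence[float]) -> float:
    """Relative change in mean cost/request: 0.25 means 25% more expensive."""
    base = mean(baseline_costs)
    if base == 0:
        raise ValueError("baseline mean cost is zero; drift is undefined")
    return (mean(candidate_costs) - base) / base


def should_alert(
    baseline_costs: Sequence[float],
    candidate_costs: Sequence[float],
    threshold: float = 0.20,  # assumed threshold; tune per endpoint
) -> bool:
    return cost_drift(baseline_costs, candidate_costs) > threshold


# Per-request costs for support.reply before and after a promptVersion deploy.
before = [0.010, 0.012, 0.011]
after = [0.014, 0.015, 0.013]
drift = cost_drift(before, after)
```

Comparing means per endpoint keeps a cheap, high-volume endpoint from masking drift on an expensive one.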

What to alert on

  • cost/request drift by endpointTag or promptVersion
  • unexpected tenant concentration in Top Users
  • request burst with falling success ratio
  • budget warning, spend-alert, and exceeded state transitions
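
For the budget item, alerting on state transitions rather than on every reading keeps the channel quiet. The sketch below assumes a four-state ladder (ok, warning, spend-alert, exceeded) with hypothetical thresholds at 80% and 90% of budget; the real thresholds depend on your configuration.

```python
from typing import Iterable, List, Tuple


def budget_state(spend: float, budget: float,
                 warning_at: float = 0.8, alert_at: float = 0.9) -> str:
    """Map spend against budget to a state; thresholds are assumed values."""
    ratio = spend / budget
    if ratio >= 1.0:
        return "exceeded"
    if ratio >= alert_at:
        return "spend-alert"
    if ratio >= warning_at:
        return "warning"
    return "ok"


def transitions(spend_series: Iterable[float], budget: float) -> List[Tuple[str, str]]:
    """Return (previous, new) pairs only when the state changes."""
    previous = "ok"
    changes = []
    for spend in spend_series:
        state = budget_state(spend, budget)
        if state != previous:
            changes.append((previous, state))
            previous = state
    return changes


# Cumulative tenant spend sampled over a billing period, budget of 100.
events = transitions([50, 85, 88, 95, 120], budget=100)
```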

Execution checklist

  1. Confirm spike type: volume, token, deploy, or abuse signal.
  2. Assign one incident owner and one communication channel.
  3. Apply immediate containment before deep optimization.
  4. Document the dominant endpoint, tenant, and promptVersion driver.
  5. Convert findings into one permanent guardrail update.

FAQ

Should support pricing be per ticket or per token?

Per ticket is easier for customers to understand and aligns with outcomes. Internally, track both: cost per resolved ticket for packaging, and cost/request and tokens/request for engineering optimization.

Why do long support threads become expensive?

Conversation history gets re-sent every turn, so inputTokens per request grow roughly linearly with thread length and the cumulative inputTokens for the whole thread grow roughly quadratically. Summarize older turns and cap context to prevent long-tail tickets from dominating spend.
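
A small worked example makes the growth concrete. It assumes each turn adds a fixed number of tokens and that the full history is re-sent on every request; the capped variant keeps only the most recent turns plus a fixed-size summary. All numbers are illustrative, and the cost of generating the summary itself is ignored.

```python
def cumulative_input_tokens(turns: int, tokens_per_turn: int) -> int:
    """Total input tokens when turn k re-sends all k turns so far."""
    return sum(k * tokens_per_turn for k in range(1, turns + 1))


def capped_cumulative_input_tokens(
    turns: int, tokens_per_turn: int, max_turns_kept: int, summary_tokens: int
) -> int:
    """Same thread, but older turns are replaced by a fixed-size summary."""
    total = 0
    for k in range(1, turns + 1):
        if k <= max_turns_kept:
            total += k * tokens_per_turn
        else:
            # Only the most recent turns plus the summary are sent.
            total += max_turns_kept * tokens_per_turn + summary_tokens
    return total


uncapped = cumulative_input_tokens(turns=20, tokens_per_turn=200)
capped = capped_cumulative_input_tokens(
    turns=20, tokens_per_turn=200, max_turns_kept=5, summary_tokens=300
)
```

Under these assumptions a 20-turn thread sends 42,000 input tokens uncapped and 22,500 with a 5-turn window and a 300-token summary.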

Do we need per-tenant budgets for support?

If one tenant can generate a large share of tickets, yes. Per-tenant budgets prevent a single high-volume customer from draining shared margin and make escalation ownership clear.
