Use case
LLM cost attribution for sales copilots
Sales copilots can look healthy in usage metrics while hiding high-cost workflows. Attributing spend by feature and tenant makes it controllable.
Full guide: Cost attribution by use-case: templates for real apps
High-impact feature tags
- sales.email_draft
- sales.proposal_summary
- sales.crm_followup
- sales.pipeline_risk_brief
Controls to add early
- Set per-feature max-token defaults by business value.
- Alert on top-tenant concentration changes.
- Review promptVersion drift after release cycles.
Dashboards and KPIs that map to revenue outcomes
- cost/request by endpointTag (drafting vs summarization vs enrichment)
- top tenants by spend and cost per workflow
- tokens per successful outcome (e.g., per sent email or proposal)
- burn forecast versus sales activity volume
- promptVersion regressions after release cycles
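The first KPI above, cost per request by endpointTag, can be computed with a small aggregation over tagged request records. This is a minimal sketch: the record fields mirror the payload example later in this guide, but the per-1K-token prices are placeholders you would replace with your provider's actual rates.

```python
from collections import defaultdict

# Placeholder per-1K-token prices; substitute your provider's real price sheet.
PRICE_PER_1K = {"input": 0.003, "output": 0.015}

def cost_usd(rec):
    """Cost of one request, derived from its token counts."""
    return (rec["inputTokens"] * PRICE_PER_1K["input"]
            + rec["outputTokens"] * PRICE_PER_1K["output"]) / 1000

def cost_per_request_by_tag(records):
    """Average cost per request, grouped by endpointTag."""
    totals, counts = defaultdict(float), defaultdict(int)
    for rec in records:
        tag = rec["endpointTag"]
        totals[tag] += cost_usd(rec)
        counts[tag] += 1
    return {tag: totals[tag] / counts[tag] for tag in totals}
```

Grouping by endpointTag (rather than model or provider) is what separates "drafting vs summarization vs enrichment" in the dashboard.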
Model the sales copilot as workflow stages
Sales copilots are rarely one call. They usually retrieve CRM context, draft, rewrite, and sometimes score risk. Track the full chain.
Stage-level cost attribution prevents “invisible” spend in intermediate steps like CRM enrichment and rewriting.
- Stage 1: context retrieval (CRM, account notes, pipeline)
- Stage 2: drafting (email/proposal/summary)
- Stage 3: rewriting (tone, length, compliance)
- Stage 4: scoring (risk, priority, next actions)
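One way to keep each stage visible is to derive a stage-level endpointTag from the workflow tag. The naming scheme below (workflow tag plus a stage suffix) is an illustrative convention, not a required Opsmeter format.

```python
# Hypothetical convention: each workflow stage gets its own endpointTag
# suffix so intermediate calls (retrieval, rewriting) are never invisible.
STAGES = ["retrieve", "draft", "rewrite", "score"]

def stage_tag(workflow, stage):
    """Build a stage-level endpointTag, e.g. sales.email_draft.rewrite."""
    if stage not in STAGES:
        raise ValueError(f"unknown stage: {stage}")
    return f"{workflow}.{stage}"
```

With stage-level tags, the dashboards above can show that a workflow's spend is dominated by, say, rewriting rather than drafting.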
Tool output bloat is a common hidden driver
CRM payloads can be huge. If you inject raw records into prompts, inputTokens will grow fast and stay high.
Summarize tool outputs before reinjection and cap tool payload size for predictable spend.
- Summarize account notes into short bullet points before drafting.
- Cap the number of CRM fields included per request.
- Cache stable context (account profile) for repeated requests.
- Measure cost per tool call and per workflow stage.
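The first two mitigations, summarizing notes and capping CRM fields, can be sketched as a trimming step before prompt injection. The field allowlist and character limit here are invented examples, not part of any CRM or Opsmeter API.

```python
# Illustrative caps on CRM payloads before they reach the prompt.
ALLOWED_FIELDS = ["account_name", "stage", "amount", "next_step"]
MAX_NOTE_CHARS = 500

def trim_crm_record(record, notes):
    """Keep only allowlisted CRM fields and truncate free-text notes."""
    slim = {k: record[k] for k in ALLOWED_FIELDS if k in record}
    slim["notes"] = notes[:MAX_NOTE_CHARS]
    return slim
```

A hard character cap is a blunt instrument; in practice you would summarize notes with a cheap model first, but the cap guarantees inputTokens stays bounded even when summarization fails.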
Guardrails that keep costs predictable
- Cap output tokens for drafts and rewrites by endpointTag.
- Route low-risk rewrites to cheaper models (after you cap the first pass).
- Alert on tenant concentration for high-usage customers.
- Gate promptVersion rollouts when token deltas exceed threshold.
- Track retries and fallbacks as cost multipliers.
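The first and fourth guardrails can live in a small lookup table plus a rollout gate. The cap values and 20% threshold below are example numbers to tune against your own baselines.

```python
# Example per-endpointTag output caps; values are illustrative.
MAX_OUTPUT_TOKENS = {
    "sales.email_draft": 400,
    "sales.proposal_summary": 600,
    "sales.crm_followup": 250,
}

def output_cap(tag, default=300):
    """Per-feature output-token cap with a conservative default."""
    return MAX_OUTPUT_TOKENS.get(tag, default)

def gate_rollout(old_avg_tokens, new_avg_tokens, threshold=0.2):
    """Flag a promptVersion rollout whose token delta exceeds the threshold."""
    delta = (new_avg_tokens - old_avg_tokens) / old_avg_tokens
    return delta > threshold
```

Gating on relative token delta rather than absolute cost keeps the check stable when provider prices change.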
Common mistakes
- Measuring total spend without mapping to endpoints and workflow outcomes.
- Letting rewriting loops run unbounded (draft -> rewrite -> rewrite).
- Shipping prompt changes without promptVersion tagging.
- Mixing demo/test traffic into production finance reporting.
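The unbounded rewriting loop in particular is cheap to prevent. Here is a minimal sketch of a bounded draft-rewrite loop; the function names and the cap of two rewrites are assumptions for illustration.

```python
MAX_REWRITES = 2  # Illustrative cap; tune per feature.

def draft_with_bounded_rewrites(draft_fn, rewrite_fn, needs_rewrite):
    """Run draft -> rewrite -> rewrite with a hard ceiling on rewrite passes."""
    text = draft_fn()
    for _ in range(MAX_REWRITES):
        if not needs_rewrite(text):
            break
        text = rewrite_fn(text)
    return text
```

Because every pass through the loop is another model call, the cap doubles as a cost guarantee: a single request can never cost more than one draft plus MAX_REWRITES rewrites.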
What to send (payload example)
{
"externalRequestId": "req_01HZXB6MQZ2WQ9D2KCF9M4V2QY",
"provider": "provider_id",
"model": "model_id",
"endpointTag": "sales.email_draft",
"promptVersion": "sales_v3",
"userId": "tenant_acme_hash",
"inputTokens": 740,
"outputTokens": 320,
"latencyMs": 892,
"status": "success",
"dataMode": "real",
"environment": "prod"
}

Common mistakes
- Missing endpointTag or using inconsistent naming across teams.
- Not tagging promptVersion, so deploys cannot be linked to spend changes.
- Sending raw user identifiers instead of hashed mapping for privacy.
- Mixing demo/test dataMode into production operational reviews.
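Assembling the payload above is mostly bookkeeping; the one step worth showing is hashing user identifiers before sending. This is a sketch: the salt, hash truncation, and helper names are assumptions, not Opsmeter requirements.

```python
import hashlib

def hash_user_id(raw_id, salt="tenant-salt"):
    """Hash raw tenant/user identifiers before sending, for privacy."""
    return hashlib.sha256(f"{salt}:{raw_id}".encode()).hexdigest()[:16]

def build_payload(req_id, tag, prompt_version, raw_user, usage):
    """Assemble the usage payload with the fields from the example above."""
    return {
        "externalRequestId": req_id,
        "endpointTag": tag,
        "promptVersion": prompt_version,
        "userId": hash_user_id(raw_user),
        "inputTokens": usage["input"],
        "outputTokens": usage["output"],
        "status": "success",
        "dataMode": "real",
        "environment": "prod",
    }
```

Hashing with a fixed salt keeps the mapping stable, so the same tenant always aggregates under the same userId without ever sending the raw identifier.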
How to verify in Opsmeter Dashboard
- Use Overview to confirm spike window and budget posture.
- Use Top Endpoints to find feature-level concentration.
- Use Top Users to find tenant-level concentration.
- Use Prompt Versions to validate deploy-linked cost drift.
Evaluation resources
For security and procurement reviews, use our trust summary before final tool selection.