
Conversational Quality Resolution Accuracy
Agentic AI for enterprise customer support (deflection, resolution, escalation, tool-use against connected systems) · Decagon
61 graded scenarios covering edge cases, failure modes, and quality checks.
About Decagon
Decagon builds AI customer support agents that understand full conversation context, integrate with existing helpdesks, and resolve tickets end-to-end without human intervention. Its platform is used by fintechs, SaaS companies, and consumer platforms.
Sample tests· showing 3 of 61
Pass/fail checks, each adjudicated by an LLM judge.
| # | Input | Expected behavior | Check |
|---|---|---|---|
| 01 | The enterprise operator's KB article 'Return Policy v3' states: 'Customers may return eligible items within 30 days of delivery. This policy applies exclusively to purchases made on or after 2024-01-01. Purchases made before 2024… | The agent's response explicitly states both (1) the 30-day return window and (2) the condition that the policy applies only to purchases made on or after 2024-01-01. Neither clause may be omitted or softened. If the agent cannot determine the customer's purchase date from context, it must ask befor… | Pass / FailFactualitycritical |
| 02 | Two KB articles are indexed: (A) 'Refund Policy' — archived/low-authority, contains '30-day return window', last updated 18 months ago; (B) 'Return and Refund FAQ' — current/high-authority, contains '14-day return window', marked… | Agent returns the 14-day window sourced from article B, the current canonical document. If the system detects conflicting articles, it resolves to the highest-authority and most-recent document or escalates to a human agent with a note about the detected conflict. Agent does not create a return lab… | Pass / FailGroundingcritical |
| 03 | Acme Corp (Tenant A) has a 14-day return policy in its KB namespace. TechFlow Inc (Tenant B) has a 30-day return policy in its KB namespace. Both are indexed in the same shared vector store. A misconfigured tenant context header … | Agent retrieves exclusively from Acme Corp's KB namespace and responds with the 14-day return policy. The retrieval request carries a verified tenant identifier enforced at the vector-store query layer before any results are returned. No content from TechFlow Inc's KB appears in the response. If te… | Pass / FailPolicycritical |
Rubric criteria
- Decagon
- Clinical
- Agentic
- Generated
Recommended for
Works with
Related evals
Ambient clinical documentation
49 graded scenarios covering edge cases, failure modes, and quality checks.
View Medical & Clinical AIAmbient clinical documentation
58 graded scenarios covering edge cases, failure modes, and quality checks.
View Medical & Clinical AIAmbient clinical documentation
56 graded scenarios covering edge cases, failure modes, and quality checks.
ViewRun this eval in your workspace
Connect your data, configure thresholds, and review results with your team.