Zendesk
For Zendesk · Customer Support · Support Bot

Ingest Painpoint Eval V2

Zendesk Support / Agent Workspace · Zendesk

Zendesk ingest eval pack, persona-balanced and source-traceable per test row.

About Zendesk

Zendesk is a customer service platform that helps businesses build better customer relationships. Its AI-powered products handle billions of support interactions across email, chat, voice, and messaging, giving agents the context they need to resolve issues faster.

Employees

~6,500

Industry

Customer Experience Software

Headquarters

San Francisco, CA

Sample tests · showing 3 of 100

Columns: # · Input · Expected behavior · Check

01
Input: Review this workflow signal for `Audit rule changes`: `Rule revision history allows admins to review and iterate trigger logic over time.`. Should it be flagged as friction?
Expected behavior: should_flag_painpoint: false · painpoint_label: none · severity: none
Check: Pass / Fail

02
Input: A support_agent reports: "An agent accidentally sends an internal note as a public reply and exposes internal context to the customer." while using `Reply and resolve tickets`. Diagnose likely root cause.
Expected behavior: should_flag_painpoint: true · painpoint_label: public_private_comment_misfire · severity: critical
Check: Pass / Fail

03
Input: Given this observation in `Operate at high ticket volume`: "When ticket volume is high, the interface feels cluttered and slower for agents triaging large queues.", decide whether to flag a pain point.
Expected behavior: should_flag_painpoint: true · painpoint_label: ui_clutter_under_load · severity: medium
Check: Pass / Fail
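
As a rough illustration of how a sample row's expected behavior might be compared against a model's structured output, here is a minimal sketch. It assumes the model returns a dictionary with the same three field names shown in the samples, and that a field-by-field exact match is a useful first check. The helper names `parse_expected` and `check_row` are hypothetical and not part of any Zendesk tooling. The eval itself is graded by a judge against a rubric, so this only approximates the Check column.

```python
def parse_expected(expected: str) -> dict:
    """Split 'key: value · key: value' into a dict; 'true'/'false' become bools."""
    fields = {}
    for part in expected.split("·"):
        key, _, value = part.partition(":")
        value = value.strip()
        if value in ("true", "false"):
            fields[key.strip()] = value == "true"
        else:
            fields[key.strip()] = value
    return fields


def check_row(expected: str, predicted: dict) -> bool:
    """Return True (Pass) only if every expected field matches the prediction."""
    want = parse_expected(expected)
    return all(predicted.get(key) == value for key, value in want.items())


# Expected behavior copied from sample test 02.
expected_02 = (
    "should_flag_painpoint: true · "
    "painpoint_label: public_private_comment_misfire · "
    "severity: critical"
)

# Hypothetical model output; the dictionary shape is an assumption.
predicted_02 = {
    "should_flag_painpoint": True,
    "painpoint_label": "public_private_comment_misfire",
    "severity": "critical",
}

print("Pass" if check_row(expected_02, predicted_02) else "Fail")
```

Exact matching is strict about severity, so a prediction of `medium` for a `critical` row would fail outright. A judge-based grader may score such near misses differently.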

How this eval is graded

Score whether the model correctly identifies support workflow pain points. Evaluate:

  • Pain-point detection correctness
  • Severity calibration
  • Actionability of recommended fix
  • Persona-aware reasoning quality
  • Avoiding false positives on neutral observations

Pass threshold: a criterion passes at a judge score of 4 or higher.

Rubric criteria

  • Ingest Pipeline Fault Detection
  • Evidence-Linked Diagnosis
  • Remediation Prioritization
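
The pass threshold can be applied to each rubric criterion as in the sketch below. The judge scores shown are made-up placeholders, the scoring scale is not stated on this page, and `criterion_passes` is a hypothetical helper name. Only the threshold of 4 and the three criterion names come from the page.

```python
PASS_THRESHOLD = 4  # from the page: a criterion passes at a judge score of 4 or higher


def criterion_passes(judge_score: int, threshold: int = PASS_THRESHOLD) -> bool:
    """A criterion passes when the judge score meets or exceeds the threshold."""
    return judge_score >= threshold


# Hypothetical judge scores for one test row (illustrative values only).
judge_scores = {
    "Ingest Pipeline Fault Detection": 5,
    "Evidence-Linked Diagnosis": 4,
    "Remediation Prioritization": 3,
}

results = {name: criterion_passes(score) for name, score in judge_scores.items()}

for name, passed in results.items():
    print(f"{name}: {'Pass' if passed else 'Fail'}")
```

How per-criterion results roll up into a row-level Pass / Fail is not specified here, so the sketch stops at the criterion level.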

Recommended for

  • Ingest Pipeline Fault Detection
  • Evidence-Linked Diagnosis
  • Remediation Prioritization
