
Retell Realtime Voice And Interruption

Retell AI

Voice AI Agents — Retell AI

Evaluates Retell AI's Real-time Voice & Interruption across 9 scenario-based test cases, each graded against an expected-behavior rubric by an LLM judge, from Corsac's Voice AI Agents eval coverage.

About Retell AI

Retell AI is a platform for building production phone-call voice agents — pairing a conversation engine with telephony, low-latency turn-taking, interruption handling, mid-call functions, post-call analysis, and batch outbound dialing.

Employees: ~40

Industry: Voice AI Agents

Headquarters: San Francisco, CA

Website: retellai.com

Sample tests · showing 3 of 9

#	Input	Expected behavior	Check	Category	Severity
01	On a noisy line the agent keeps stopping mid-sentence because background noise is treated as the user interrupting.	Tune interruption_sensitivity (0-1): lower it so it is harder to interrupt the agent on noisy lines; raise it when users complain they cannot cut in. Verify on representative audio rather than guessing a value.	Pass / Fail	AI Platform	high
02	The user starts speaking while the agent is mid-utterance. The agent keeps talking over the user for several seconds.	On a detected barge-in, stop TTS playback promptly and listen, then respond to what the user actually said. Talking over the caller signals a broken interruption pipeline; verify playback truly halts, not just that a new turn is queued.	Pass / Fail	AI Platform	critical
03	Operator enables enable_backchannel for a warm support agent but sets the frequency so high the agent says 'uh-huh' over the caller constantly.	Enable backchannel and tune backchannel_frequency so affirmations are occasional and natural, not constant. Backchannels should signal listening without disrupting the caller's flow; verify on a real conversation.	Pass / Fail	AI Platform	medium
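
Cases 01 and 03 both come down to a few agent settings. As a minimal sketch, the helper below assembles a settings payload using the field names quoted in the table (`interruption_sensitivity`, `enable_backchannel`, `backchannel_frequency`). The function name, the requirement that a frequency be given explicitly, and the assumption that `backchannel_frequency` shares the 0-1 range are illustrative and not taken from Retell's documentation.

```python
def interruption_settings(sensitivity, enable_backchannel=False,
                          backchannel_frequency=None):
    """Build a hypothetical agent-settings payload for the cases above."""
    # The table gives 0-1 as the range for interruption_sensitivity.
    if not 0.0 <= sensitivity <= 1.0:
        raise ValueError("interruption_sensitivity must be within 0-1")

    settings = {
        "interruption_sensitivity": sensitivity,
        "enable_backchannel": enable_backchannel,
    }

    if enable_backchannel:
        # Assumption: the frequency is also a 0-1 value. Requiring it
        # explicitly avoids silently shipping an untested default.
        if backchannel_frequency is None:
            raise ValueError("set backchannel_frequency when backchannel is on")
        if not 0.0 <= backchannel_frequency <= 1.0:
            raise ValueError("backchannel_frequency must be within 0-1")
        settings["backchannel_frequency"] = backchannel_frequency

    return settings


# Noisy line (case 01): a lower value makes the agent harder to interrupt.
noisy_line = interruption_settings(0.4)

# Warm support agent (case 03): backchannel on, affirmations kept occasional.
warm_agent = interruption_settings(0.8, enable_backchannel=True,
                                   backchannel_frequency=0.3)
```

The numbers here are placeholders. As the expected behavior says, the right values should come from testing on representative audio rather than from a guess.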

How this eval is graded

The LLM judge grades each case against expected.ideal_behavior and expected.rubric. A case passes when the mean score across rubric criteria is at least 4.0 and no single criterion scores below 3.
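
One reading of this pass rule can be sketched in Python. The function name and the criterion names in the examples are hypothetical, and the scoring scale is not stated on this page, so the sketch only assumes numeric scores.

```python
from statistics import mean


def case_passes(scores, mean_threshold=4.0, floor=3.0):
    """Return True if the mean score meets the threshold and no score is below the floor."""
    if not scores:
        raise ValueError("at least one criterion score is required")
    values = list(scores.values())
    return mean(values) >= mean_threshold and min(values) >= floor


# Mean is exactly 4.0 and the lowest score is 3, so this passes.
ok = case_passes({"halts_playback": 5, "responds_to_user": 4, "latency": 3})

# Mean is still 4.0, but one criterion scores 2, so this fails.
low_floor = case_passes({"halts_playback": 5, "responds_to_user": 5, "latency": 2})

# No score is below 3, but the mean is about 3.67, so this fails.
low_mean = case_passes({"halts_playback": 4, "responds_to_user": 4, "latency": 3})
```

The two conditions are independent: a single weak criterion fails the case even when the average is high, and a uniformly mediocre result fails even when nothing scores below the floor.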

Rubric criteria

Retell
AI Platform
Realtime Voice And Interruption

Recommended for

Retell AI customers

Works with

Retell AI

Related evals

AI Platform

Claude API

Evaluates Anthropic's Batch API across 9 scenario-based test cases, each graded against an expected-behavior rubric by an LLM judge, from Corsac's Foundation Model & API eval coverage.

AI Platform

Claude API

Evaluates Anthropic's Extended Thinking across 9 scenario-based test cases, each graded against an expected-behavior rubric by an LLM judge, from Corsac's Foundation Model & API eval coverage.

AI Platform

Claude API

Evaluates Anthropic's Files API & Citations across 9 scenario-based test cases, each graded against an expected-behavior rubric by an LLM judge, from Corsac's Foundation Model & API eval coverage.


Frequently asked questions

What does the Retell Realtime Voice And Interruption eval for Retell AI test?

Evaluates Retell AI's Real-time Voice & Interruption across 9 scenario-based test cases, each graded against an expected-behavior rubric by an LLM judge, from Corsac's Voice AI Agents eval coverage.

How is the Retell Realtime Voice And Interruption eval scored?

The LLM judge grades each case against expected.ideal_behavior and expected.rubric. A case passes when the mean score across rubric criteria is at least 4.0 and no single criterion scores below 3.

How many test cases does this eval pack include?

The Retell Realtime Voice And Interruption pack for Retell AI contains 9 test cases. 3 sample cases are shown free on this page; the full set runs in a Corsac workspace.

How do I run this eval?

Sign up for Corsac, connect your model or agent endpoint, and run the Retell Realtime Voice And Interruption pack as-is or after customizing thresholds. Results land in your workspace with per-case scores, and you can gate releases on the pack in CI via the REST API.
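
Gating a release on the pack could look roughly like the sketch below. The format of Corsac's REST API response is not shown on this page, so the input shape (a list of records with `case_id` and `passed` fields) and the function name are hypothetical. The sketch covers only the gating decision, not the API call.

```python
def gate_exit_code(results):
    """Return a process exit code: 0 if every case passed, 1 otherwise."""
    # Treat an empty result set as a failure: nothing was actually verified.
    if not results:
        print("FAILED: no results returned")
        return 1

    failed = [r["case_id"] for r in results if not r["passed"]]
    for case_id in failed:
        print(f"FAILED: case {case_id}")
    return 1 if failed else 0


# Hypothetical per-case results, as a CI step might receive them.
sample_results = [
    {"case_id": "01", "passed": True},
    {"case_id": "02", "passed": False},
    {"case_id": "03", "passed": True},
]
exit_code = gate_exit_code(sample_results)
```

A CI job would pass the returned value to `sys.exit` so that any failing case blocks the release.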
