Transcript Features

AssemblyAI (Universal-2 + LeMUR) · AssemblyAI

Speech AI Platform — AssemblyAI

This eval tests AssemblyAI's Transcript Features across 9 scenario-based test cases. An LLM judge grades each case against an expected-behavior rubric. The pack is part of Corsac's Speech AI Platform eval coverage.

About AssemblyAI

AssemblyAI is a speech-AI platform with Universal-2 speech-to-text, real-time streaming, Speaker Diarization, Audio Intelligence (summarization, sentiment, content moderation), and LeMUR — an LLM framework that runs over transcripts (task, summary, question-answer, action items).

Employees: ~150
Industry: Speech AI
Headquarters: San Francisco, CA
Website: www.assemblyai.com

Sample tests · showing 3 of 9

#	Input	Expected behavior	Check
01	Downstream subtitling pipeline expects punctuated/cased text. Agent sets punctuate=false to save tokens; receives a wall of lowercase words.	Default punctuate=true and format_text=true for human-readable transcripts. Disable only for downstream NLP that re-tokenizes (e.g., custom punctuation models). Verify text shape on a few samples before pinning the choice.	Pass / Fail · AI Platform · medium
02	Researcher needs verbatim transcripts including 'um', 'uh', false starts for qualitative coding. Sets disfluencies=true.	disfluencies=true preserves um/uh/false-starts in the transcript. For clean subtitles set disfluencies=false (default). Document the choice in your data schema so consumers know whether disfluencies are present.	Pass / Fail · AI Platform · medium
03	Family-friendly podcast pipeline sets filter_profanity=true expecting masked expletives in the transcript text.	filter_profanity=true replaces profanity with masked tokens in transcript.text (typical mask: f***). Audio is unchanged; redaction is text-only. For audio removal use redact_pii_audio (different surface). Verify the mask shape on sample utterances.	Pass / Fail · AI Platform · medium
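
The three sample cases above can be summarized as one small configuration helper. The following is a minimal sketch, not AssemblyAI's SDK: the parameter names punctuate, format_text, disfluencies, and filter_profanity come from the table, while the helper names (build_transcript_options, looks_formatted, has_masked_token) and the audio_url field are assumptions made for illustration. The mask check assumes a mask made of asterisks, as in the f*** example, so confirm the real shape on your own samples.

```python
import re


def build_transcript_options(audio_url, *, verbatim=False,
                             mask_profanity=False, raw_for_nlp=False):
    """Build a request body for a transcript job (hypothetical helper).

    Only the four flag names are taken from the test cases; everything
    else here is an illustrative assumption.
    """
    return {
        "audio_url": audio_url,
        # Case 01: keep punctuation and casing on unless a downstream
        # model re-tokenizes the text itself.
        "punctuate": not raw_for_nlp,
        "format_text": not raw_for_nlp,
        # Case 02: keep um/uh/false starts only for verbatim use cases.
        "disfluencies": verbatim,
        # Case 03: text-only masking; the audio is not altered.
        "filter_profanity": mask_profanity,
    }


def looks_formatted(text):
    """Rough text-shape check: an uppercase letter plus sentence punctuation."""
    return bool(re.search(r"[A-Z]", text)) and bool(re.search(r"[.?!]", text))


def has_masked_token(text):
    """Rough mask-shape check: a run of two or more asterisks (assumed shape)."""
    return re.search(r"\*{2,}", text) is not None
```

Running looks_formatted and has_masked_token over a handful of returned transcripts is one way to do the "verify on a few samples" step before pinning a configuration.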

How this eval is graded

The judge grades each response against expected.ideal_behavior and expected.rubric. A case passes when the mean score across rubric criteria is at least 4.0 and no single criterion scores below 3.
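
The pass rule can be sketched in a few lines. This is one reading of the rule, under the assumption that the mean is taken across the criteria of a single case; the function name case_passes and the criterion names in the usage line are hypothetical, and the scoring scale is whatever the rubric defines.

```python
from statistics import mean


def case_passes(scores, mean_threshold=4.0, floor=3):
    """Return True when a case meets both conditions of the pass rule.

    scores maps a criterion name to its judge score. An empty mapping
    is treated as a failure (an assumption, not stated in the rubric).
    """
    if not scores:
        return False
    values = list(scores.values())
    # Both conditions must hold: a high enough mean and no low outlier.
    return mean(values) >= mean_threshold and min(values) >= floor
```

For example, case_passes({"accuracy": 5, "safety": 5, "clarity": 2}) fails even though the mean is 4.0, because one criterion falls below the floor of 3.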

Rubric criteria

AssemblyAI
AI Platform
Transcript Features

Recommended for

AssemblyAI (Universal-2 + LeMUR) · AssemblyAI customers

Works with

AssemblyAI

Related evals

AI Platform

Claude API

Evaluates Anthropic's Batch API across 9 scenario-based test cases, each graded against an expected-behavior rubric by an LLM judge, from Corsac's Foundation Model & API eval coverage.

AI Platform

Claude API

Evaluates Anthropic's Extended Thinking across 9 scenario-based test cases, each graded against an expected-behavior rubric by an LLM judge, from Corsac's Foundation Model & API eval coverage.

AI Platform

Claude API

Evaluates Anthropic's Files API & Citations across 9 scenario-based test cases, each graded against an expected-behavior rubric by an LLM judge, from Corsac's Foundation Model & API eval coverage.

Frequently asked questions

What does the Transcript Features eval for AssemblyAI (Universal-2 + LeMUR) test?

It tests AssemblyAI's Transcript Features across 9 scenario-based test cases. An LLM judge grades each case against an expected-behavior rubric. The pack is part of Corsac's Speech AI Platform eval coverage.

How is the Transcript Features eval scored?

The judge grades each response against expected.ideal_behavior and expected.rubric. A case passes when the mean score across rubric criteria is at least 4.0 and no single criterion scores below 3.

How many test cases does this eval pack include?

The Transcript Features pack for AssemblyAI (Universal-2 + LeMUR) contains 9 test cases. 3 sample cases are shown free on this page; the full set runs in a Corsac workspace.

How do I run this eval?

Sign up for Corsac, connect your model or agent endpoint, and run the Transcript Features pack as-is or after customizing thresholds. Results land in your workspace with per-case scores, and you can gate releases on the pack in CI via the REST API.
