Audio Intelligence
AssemblyAI (Universal-2 + LeMUR) · AssemblyAI
Speech AI Platform — AssemblyAI
AssemblyAI evals — Audio Intelligence (relift v3 InfraRed)
About AssemblyAI
AssemblyAI is a speech-AI platform with Universal-2 speech-to-text, real-time streaming, Speaker Diarization, Audio Intelligence (summarization, sentiment, content moderation), and LeMUR — an LLM framework that runs over transcripts (task, summary, question-answer, action items).
Sample tests· showing 3 of 9
| # | Input | Expected behavior | Check |
|---|---|---|---|
| 01 | Agent sets summarization=true with summary_model='catchy' and summary_type='bullets_verbose' on a 90-minute legal deposition. | summary_model controls voice (informative | conversational | catchy); summary_type controls shape (bullets | bullets_verbose | gist | headline | paragraph). Match to use-case — 'catchy' marketing tone is wrong for legal. Verify the combination is documented as compatible; not all combos are. | Pass / FailAi Platformmedium |
| 02 | Healthcare pipeline ingests entities[] with entity_type ranging from PERSON to MEDICAL_CONDITION to DRUG to BLOOD_TYPE. | Iterate entities[] reading entity_type, text, start/end ms. Document which entity_types you treat as PHI for redaction. Pin the taxonomy version where possible; do not assume new entity_types added by AssemblyAI become available automatically without code changes [REQUIRES-VERIFICATION on taxonomy … | Pass / FailAi Platformhigh |
| 03 | PCI compliance requires masking credit_card_number, person_name, phone_number. Agent sets redact_pii=true with redact_pii_policies=['credit_card_number','person_name','phone_number']. | redact_pii_policies[] is an explicit allow-list — only listed entity types are redacted. To redact a comprehensive set, enumerate every applicable policy. Verify the masked tokens in transcript.text on sample audio. Combine with redact_pii_audio=true for PCI calls requiring audio redaction. | Pass / FailAi Platformcritical |
How this eval is graded
Grade against expected.ideal_behavior and expected.rubric. Per-criterion pass requires mean >= 4.0 and no criterion below 3.
Rubric criteria
- Assemblyai
- Ai Platform
- Audio Intelligence
Recommended for
Works with
Related evals
Run this eval in your workspace
Connect your data, configure thresholds, and review results with your team.