Deepgram evals — Model Selection & Language Detection (relift v3)

About Deepgram

Deepgram is a speech-AI platform offering streaming and batch speech-to-text (Nova), Aura text-to-speech, speaker diarization, redaction, and smart formatting across 30+ languages — used by voice-agent platforms, contact centers, and media teams.

Employees: ~150

Industry: Speech AI

Headquarters: San Francisco, CA

Sample tests (showing 3 of 8)

01
Input: New wss://api.deepgram.com/v1/listen integration; latency and accuracy tradeoffs; Nova-3 is current flagship.
Expected behavior: Default to model=nova-3 for English streaming agent; document fallback path to nova-2 if SKU constraints; measure WER/latency empirically per deployment.
Check: Pass / Fail · AI Platform · high

02
Input: Existing production URL wss://api.deepgram.com/v1/listen?model=nova-2; ops wants upgrade checklist.
Expected behavior: Document param change to model=nova-3; run shadow traffic comparison; rollback switch retained; note nova-2 for constrained runtimes if needed.
Check: Pass / Fail · AI Platform · medium

03
Input: Batch and streaming PHI workloads; HIPAA BAA in place [REQUIRES-VERIFICATION on scope]; audio is US English clinician dictation.
Expected behavior: Use model=nova-3-medical on /v1/listen for clinical vocabulary; combine with redact=phi where policy requires; do not claim BAA coverage without verification.
Check: Pass / Fail · AI Platform · critical
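The model choices in the sample tests above all come down to the `model` (and, for PHI, `redact`) query parameters on the listen URL. The following is a minimal sketch of how a test harness might build or rewrite those URLs. The helper names `build_listen_url` and `swap_model` are hypothetical and not part of any Deepgram SDK; only the endpoint and parameter values quoted in the tests are taken from the source.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Endpoint as quoted in the sample tests.
LISTEN_URL = "wss://api.deepgram.com/v1/listen"


def build_listen_url(model, redact=None, base=LISTEN_URL):
    """Return the listen URL with model (and optionally redact) set.

    Hypothetical helper for illustration only.
    """
    params = [("model", model)]
    if redact:
        params.append(("redact", redact))
    return f"{base}?{urlencode(params)}"


def swap_model(url, new_model):
    """Replace the model parameter on an existing URL, keeping other parameters.

    The same call with the old model name acts as the rollback switch.
    """
    parts = urlsplit(url)
    params = [(k, v) for k, v in parse_qsl(parts.query) if k != "model"]
    params.insert(0, ("model", new_model))
    return urlunsplit(parts._replace(query=urlencode(params)))


# Test 01: new English streaming integration.
print(build_listen_url("nova-3"))
# Test 02: upgrade an existing production URL.
print(swap_model("wss://api.deepgram.com/v1/listen?model=nova-2", "nova-3"))
# Test 03: clinical dictation with PHI redaction where policy requires it.
print(build_listen_url("nova-3-medical", redact="phi"))
```

Keeping the model name in one parameter makes the rollback in test 02 a single call, `swap_model(url, "nova-2")`. The sketch does not open a connection or handle authentication, and it says nothing about BAA coverage.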

How this eval is graded

Grade the agent's response against expected.ideal_behavior and expected.rubric. Score 1-5 per criterion (4+ passes; no criterion below 3). Penalize failure_modes. For safety outcomes, correct behavior is refuse/abstain or redact per policy.
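The pass rule above can be read in more than one way. The sketch below assumes one reading: a response passes when the mean criterion score is at least 4 and no single criterion is below 3. The function name `passes` and the thresholds as keyword arguments are illustrative assumptions, and the failure_modes penalty is not modeled here.

```python
def passes(scores, pass_mark=4, floor=3):
    """Decide pass/fail from per-criterion scores (each an integer from 1 to 5).

    Assumed reading of the rubric: mean score >= pass_mark and
    every criterion >= floor.
    """
    if not scores:
        raise ValueError("at least one criterion score is required")
    values = list(scores.values())
    for value in values:
        if not 1 <= value <= 5:
            raise ValueError(f"score out of range 1-5: {value}")
    mean = sum(values) / len(values)
    return mean >= pass_mark and min(values) >= floor


# Example with hypothetical criterion names.
example = {"model_choice": 5, "fallback_documented": 4, "measurement_plan": 3}
print(passes(example))
```

If the intended rule is instead that every criterion must score 4 or higher, setting `floor=4` gives that stricter behavior.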

Rubric criteria

  • Deepgram
  • AI Platform
  • Model Selection & Language Detection

Recommended for

  • Deepgram
  • Deepgram customers
