For Deepgram · AI Platform

Voice Agent API

Deepgram Speech AI Platform · Deepgram

Speech AI Platform — Deepgram (Nova STT, Aura TTS, Voice Agent)

Deepgram evals — Voice Agent API (relift v3 InfraRed)

About Deepgram

Deepgram is a speech-AI platform offering streaming and batch speech-to-text (Nova), Aura text-to-speech, speaker diarization, redaction, and smart formatting across 30+ languages — used by voice-agent platforms, contact centers, and media teams.

Employees

~150

Industry

Speech AI

Headquarters

San Francisco, CA

Sample tests · showing 3 of 9

# · Input · Expected behavior · Check
01

Operator opens wss://agent.deepgram.com/agent (or documented endpoint) and sends a SettingsConfiguration message defining Aura voice, Nova STT, and an LLM brain. [REQUIRES-VERIFICATION on current endpoint host/path.]

Send a single configuration message at session start with audio (encoding, sample_rate), agent.listen (STT model), agent.think (LLM provider/model, system prompt), agent.speak (Aura voice). Validate the schema against current docs — fields shift across versions.

Pass / Fail · AI Platform · critical
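A minimal sketch of how test 01's configuration message could be assembled and checked before it is sent. It performs no network I/O. The field names (audio.encoding, audio.sample_rate, agent.listen, agent.think, agent.speak) come from the test description; the nested key names, the message type string, the default values, and the helper names build_settings and missing_paths are assumptions for illustration and must be verified against current Deepgram docs.

```python
import json

# Paths the test description says the configuration message should carry.
# The exact nesting is an assumption, not the documented schema.
REQUIRED_PATHS = [
    ("audio", "encoding"),
    ("audio", "sample_rate"),
    ("agent", "listen"),
    ("agent", "think"),
    ("agent", "speak"),
]


def build_settings(stt_model, llm_provider, llm_model, system_prompt, voice,
                   encoding="linear16", sample_rate=16000):
    """Assemble one configuration message (hypothetical shape)."""
    return {
        "type": "SettingsConfiguration",  # name taken from the test input
        "audio": {"encoding": encoding, "sample_rate": sample_rate},
        "agent": {
            "listen": {"model": stt_model},
            "think": {
                "provider": llm_provider,
                "model": llm_model,
                "instructions": system_prompt,
            },
            "speak": {"model": voice},
        },
    }


def missing_paths(message):
    """Return dotted paths from REQUIRED_PATHS that are absent."""
    missing = []
    for path in REQUIRED_PATHS:
        node = message
        for key in path:
            if not isinstance(node, dict) or key not in node:
                missing.append(".".join(path))
                break
            node = node[key]
    return missing


settings = build_settings(
    stt_model="stt-model-placeholder",
    llm_provider="llm-provider-placeholder",
    llm_model="llm-model-placeholder",
    system_prompt="You are a concise voice assistant.",
    voice="voice-placeholder",
)
payload = json.dumps(settings)  # the text frame that would be sent once at session start
```

Running missing_paths on the message before opening the socket catches a dropped section locally; it does not replace validating against the published schema.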
02

Operator configures agent.think with a third-party LLM (e.g., OpenAI GPT, Anthropic Claude) routed through Deepgram's brain plumbing.

Set agent.think.provider and agent.think.model per documented options. Provide the system prompt under agent.think.instructions (or equivalent). BYO-key flows require credential configuration per docs [REQUIRES-VERIFICATION on current BYOK support]. Pin model id; do not float.

Pass / Fail · AI Platform · high
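The "pin model id; do not float" requirement in test 02 can be sketched as a pre-flight check on the agent.think block. The allowlist of pinned (provider, model) pairs, the alias markers, the placeholder ids, and the function name think_config_problems are all hypothetical; only the field names provider, model, and instructions come from the test description.

```python
# Substrings that suggest a floating alias rather than a pinned id.
# This list is a heuristic assumption, not a vendor rule.
FLOATING_MARKERS = ("latest", "default")


def think_config_problems(think, pinned_models):
    """Return a list of human-readable problems with an agent.think block.

    pinned_models is a set of (provider, model) pairs the operator has
    explicitly approved.
    """
    problems = []
    provider = think.get("provider")
    model = think.get("model")

    if not provider:
        problems.append("agent.think.provider is missing")
    if not model:
        problems.append("agent.think.model is missing")
    elif any(marker in model.lower() for marker in FLOATING_MARKERS):
        problems.append(f"model id {model!r} looks like a floating alias")
    elif (provider, model) not in pinned_models:
        problems.append(f"model id {model!r} is not in the pinned allowlist")

    if not think.get("instructions"):
        problems.append("agent.think.instructions (system prompt) is empty")
    return problems


# Placeholder ids for illustration only.
PINNED = {("example-provider", "example-model-2024-01-01")}
good = {
    "provider": "example-provider",
    "model": "example-model-2024-01-01",
    "instructions": "You are a helpful voice agent.",
}
print(think_config_problems(good, PINNED))  # prints []
```

An explicit allowlist is stricter than pattern matching on the id, so an unreviewed model change fails the check even when the id looks versioned.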
03

User starts speaking while the agent is mid-utterance. The voice agent pipeline must stop TTS and process user audio.

Server emits a turn-event (e.g., UserStartedSpeaking) when STT detects speech onset. Stop playing buffered TTS audio immediately on receipt and flush any local audio queue. Send tts.cancel or close the TTS sub-stream per the agent contract.

Pass / Fail · AI Platform · critical
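The barge-in behavior in test 03 can be sketched client-side as follows, assuming server events arrive as JSON objects with a "type" field. The Playback class, the handle_event function, and the shape of the cancel message are hypothetical; the event name UserStartedSpeaking and the tts.cancel signal are taken from the test description, and the actual agent contract may instead require closing the TTS sub-stream.

```python
from collections import deque


class Playback:
    """Local buffer of TTS audio chunks waiting to be played."""

    def __init__(self):
        self.queue = deque()
        self.playing = False

    def enqueue(self, chunk):
        self.queue.append(chunk)
        self.playing = True

    def flush(self):
        """Drop all buffered audio and stop playback; return chunks dropped."""
        dropped = len(self.queue)
        self.queue.clear()
        self.playing = False
        return dropped


def handle_event(event, playback, outbound):
    """React to one server event; append any reply messages to outbound."""
    if event.get("type") == "UserStartedSpeaking":
        dropped = playback.flush()
        # Assumed message shape; check the agent contract before relying on it.
        outbound.append({"type": "tts.cancel"})
        return dropped
    return 0


playback = Playback()
playback.enqueue(b"\x00\x01")
playback.enqueue(b"\x02\x03")
outbound = []
dropped = handle_event({"type": "UserStartedSpeaking"}, playback, outbound)
print(dropped, playback.playing, outbound)
```

Flushing before sending the cancel message keeps the interruption latency bounded by local work rather than by a network round trip.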

How this eval is graded

Grade each response against expected.ideal_behavior and expected.rubric. A test passes when the mean score across rubric criteria is at least 4.0 and no single criterion scores below 3.
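The pass rule can be sketched as a small function, reading it as "mean across criteria of at least 4.0, with no individual criterion below 3". That reading, the integer scoring scale, the criterion names, and the function name eval_passes are assumptions.

```python
from statistics import mean

MEAN_THRESHOLD = 4.0  # minimum mean score across criteria
FLOOR = 3             # minimum score allowed for any one criterion


def eval_passes(scores):
    """scores maps criterion name to a numeric score; empty input fails."""
    if not scores:
        return False
    values = list(scores.values())
    return mean(values) >= MEAN_THRESHOLD and min(values) >= FLOOR


# Hypothetical criterion names for illustration.
print(eval_passes({"schema": 5, "interruption": 4, "latency": 3}))  # prints True
print(eval_passes({"schema": 5, "interruption": 5, "latency": 2}))  # prints False
```

The second case has a mean of 4.0 but still fails, which shows why the floor is checked separately from the mean.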

Rubric criteria

  • Deepgram
  • AI Platform
  • Voice Agent API

Recommended for

Deepgram Speech AI Platform · Deepgram customers
