Streaming STT WebSocket
Deepgram evals — Streaming STT (WebSocket) (relift v3 InfraRed)
About Deepgram
Deepgram is a speech-AI platform offering streaming and batch speech-to-text (Nova), Aura text-to-speech, speaker diarization, redaction, and smart formatting across 30+ languages — used by voice-agent platforms, contact centers, and media teams.
Sample tests (showing 3 of 9)
| # | Input | Expected behavior | Check | Category | Severity |
|---|---|---|---|---|---|
| 01 | Agent opens wss://api.deepgram.com/v1/listen?model=nova-3&encoding=linear16&sample_rate=16000 with header 'Authorization: Token <DEEPGRAM_API_KEY>'. | Send the Authorization header in the WebSocket upgrade request; Deepgram uses the 'Token <key>' scheme, not 'Bearer'. Confirm 101 Switching Protocols before streaming audio frames. Treat 401/403 as a credential issue and surface it to the operator; do not retry the same key after an auth failure. | Pass / Fail | AI Platform | critical |
| 02 | Connection uses interim_results=true. Server emits a sequence of is_final=false hypotheses followed by an is_final=true segment. | Render interim (is_final=false) results to a transient overlay; commit only is_final=true messages to the persisted transcript. Do not double-commit when the final arrives: replace the matching interim span rather than appending to it. | Pass / Fail | AI Platform | high |
| 03 | Agent connects with model=nova-3-general and a future release shifts the default 'nova-3' alias to nova-3.5. | Pin the explicit model variant (e.g., nova-3-general or nova-3-medical) to freeze behavior; treat unsuffixed aliases (nova-3) as floating and unsafe for regression-tested production. Track the Deepgram changelog for variant deprecations. | Pass / Fail | AI Platform | medium |
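The expected behaviors in the sample tests can be sketched roughly as follows. This is a minimal illustration, not Deepgram's SDK: the helper names (`build_listen_request`, `is_auth_failure`, `TranscriptBuffer`) are hypothetical, the pinned model default is taken from test 03, and the code assumes the caller has already extracted the transcript text and the `is_final` flag from each server message. Opening the socket itself is left to whichever WebSocket client is in use.

```python
from urllib.parse import urlencode

# Endpoint taken from sample test 01.
DEEPGRAM_LISTEN_URL = "wss://api.deepgram.com/v1/listen"


def build_listen_request(api_key, model="nova-3-general"):
    """Return (url, headers) for the WebSocket upgrade request.

    The model is pinned to an explicit variant rather than a floating alias.
    """
    query = urlencode({
        "model": model,
        "encoding": "linear16",
        "sample_rate": 16000,
        "interim_results": "true",
    })
    # Deepgram expects the 'Token <key>' scheme, not 'Bearer'.
    headers = {"Authorization": f"Token {api_key}"}
    return f"{DEEPGRAM_LISTEN_URL}?{query}", headers


def is_auth_failure(http_status):
    """401/403 on the upgrade indicate a credential problem.

    The caller should surface these to an operator instead of retrying
    with the same key.
    """
    return http_status in (401, 403)


class TranscriptBuffer:
    """Keeps committed finals separate from the transient interim span."""

    def __init__(self):
        self.committed = []  # persisted is_final=true segments
        self.interim = ""    # latest is_final=false hypothesis (overlay only)

    def on_result(self, text, is_final):
        if is_final:
            # The final replaces the interim span; it is not appended after it.
            self.committed.append(text)
            self.interim = ""
        else:
            # Each new hypothesis overwrites the previous one.
            self.interim = text

    def display(self):
        parts = self.committed + ([self.interim] if self.interim else [])
        return " ".join(parts)
```

Feeding the buffer `("hello wor", False)` and then `("hello world", True)` leaves a single committed segment, which is the no-double-commit behavior test 02 checks for.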
How this eval is graded
Grade against expected.ideal_behavior and expected.rubric. Per-criterion pass requires mean >= 4.0 and no criterion below 3.
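One possible reading of the pass rule, as a small sketch: the mean is taken across the per-criterion scores, and any single score below 3 fails the run. The function name `eval_passes`, the dictionary input shape, and the treatment of an empty score set as a failure are assumptions made for illustration; the page does not specify them.

```python
from statistics import mean


def eval_passes(scores, min_mean=4.0, min_score=3):
    """Return True when the mean score is >= min_mean and no score is below min_score.

    `scores` maps a criterion name to its numeric score (assumed shape).
    """
    if not scores:
        # Assumption: with nothing graded, the eval cannot pass.
        return False
    values = list(scores.values())
    return mean(values) >= min_mean and min(values) >= min_score
```

Under this reading, scores of 5, 4, and 3 pass (the mean is exactly 4.0 and the floor is met), while 5, 5, and 2 fail on the floor even though the mean is also 4.0.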
Rubric criteria
- Deepgram
- AI Platform
- Streaming STT WebSocket