For Hume AI · AI Platform

Hume EVI Sessions

Hume AI (EVI / Octave) · Hume AI

Empathic Voice AI — Hume AI

Hume AI evals — EVI Sessions (relift v3 InfraRed)

About Hume AI

Hume AI builds empathic voice technology: the Empathic Voice Interface (EVI), an expression-measurement API covering face, voice, and language, and the Octave text-to-speech model. Together these target emotionally aware, responsibly governed voice experiences.

Employees

~60

Industry

Voice AI

Headquarters

New York, NY

Website

hume.ai

Sample tests · showing 3 of 10

# | Input | Expected behavior | Check
01

Agent opens wss://api.hume.ai/v0/evi/chat to start an empathic voice session and wants to apply a saved EVI configuration (system prompt, voice, supplemental LLM, tools).

Reference the saved configuration by passing config_id as a query parameter on the chat WebSocket URL. Do not re-send the full config inline on every connection; the config_id binds the session to its system prompt, voice, LLM, and tools server-side.

Pass / Fail · AI Platform · high
02

Server streams user_message (transcribed user speech, with prosody scores) and assistant_message events. The UI renders both into one undifferentiated transcript.

Route user_message and assistant_message to their respective roles in the transcript; attach any prosody scores carried on the message as auxiliary signals, not as the spoken text. Preserve event order for an accurate conversation log.

Pass / Fail · AI Platform · medium
03

Client captures mic audio as 16-bit PCM at 16000 Hz mono but the EVI session defaults assume a different encoding, producing garbled ASR.

Send a session_settings message immediately after connect declaring the audio encoding, sample_rate, and channels that match the client capture (e.g. linear16 / 16000 / 1) before streaming audio_input. Re-declare on reconnect.

Pass / Fail · AI Platform · high
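The three expected behaviors above can be sketched with the Python standard library alone. This is an illustrative sketch, not Hume's SDK. The helper names (build_chat_url, session_settings, route_event) are hypothetical. The JSON layout is an assumption: an "audio" object inside session_settings, text under message.content, and scores under models.prosody.scores. Check it against the current EVI API reference. Authentication and the WebSocket transport itself are left out.

```python
import json
from urllib.parse import urlencode

EVI_CHAT_URL = "wss://api.hume.ai/v0/evi/chat"


def build_chat_url(config_id: str) -> str:
    """Test 01: reference the saved config by ID instead of sending it inline."""
    return f"{EVI_CHAT_URL}?{urlencode({'config_id': config_id})}"


def session_settings(encoding: str, sample_rate: int, channels: int) -> str:
    """Test 03: declare the capture format; send this before any audio_input."""
    # Assumed payload layout: an "audio" object with these three fields.
    return json.dumps({
        "type": "session_settings",
        "audio": {
            "encoding": encoding,
            "sample_rate": sample_rate,
            "channels": channels,
        },
    })


def route_event(event: dict, transcript: list) -> None:
    """Test 02: append user/assistant messages by role, in arrival order."""
    roles = {"user_message": "user", "assistant_message": "assistant"}
    role = roles.get(event.get("type"))
    if role is None:
        return  # not a transcript event; ignore it here
    message = event.get("message") or {}
    prosody = (event.get("models") or {}).get("prosody") or {}
    transcript.append({
        "role": role,
        "text": message.get("content", ""),
        # Prosody scores are kept beside the text as an auxiliary signal.
        "prosody": prosody.get("scores"),
    })
```

A client would open the URL from build_chat_url, send session_settings("linear16", 16000, 1) as its first message on every connect and reconnect, and pass each decoded server event to route_event as it arrives.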

How this eval is graded

Grade against expected.ideal_behavior and expected.rubric. Per-criterion pass requires mean >= 4.0 and no criterion below 3. Emotion-expression scores are probabilistic perceived-expression signals, not ground-truth affect.
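One way to read the pass rule is as a check over a single test's rubric scores: the mean across criteria must be at least 4.0, and no individual criterion may fall below 3. The sketch below assumes that reading, and also assumes numeric scores (the page does not state the scale). The function name and the criterion names in the usage lines are hypothetical.

```python
from statistics import mean
from typing import Mapping

MEAN_THRESHOLD = 4.0  # minimum mean across rubric criteria
CRITERION_FLOOR = 3   # no single criterion may score below this


def passes_rubric(scores: Mapping[str, float]) -> bool:
    """Return True when the mean is >= 4.0 and every criterion is >= 3."""
    if not scores:
        return False  # nothing graded, so nothing can pass
    values = list(scores.values())
    return mean(values) >= MEAN_THRESHOLD and min(values) >= CRITERION_FLOOR


# Hypothetical criterion names, for illustration only.
ok = passes_rubric({"correct_endpoint": 5, "uses_config_id": 4, "no_inline_config": 3})
low_floor = passes_rubric({"correct_endpoint": 5, "uses_config_id": 5, "no_inline_config": 2})
```

The second case has the same mean as the first but fails on the floor, which is the situation the "no criterion below 3" clause exists to catch.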

Rubric criteria

  • Hume
  • AI Platform
  • EVI Sessions

Recommended for

Hume AI (EVI / Octave) · Hume AI customers
