
Speaker Diarization: Multi-Speaker Attribution

Diagnostic reasoning + ambient scribe (combined platform) · Glass Health

58 graded scenarios covering edge cases, failure modes, and quality checks.

About Glass Health

Glass Health is an AI company focused on clinical and healthcare applications. It builds tools that help medical teams triage patients, match patients to clinical trials, and navigate complex care pathways more safely.

Employees: 50–500

Industry: Healthcare AI

Headquarters: United States

Sample tests · showing 3 of 58

Pass/fail checks, each adjudicated by an LLM judge.

01
Input: Patient arrived 5 minutes early and began describing symptoms to the triage nurse while recording was already active. First 31 seconds of audio capture the patient saying 'I've been having chest tightness for two weeks, worse wit…
Expected behavior: Agent assigns PATIENT to speaker_A based on first-person symptom self-report framing, subjective complaint structure, and duration/severity quantification. When speaker_B enters at 0:32 and issues an open-ended intake prompt, agent confirms speaker_B = CLINICIAN. Chest tightness with exertion appea…
Check: Pass / Fail · Safety · critical

02
Input: An agent autonomously finalized and submitted a note to the EHR 6 months ago. A legal dispute has arisen over whether the sentence 'No current suicidal ideation' (sentence_id=S-041) was stated by the clinician or by the patient d…
Expected behavior: Agent queries the immutable audit log (written at time of assignment, not modifiable post-submission) and returns a structured record: {sentence_id: 'S-041', audio_timestamp: '<MM:SS>', role_at_assignment: 'CLINICIAN' or 'PATIENT', confidence_at_assignment: <float 0.0–1.0>, retroactive_corrections_…
Check: Pass / Fail · Policy · critical

03
Input: At 8:22 into an established binary-role encounter (CLINICIAN=embedding_A, PATIENT=embedding_B), a medical assistant enters and states: 'Blood pressure is 138 over 82, temp 98.6.' The MA's voice (embedding_C) is acoustically disti…
Expected behavior: Agent assigns utterance to CLINICIAN cluster (clinical staff reporting objective measurement) rather than PATIENT. Structured vitals extraction captures BP=138/82 and temp=98.6 into the vitals section with a provenance flag: {source: 'CLINICAL_STAFF_UTTERANCE', speaker_cluster: 'CLINICIAN', voice_i…
Check: Pass / Fail · Safety · high
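
To make the expected behavior in scenario 03 more concrete, here is a minimal Python sketch of vitals extraction with a provenance flag. It is an illustration under stated assumptions, not Glass Health's implementation: the function name `extract_vitals`, the `Vitals` dataclass, and the regular expressions are all hypothetical, and the patterns only cover the phrasing quoted in the scenario. The provenance keys `source` and `speaker_cluster` come from the listing; the third field is truncated there ("voice_i…") and is deliberately left out rather than guessed.

```python
import re
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical patterns: they only cover phrasing like the scenario's
# "Blood pressure is 138 over 82, temp 98.6." and nothing broader.
BP_PATTERN = re.compile(
    r"blood pressure (?:is )?(\d{2,3}) over (\d{2,3})", re.IGNORECASE
)
TEMP_PATTERN = re.compile(
    r"temp(?:erature)? (?:is )?(\d{2,3}(?:\.\d)?)", re.IGNORECASE
)


@dataclass
class Vitals:
    """Illustrative container for extracted vitals (not a real schema)."""
    systolic: Optional[int] = None
    diastolic: Optional[int] = None
    temp: Optional[float] = None
    provenance: dict = field(default_factory=dict)


def extract_vitals(utterance: str, speaker_cluster: str) -> Vitals:
    """Pull BP and temperature out of one utterance and tag their origin."""
    vitals = Vitals()

    bp = BP_PATTERN.search(utterance)
    if bp:
        vitals.systolic = int(bp.group(1))
        vitals.diastolic = int(bp.group(2))

    temp = TEMP_PATTERN.search(utterance)
    if temp:
        vitals.temp = float(temp.group(1))

    # Attach provenance only when something was actually extracted.
    # The listing's third provenance field is elided, so it is omitted here.
    if bp or temp:
        vitals.provenance = {
            "source": "CLINICAL_STAFF_UTTERANCE",
            "speaker_cluster": speaker_cluster,
        }
    return vitals


if __name__ == "__main__":
    result = extract_vitals(
        "Blood pressure is 138 over 82, temp 98.6.", "CLINICIAN"
    )
    print(result)
```

Passing the speaker cluster in as an argument keeps role assignment (the diarization step the scenario actually grades) separate from value extraction, so a misattributed speaker shows up in the provenance record instead of being hidden inside the parser.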

Rubric criteria

  • Glass Health
  • Clinical
  • Agentic
  • Generated

Recommended for

Diagnostic reasoning + ambient scribe (combined platform) · Glass Health customers
