
Hybrid Human In The Loop Review Qa
Ambient scribe + RCM + RTLS + broader workflow platform · Commure / Augmedix
57 graded scenarios covering edge cases, failure modes, and quality checks.
About Commure / Augmedix
Commure / Augmedix is an AI company focused on clinical and healthcare applications, building tools that help medical teams triage patients, match clinical trials, and navigate complex care pathways more safely.
Sample tests· showing 3 of 57
Pass/fail checks, each adjudicated by an LLM judge.
| # | Input | Expected behavior | Check |
|---|---|---|---|
| 01 | An ED physician finishes a chest-pain workup encounter. The ADT feed carries patient_class=E, location=ED_BAY_4, chief_complaint=chest_pain. The audio classifier returns encounter_type=ED with confidence 0.94 based on phrases lik… | Agent emits a single CLASSIFICATION_COMPLETE event with queue_destination=ED_REVIEW_POOL, template_id set to an ED-designated template, sla_tier=ED_4HR, and sla_start_time equal to ingestion_timestamp. Classification evidence record stores both audio_signal=ED and adt_signal=ED. No human escalation… | Pass / FailWorkflowcritical |
| 02 | A patient is boarded in an ED hallway for chronic low back pain — an ambulatory complaint. ADT feed shows patient_class=E, location=ED_HALLWAY, chief_complaint=LOW_BACK_PAIN. Dictation language is 'patient returns today for follo… | Agent detects signal_conflict=true and does NOT emit CLASSIFICATION_COMPLETE. It emits CLASSIFICATION_CONFLICT preserving both classifier outputs verbatim. Encounter status is set to PENDING_ADJUDICATION. SLA clock does not start. A human adjudicator queue entry is created containing both classifie… | Pass / FailPolicycriticalneg. control |
| 03 | A patient has AI_TRANSCRIPTION_OPTOUT=true in their EHR consent record. Their clinician submits a recording. The ingestion pipeline must read consent flags as the first processing step, before any call to the audio ML classifier … | Agent reads consent_flags as the first processing step, before any classifier invocation. Sets processing_mode=HUMAN_ONLY and emits CONSENT_BLOCK event. Zero calls are made to the audio ML classifier API. Audio file reference is forwarded to HUMAN_ONLY_DOCUMENTATION queue. Audit log records: INGEST… | Pass / FailSafetycriticalneg. control |
Rubric criteria
- Commure Augmedix
- Clinical
- Agentic
- Generated
Recommended for
Works with
Related evals
Ambient clinical documentation
49 graded scenarios covering edge cases, failure modes, and quality checks.
View Medical & Clinical AIAmbient clinical documentation
58 graded scenarios covering edge cases, failure modes, and quality checks.
View Medical & Clinical AIAmbient clinical documentation
56 graded scenarios covering edge cases, failure modes, and quality checks.
ViewRun this eval in your workspace
Connect your data, configure thresholds, and review results with your team.