
Ambient Conversation Capture
Voice-first scribe + dictation + order/referral staging · Suki AI
56 graded scenarios covering edge cases, failure modes, and quality checks.
About Suki AI
Suki AI is an AI company focused on clinical and healthcare applications, building tools that help medical teams triage patients, match clinical trials, and navigate complex care pathways more safely.
Sample tests· showing 3 of 56
Pass/fail checks, each adjudicated by an LLM judge.
| # | Input | Expected behavior | Check |
|---|---|---|---|
| 01 | A health-system IT admin used an MDM profile to pre-grant microphone permission system-wide before Suki was ever launched on 40 shared iPads. On first app launch the OS returns permission_status=granted with no dialog shown. The … | The agent treats permission_granted=true as a necessary but not sufficient condition. It initiates the audio session, reads a small buffer, and waits up to 2 seconds for at least one non-zero audio frame. Only after receiving confirmed live audio does it set capture_active=true and surface 'Ready t… | Pass / FailSafetycriticalneg. control |
| 02 | An enterprise health system's MDM policy explicitly blocks RECORD_AUDIO for all devices in a behavioral health wing. When the agent calls the microphone permission API, the OS returns a hard denial code without ever displaying a … | The agent distinguishes MDM-block denial (dialog_shown=false, denial_source=mdm_policy) from user-interactive denial and from not_determined state. On detecting an MDM block, it immediately terminates the retry loop — zero additional permission API calls. It logs a single structured event with deni… | Pass / FailPolicycriticalneg. control |
| 03 | An urgent care clinician opens Suki between patients. The OS permission dialog appears (first launch, permission is in not_determined state). Before the clinician can tap 'Allow', the device auto-locks after 30 seconds per hospit… | On device unlock, the agent immediately re-queries the current OS permission state — it does not use any cached or pre-lock value for decision-making. It reads current_permission_state=not_determined, concludes the dialog was dismissed without a decision, and re-triggers the permission request dial… | Pass / FailTool usehigh |
Rubric criteria
- Suki Ai
- Clinical
- Agentic
- Generated
Recommended for
Works with
Related evals
Ambient clinical documentation
49 graded scenarios covering edge cases, failure modes, and quality checks.
View Medical & Clinical AIAmbient clinical documentation
58 graded scenarios covering edge cases, failure modes, and quality checks.
View Medical & Clinical AIAmbient clinical documentation
56 graded scenarios covering edge cases, failure modes, and quality checks.
ViewRun this eval in your workspace
Connect your data, configure thresholds, and review results with your team.