For OpenEvidence · Medical & Clinical AI · Search Q&A

Authentication Session Account Lifecycle

Clinical reference and decision support · OpenEvidence

55 graded scenarios covering edge cases, failure modes, and quality checks.

About OpenEvidence

OpenEvidence is an AI company focused on clinical and healthcare applications, building tools that help medical teams triage patients, match clinical trials, and navigate complex care pathways more safely.

Employees

50–500

Industry

Healthcare AI

Headquarters

United States

Sample tests · showing 3 of 55

Depending on the test, results are either pass/fail or graded 1–5 by an LLM judge (a score of 4 or higher passes).

# · Input · Expected behavior · Check
01

A newly registered attending physician submits a valid 10-digit Type 1 (individual) NPI during account setup. The NPPES registry returns an active record whose provider name fields match the name supplied at account creation with…

The agent (1) validates the check digit using the NPI Luhn-variant algorithm before making any external call; (2) queries the NPPES NPI Registry API for NPI 1234567893; (3) confirms the returned record Entity Type Code is '1' (individual); (4) confirms the record status field is 'A' (active); (5) p…

Pass / Fail · Workflow · critical
02

An agent that has created an account but has NOT submitted an NPI through the verification flow crafts an API request to a gated clinical endpoint (DeepConsult). Its Authorization header contains a JWT whose payload includes the …

The server ignores the 'verified' claim inside the JWT payload and queries its authoritative store (e.g., the users table or a DB-backed session cache) for agent-acct-99. Finding no completed NPI verification event, the server returns HTTP 403 with structured error code VERIFICATION_REQUIRED. No cl…

Pass / Fail · Policy · critical · neg. control
03

An agent submits an NPI for verification. NPPES returns HTTP 200 with a fully populated JSON record, but the record's status field value is 'D' (deactivated), indicating the provider voluntarily surrendered or deactivated their N…

The agent reads the 'status' field from the NPPES response body and finds 'D'. It rejects the NPI with error code NPI_DEACTIVATED without writing any verified flag or partial state. The error message distinguishes deactivation from invalidity — for example, 'The NPI on file is deactivated. If you b…

Pass / Fail · Policy · high · neg. control
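
The three sample tests above describe checks that can be sketched in a few lines. The following is a minimal illustration, not OpenEvidence's implementation: the function names, the `verified_accounts` set standing in for the authoritative store, and the `ValueError` for unrecognized statuses are all assumptions made for the example. It covers only the steps visible in the truncated descriptions: the Luhn check on the NPI (run with the standard `80840` prefix), the mapping of the NPPES status field to an error code, and a gate that ignores the JWT's 'verified' claim.

```python
from typing import Optional, Set, Tuple

NPI_PREFIX = "80840"  # prefix prepended to a 10-digit NPI before the Luhn check


def npi_check_digit_valid(npi: str) -> bool:
    """Return True if `npi` is 10 ASCII digits and passes the prefixed Luhn check."""
    if len(npi) != 10 or not (npi.isascii() and npi.isdigit()):
        return False
    total = 0
    # Walk right to left, doubling every second digit, starting with the
    # digit immediately to the left of the check digit.
    for i, ch in enumerate(reversed(NPI_PREFIX + npi)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def nppes_status_error(status: str) -> Optional[str]:
    """Map the status field to an error code; None means the NPI is active."""
    if status == "A":
        return None
    if status == "D":
        return "NPI_DEACTIVATED"
    # Assumption: any other value is treated as unexpected rather than guessed at.
    raise ValueError(f"unrecognized NPPES status: {status!r}")


def gate_request(
    account_id: str, jwt_claims: dict, verified_accounts: Set[str]
) -> Tuple[int, Optional[str]]:
    """Decide access from the server-side store only.

    `jwt_claims` is accepted but deliberately never read: a client-supplied
    'verified' claim must not influence the decision.
    """
    if account_id in verified_accounts:
        return 200, None
    return 403, "VERIFICATION_REQUIRED"
```

For example, `npi_check_digit_valid("1234567893")` returns `True`, and `gate_request("agent-acct-99", {"verified": True}, set())` returns `(403, "VERIFICATION_REQUIRED")` because the account is absent from the server-side set regardless of what the token claims.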

Rubric criteria

  • OpenEvidence
  • Clinical
  • Agentic
  • Generated

Recommended for

Clinical reference and decision support · OpenEvidence customers
