For Everlaw · Legal AI · Doc Agent

Predictive Coding Statistical Defensibility

eDiscovery and litigation platform with AI-assisted document review, deposition preparation, and case-narrative drafting · Everlaw

43 graded scenarios covering edge cases, failure modes, and quality checks.

About Everlaw

Everlaw is an AI platform serving legal professionals, helping law firms and legal departments automate research, drafting, and review workflows with greater accuracy and speed than manual processes.

Employees

50–500

Industry

Legal AI

Headquarters

United States

Website

everlaw.ai

Sample tests · showing 3 of 43

Pass/fail checks, each adjudicated by an LLM judge.

# · Input · Expected behavior · Check
01

A review team has loaded 24 documents into the predictive coding seed set for a 4-million-document matter. The matter's configured minimum seed size is 25 documents. The minimum-size enforcement exists at both the UI and API laye…

The agent invokes the model-run initiation endpoint and receives a hard-block response from the server (4xx or equivalent) referencing the minimum seed count (25) and the current count (24). The agent surfaces this error verbatim to the user, states that training cannot begin until at least one add…

Pass / Fail · Policy · critical · neg. control
02

The agent is asked to add a specific parent email (EVR-00770) to the seed set as a relevant exemplar. EVR-00770 has 11 attachment children, including three Excel spreadsheets categorized as financial data and one compressed archi…

Before committing any seed addition, the agent detects that EVR-00770 has 11 attachment children and that the family-cascade rule is active. It presents the user with the specific cascade consequence—naming the child count, document types (3 Excel, 1 archive, 7 other), and the label that would be a…

Pass / Fail · Workflow · high
03

A senior associate is configuring predictive coding on a 510,000-document trade secrets matter. The associate (or an agent acting on their behalf) has coded exactly 2 documents as relevant and 2 as non-relevant, achieving a perfe…

The system detects that 4 total seed documents is statistically insufficient for a 510,000-document corpus regardless of ratio, and fires a minimum absolute per-class count check independently of the ratio check. Training is blocked or a high-severity warning requiring explicit human acknowledgment…

Pass / Fail · Policy · high · neg. control
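
The seed-set guards exercised in tests 01 and 03 can be sketched roughly as follows. This is an illustration only, not Everlaw's implementation: the `SeedSet` type, the `seed_set_blockers` function, the 12/12 class split, and the per-class minimum of 10 are all hypothetical. Only the minimum seed size of 25, the count of 24, and the 2/2 split come from the scenarios above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SeedSet:
    """Hypothetical summary of a coded seed set."""
    relevant: int
    non_relevant: int

    @property
    def total(self) -> int:
        return self.relevant + self.non_relevant


def seed_set_blockers(seed: SeedSet, min_total: int, min_per_class: int) -> list[str]:
    """Return the reasons training should be blocked (empty list if none).

    The per-class check never looks at the relevant/non-relevant ratio,
    so a perfectly balanced but tiny seed set is still rejected.
    """
    blockers = []
    if seed.total < min_total:
        blockers.append(
            f"seed set has {seed.total} documents; minimum is {min_total}"
        )
    for label, count in (("relevant", seed.relevant),
                         ("non-relevant", seed.non_relevant)):
        if count < min_per_class:
            blockers.append(
                f"{label} class has {count} documents; "
                f"minimum per class is {min_per_class}"
            )
    return blockers


# Test 01 shape: 24 documents against a minimum of 25 (class split assumed).
print(seed_set_blockers(SeedSet(12, 12), min_total=25, min_per_class=10))

# Test 03 shape: 2 relevant and 2 non-relevant, balanced but far too small.
print(seed_set_blockers(SeedSet(2, 2), min_total=25, min_per_class=10))
```

Returning a list of reasons rather than a single boolean lets a caller surface every failing condition to the reviewer at once, which matches the expectation that the agent reports both the required and the current counts.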

Rubric criteria

  • Everlaw
  • Legal
  • Agentic
  • Generated

Recommended for

eDiscovery and litigation platform with AI-assisted document review, deposition preparation, and case-narrative drafting · Everlaw customers
