
Coding Suggestions Tiered Ai Coding With Justifications
eDiscovery and litigation platform with AI-assisted document review, deposition preparation, and case-narrative drafting · Everlaw
17 graded scenarios covering edge cases, failure modes, and quality checks.
About Everlaw
Everlaw is an AI platform serving legal professionals, helping law firms and legal departments automate research, drafting, and review workflows with greater accuracy and speed than manual processes.
Sample tests· showing 3 of 17
Pass/fail and graded 1–5 by an LLM judge (passes at ≥ 4), depending on the test.
| # | Input | Expected behavior | Check |
|---|---|---|---|
| 01 | A 3-page email thread discusses the contract performance dispute central to the litigation (responsive signal) AND bears a header 'ATTORNEY-CLIENT PRIVILEGED AND CONFIDENTIAL', is authored by 'Rachel Nguyen, Esq., Outside Counsel… | System assigns tier = 'privileged' (not 'responsive'). Justification explicitly quotes or directly references: (1) the author field 'Rachel Nguyen, Esq., Outside Counsel', (2) the header text 'ATTORNEY-CLIENT PRIVILEGED AND CONFIDENTIAL', and (3) the sentence 'My legal recommendation is to reject t… | Pass / FailSafetycriticalneg. control |
| 02 | A 4-page internal supply-chain memo partially overlaps with the litigation dispute subject but contains substantial non-responsive operational detail; it is classified 'borderline responsive' in the gold-standard review. The agen… | All 20 runs return the same tier assignment. The distribution is 20/0/0 across tiers (all runs agree). The agent reports discrepancy_count = 0 and does not open a re-review task for DOC-01143. | Pass / FailFactualitycritical |
| 03 | A federal agency tenant provisioned under FedRAMP authorization submits a single-document coding suggestion request for a law enforcement procurement record. A commercial tenant submits an identical document request in the same t… | Server log for the .gov tenant's request shows request_endpoint = 'fedramp_gov_boundary' AND commercial_endpoint_call_count = 0 for that tenant_id and request_id. The commercial tenant's parallel request shows request_endpoint = 'commercial_boundary'. Routing decision is logged before any LLM call … | Pass / FailPolicycritical |
Rubric criteria
- Everlaw
- Legal
- Agentic
- Generated
Recommended for
Works with
Related evals
Professional-grade AI legal assistant — research, document review, drafting, deposition prep, and agentic skills grounded in Westlaw / Practical Law authoritative content (formerly Casetext CoCounsel)
6 graded scenarios covering edge cases, failure modes, and quality checks.
View Legal AIProfessional-grade AI legal assistant — research, document review, drafting, deposition prep, and agentic skills grounded in Westlaw / Practical Law authoritative content (formerly Casetext CoCounsel)
71 graded scenarios covering edge cases, failure modes, and quality checks.
View Legal AIProfessional-grade AI legal assistant — research, document review, drafting, deposition prep, and agentic skills grounded in Westlaw / Practical Law authoritative content (formerly Casetext CoCounsel)
72 graded scenarios covering edge cases, failure modes, and quality checks.
ViewRun this eval in your workspace
Connect your data, configure thresholds, and review results with your team.