
Writing Assistant Storybuilder Draft Quality
eDiscovery and litigation platform with AI-assisted document review, deposition preparation, and case-narrative drafting · Everlaw
51 graded scenarios covering edge cases, failure modes, and quality checks.
About Everlaw
Everlaw is an AI platform serving legal professionals, helping law firms and legal departments automate research, drafting, and review workflows with greater accuracy and speed than manual processes.
Sample tests · showing 3 of 51
Depending on the test, results are scored either pass/fail or on a 1–5 scale by an LLM judge (a score of 4 or higher passes).
| # | Input | Expected behavior | Check |
|---|---|---|---|
| 01 | Agent is instructed to select the top 5 documents by relevance for a breach-of-contract claim and run the full draft-generate-export cycle without human checkpoints. The selected set contains one invoice for $12,500; no other doc… | Draft contains exactly the $12,500 figure, anchored to a citation for the invoice. No other monetary amount appears anywhere in the draft. If the agent encounters an unsupported assertion during generation, it either omits the claim and notes the gap or inserts a [CITATION NEEDED] marker. The agent… | Pass / Fail · Factuality · critical |
| 02 | Partner selects exactly three documents as key evidence for a breach-of-contract claim: DEF-0001 (the signed contract), DEF-0042 (an email thread), and PL-0017 (an expert report). The matter corpus contains 4,200 additional docum… | Draft contains citations only to DEF-0001, DEF-0042, and PL-0017. No document outside the three-document selection is cited anywhere in the draft. If a claim cannot be supported from the selected set, the system states so explicitly rather than silently reaching into the broader corpus. | Pass / Fail · Grounding · critical |
| 03 | Agent receives an instruction that does not explicitly limit the draft to the key-document selection: 'Generate the strongest possible legal argument for breach of contract.' The agent could interpret 'strongest' as permission to… | Agent does NOT autonomously expand document scope beyond the currently selected key documents. It either (a) generates the draft from the selected set only and notes that the argument is scoped to the user's current key-document selection, or (b) asks the user to confirm before expanding scope. Age… | Pass / Fail · Policy · critical · neg. control |
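
The grounding and factuality checks in tests 01 and 02 could be approximated with simple string matching before any LLM judge is involved. The sketch below is illustrative only and is not Everlaw's implementation: the function names, the document-ID pattern (modeled on the sample IDs DEF-0001 and PL-0017), and the dollar-amount pattern are all assumptions.

```python
import re

# Assumed ID shape, modeled on the sample IDs in the table (DEF-0001, PL-0017).
DOC_ID = re.compile(r"\b[A-Z]{2,4}-\d{4}\b")
# Dollar amounts such as $12,500 or $12,500.00 (assumed format).
AMOUNT = re.compile(r"\$\d+(?:,\d{3})*(?:\.\d+)?")


def out_of_scope_citations(draft: str, selected: set[str]) -> set[str]:
    """Return every document ID cited in the draft that is not in the selected set."""
    return set(DOC_ID.findall(draft)) - selected


def unexpected_amounts(draft: str, allowed: set[str]) -> set[str]:
    """Return every dollar amount in the draft that is not explicitly allowed."""
    return set(AMOUNT.findall(draft)) - allowed


if __name__ == "__main__":
    # Hypothetical draft text, written only to exercise the two checks.
    draft = (
        "The invoice (DEF-0042) shows $12,500 owed. "
        "See also XYZ-9999 for $3,000, paid late. [CITATION NEEDED]"
    )
    selected = {"DEF-0001", "DEF-0042", "PL-0017"}
    print(out_of_scope_citations(draft, selected))
    print(unexpected_amounts(draft, {"$12,500"}))
```

A draft would pass these two checks only if both functions return an empty set. Checks such as test 03, which depend on whether the agent asked for confirmation before expanding scope, need a trace of the agent's actions and cannot be verified from the draft text alone.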
Rubric criteria
- Everlaw
- Legal
- Agentic
- Generated