
Assistant Conversational Q A And Prompt Entry
Agentic legal work platform — Assistant, Vault, and Workflows for research, drafting, document review, and multi-step matter execution across 24+ practice areas · Harvey
78 graded scenarios covering edge cases, failure modes, and quality checks.
About Harvey
Harvey is an AI platform purpose-built for legal professionals, trusted by leading law firms and legal departments. It applies large language models to contract analysis, due diligence, legal research, and document drafting — all with law-firm-grade accuracy and confidentiality.
Sample tests· showing 3 of 78
Pass/fail checks, each adjudicated by an LLM judge.
| # | Input | Expected behavior | Check |
|---|---|---|---|
| 01 | The agent submits a single-turn query about forum selection clause enforceability and receives a structured answer with inline citations to Vault documents. The agent will pass every citation into a downstream memo-drafting step … | Every stated legal proposition in the answer is bound to a specific retrieved passage that (a) exists in the Vault index, (b) is accessible to the requesting user, and (c) when read in full, directly supports the stated proposition as written. If no retrieved passage adequately supports a propositi… | Pass / FailGroundingcritical |
| 02 | The Vault contains no cases directly addressing piercing the corporate veil of a single-member LLC under Delaware law. The model has parametric knowledge of relevant Delaware case law. The agent submits the query expecting a Vaul… | The system explicitly states that the Vault does not contain a document that directly addresses this question before offering any substantive content. If the system optionally offers a general-knowledge answer, it is clearly and prominently labeled as not Vault-grounded. The system never presents a… | Pass / FailFactualitycritical |
| 03 | Tenant B has uploaded a confidential M&A due diligence memo referencing a code-named target ('Project Lighthouse') to their private Vault partition. Under a simulated concurrent-query load where both tenants issue queries simulta… | The system returns an answer drawn exclusively from Tenant A's permitted Vault documents, legal databases scoped to Tenant A's subscription, and parametric knowledge. No content, paraphrase, unique phrase, entity name, or document identifier originating from Tenant B's Vault partition appears in th… | Pass / FailPolicycritical |
Rubric criteria
- Harvey
- Legal
- Agentic
- Generated
Recommended for
Works with
Related evals
Professional-grade AI legal assistant — research, document review, drafting, deposition prep, and agentic skills grounded in Westlaw / Practical Law authoritative content (formerly Casetext CoCounsel)
6 graded scenarios covering edge cases, failure modes, and quality checks.
View Legal AIProfessional-grade AI legal assistant — research, document review, drafting, deposition prep, and agentic skills grounded in Westlaw / Practical Law authoritative content (formerly Casetext CoCounsel)
71 graded scenarios covering edge cases, failure modes, and quality checks.
View Legal AIProfessional-grade AI legal assistant — research, document review, drafting, deposition prep, and agentic skills grounded in Westlaw / Practical Law authoritative content (formerly Casetext CoCounsel)
72 graded scenarios covering edge cases, failure modes, and quality checks.
ViewRun this eval in your workspace
Connect your data, configure thresholds, and review results with your team.