Retrieval Augmented Generation Pipeline Five Stage Prompt Checking

Lexis+ AI and Protégé — conversational legal research, drafting, summarization, and document analysis grounded in LexisNexis authoritative content and Shepard's Citations · LexisNexis

47 graded scenarios covering edge cases, failure modes, and quality checks.

About LexisNexis

LexisNexis is RELX's legal and professional information and analytics business. Its legal-AI portfolio includes Lexis+ with Protégé, which combines legal content with research, drafting, and analysis workflows.

Employees: 11,900

Industry: Information and Analytics / Legal Technology

Website: www.lexisnexis.com/en-us/about-us/about-us.page

Sample tests · showing 3 of 47

Scored pass/fail or graded 1–5 by an LLM judge, depending on the test.

#	Input	Expected behavior	Check
01	An agent performing citation verification calls the lexical retrieval tool using the USCA abbreviation ('42 U.S.C.A. § 1983'). The U.S.C.A. corpus chunk contains both the enacted statutory text and West editorial keynotes in sequ…	The returned chunk clearly separates enacted statutory text (labeled '[Enacted statutory text — 42 U.S.C.A. § 1983]') from West editorial annotations (labeled '[West editorial annotation — not statutory text]'). The agent's downstream extraction draws only from the labeled enacted-text segment. The…	Pass / Fail · Factuality · critical
02	An agent is autonomously generating a HIPAA compliance memo and issues a citation lookup for the expert-determination de-identification safe harbor at 45 C.F.R. § 164.514(b)(2)(ii). The index stores the regulation at multiple gra…	The lexical stage tokenizes the citation as title=45, part=164, section=514, paragraph chain=(b)(2)(ii) and returns only the text of that sub-paragraph (conditions the expert must satisfy). It does not return (b)(1) statistical safe harbor text ('no more than 1 in 1000' threshold), the full § 164.5…	Pass / Fail · Grounding · critical
03	An agent is verifying a list of citations from opposing counsel's brief. One citation, '42 U.S.C. § 9999', is syntactically valid (correct title/section-number format) but does not correspond to an enacted provision in Title 42. …	The lexical stage returns a definitive NO-MATCH signal: '42 U.S.C. § 9999 was not found in the LexisNexis corpus.' The agent propagates this signal and marks the citation in its output as 'NOT VERIFIED — section does not exist.' If semantic fallback is invoked, its output is labeled '[Semantically …	Pass / Fail · Grounding · critical · neg. control
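
The tokenization described in sample test 02 can be illustrated with a minimal Python sketch. It is not the LexisNexis lexical stage: the regular expression, the `CfrCitation` type, the `parse_cfr_citation` and `lookup_exact` helpers, and the dictionary-based index are all hypothetical names chosen for illustration. It only shows how a citation might be split into title, part, section, and paragraph chain so that a lookup matches one sub-paragraph and nothing broader.

```python
import re
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

# Hypothetical pattern for citations shaped like "45 C.F.R. § 164.514(b)(2)(ii)".
CFR_PATTERN = re.compile(
    r"^\s*(?P<title>\d+)\s+C\.F\.R\.\s+§\s*"
    r"(?P<part>\d+)\.(?P<section>\d+)"
    r"(?P<chain>(?:\([0-9A-Za-z]+\))*)\s*$"
)


@dataclass(frozen=True)
class CfrCitation:
    """Structured form of a C.F.R. citation (illustrative only)."""
    title: int
    part: int
    section: int
    paragraph_chain: Tuple[str, ...]


def parse_cfr_citation(text: str) -> Optional[CfrCitation]:
    """Return the parsed citation, or None if the text is not in the assumed format."""
    match = CFR_PATTERN.match(text)
    if match is None:
        return None
    # "(b)(2)(ii)" becomes ("b", "2", "ii"); an absent chain becomes ().
    chain = tuple(re.findall(r"\(([0-9A-Za-z]+)\)", match.group("chain")))
    return CfrCitation(
        title=int(match.group("title")),
        part=int(match.group("part")),
        section=int(match.group("section")),
        paragraph_chain=chain,
    )


def lookup_exact(citation_text: str, index: Dict[CfrCitation, str]) -> Optional[str]:
    """Return text stored at exactly the cited granularity, or None (no match).

    The index is a stand-in: a dict keyed by parsed citation. A sibling such as
    (b)(1) or the parent section is never returned in place of the cited unit.
    """
    citation = parse_cfr_citation(citation_text)
    if citation is None:
        return None
    return index.get(citation)
```

Returning `None` instead of the nearest parent or sibling keeps a miss explicit, which is the behavior the no-match scenario in sample test 03 asks the agent to propagate.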

How this eval is graded

Scored pass/fail or graded 1–5 by an LLM judge, depending on the test.

Grade the agent's response against EACH criterion in expected.criteria independently (PASS/FAIL per criterion, using each criterion's match_criteria). The case passes only if EVERY criterion passes (all-pass) — partial completion fails. For negative criteria (is_negative=true), PASS means the agent did NOT exhibit the described behavior.
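
The all-pass rule above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the judge's actual implementation: the `Criterion` type and the `criterion_passes` and `grade_case` functions are hypothetical, and the sketch assumes the LLM judge has already decided, per criterion, whether the described behavior was observed. Only the `is_negative` flag and the all-pass aggregation come from the rubric.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Criterion:
    """One rubric criterion; is_negative mirrors the rubric's is_negative flag."""
    criterion_id: str
    is_negative: bool = False


def criterion_passes(criterion: Criterion, behavior_observed: bool) -> bool:
    """Positive criteria pass when the behavior is observed;
    negative criteria pass when it is NOT observed."""
    if criterion.is_negative:
        return not behavior_observed
    return behavior_observed


def grade_case(criteria: List[Criterion], observations: Dict[str, bool]) -> bool:
    """All-pass aggregation: the case passes only if every criterion passes.

    observations maps criterion_id to whether the judge saw the described
    behavior. A missing entry raises KeyError rather than passing silently.
    """
    return all(
        criterion_passes(criterion, observations[criterion.criterion_id])
        for criterion in criteria
    )
```

For example, with one positive and one negative criterion, `grade_case` returns `True` only when the positive behavior is observed and the negative behavior is not; any other combination fails the whole case.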

Rubric criteria

LexisNexis · Legal · Agentic · Lab-grade · All-pass

Recommended for

Lexis+ AI and Protégé — conversational legal research, drafting, summarization, and document analysis grounded in LexisNexis authoritative content and Shepard's Citations · LexisNexis customers

All-pass criteria · every check must pass

10 checks · sample case

C-001 · PASS if the returned chunk contains a distinct label marking the enacted statutory text segment (e.g., '[Enacted statutory text — 42 U.S.C.A. § 1983]'). FAIL if the enacted statutory text appears without an explicit content-type label identifying it as enacted text.
C-002 · PASS if every West editorial annotation (keynote, headnote, or editorial commentary) in the returned chunk carries a label identifying it as editorial material that is not statutory text (e.g., '[West editorial annotation — not statutory text]'). FAIL if any editorial annotation appears without such a label.
C-003 · PASS if the statutory language the agent extracts for the brief is sourced exclusively from the segment labeled as enacted statutory text. FAIL if the extraction includes any content originating outside the labeled enacted-text segment.
C-004 · PASS if the agent's response includes an explicit note that the U.S.C.A. is an annotated code containing West editorial material that is not part of the enacted statute. FAIL if no such disclosure appears anywhere in the response.
C-005 · NEG · PASS if no West keynote (e.g., a summary of a court holding) is presented or quoted in the agent's output as the text of 42 U.S.C.A. § 1983. FAIL if any keynote or case-holding summary is placed inside or adjacent to the enacted-text segment without a label, or is quoted in the brief-ready extraction as statutory language.
C-006 · NEG · PASS if the returned chunk preserves machine-readable content-type labels distinguishing enacted text from editorial material. FAIL if all content-type labels have been stripped such that enacted text and editorial annotations cannot be distinguished by label.
C-007 · NEG · PASS if the annotations returned are West U.S.C.A. annotations, or, if U.S.C.S. (LexisNexis) annotations are returned instead, the substitution is explicitly disclosed to the user. FAIL if U.S.C.S. annotations are returned in place of the requested U.S.C.A. annotations with no disclosure of the swap.
C-008 · NEG · PASS if the enacted statutory text and any keynotes in the returned chunk are separated by an explicit delimiter or labeled boundary. FAIL if enacted text and keynotes are concatenated as one undifferentiated block with no delimiter between them.
C-009 · NEG · PASS if the agent selects content for extraction based on its content-type labeling rather than its position in the chunk. FAIL if the agent uses the first substantive paragraph of the chunk as the statutory text regardless of whether that paragraph is labeled as enacted text or editorial material.
C-010 · PASS if the retrieved chunk and the agent's extraction correspond to 42 U.S.C.A. § 1983 as requested. FAIL if the agent retrieves or extracts text attributed to a different statute or section.
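
Criteria C-003, C-008, and C-009 describe selecting content by label rather than by position. A minimal Python sketch of that idea follows. The chunk layout is an assumption made for illustration (each segment begins with a bracketed label on its own line, and its body runs until the next label line), and the function names `split_labeled_segments` and `extract_enacted_text` are hypothetical. The label prefix is taken from the example label quoted in C-001.

```python
import re
from typing import List, Optional, Tuple

# Label prefix taken from the example label in C-001.
ENACTED_PREFIX = "Enacted statutory text"

# Assumed layout: a label is a whole line of the form "[...]".
LABEL_LINE = re.compile(r"^\[(?P<label>[^\]]+)\]\s*$")


def split_labeled_segments(chunk: str) -> List[Tuple[Optional[str], str]]:
    """Split a chunk into (label, body) pairs; unlabeled text gets label None."""
    segments: List[Tuple[Optional[str], str]] = []
    label: Optional[str] = None
    lines: List[str] = []

    def flush() -> None:
        body = "\n".join(lines).strip()
        if label is not None or body:
            segments.append((label, body))

    for line in chunk.splitlines():
        match = LABEL_LINE.match(line.strip())
        if match:
            flush()  # close the previous segment at the labeled boundary
            label = match.group("label")
            lines = []
        else:
            lines.append(line)
    flush()
    return segments


def extract_enacted_text(chunk: str) -> str:
    """Return the body of the single segment labeled as enacted text.

    Selection is by label, never by position. If there is no such segment,
    or more than one, raise ValueError instead of guessing.
    """
    enacted = [
        body
        for label, body in split_labeled_segments(chunk)
        if label is not None and label.startswith(ENACTED_PREFIX)
    ]
    if len(enacted) != 1:
        raise ValueError("expected exactly one segment labeled as enacted text")
    return enacted[0]
```

Because the selection ignores ordering, a chunk whose first paragraph is an editorial annotation still yields only the enacted-text body, and a chunk with its labels stripped raises an error rather than falling back to the first paragraph.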

Works with

LexisNexis

Related evals

Legal AI

Professional-grade AI legal assistant — research, document review, drafting, deposition prep, and agentic skills grounded in Westlaw / Practical Law authoritative content (formerly Casetext CoCounsel)

6 graded scenarios covering edge cases, failure modes, and quality checks.

Legal AI

Professional-grade AI legal assistant — research, document review, drafting, deposition prep, and agentic skills grounded in Westlaw / Practical Law authoritative content (formerly Casetext CoCounsel)

65 graded scenarios covering edge cases, failure modes, and quality checks.

Legal AI

Professional-grade AI legal assistant — research, document review, drafting, deposition prep, and agentic skills grounded in Westlaw / Practical Law authoritative content (formerly Casetext CoCounsel)

46 graded scenarios covering edge cases, failure modes, and quality checks.

Frequently asked questions

What does the Retrieval Augmented Generation Pipeline Five Stage Prompt Checking eval for LexisNexis Lexis+ AI and Protégé — conversational legal research, drafting, summarization, and document analysis grounded in LexisNexis authoritative content and Shepard's Citations test?

47 graded scenarios covering edge cases, failure modes, and quality checks.

How is the Retrieval Augmented Generation Pipeline Five Stage Prompt Checking eval scored?

Scored pass/fail or graded 1–5 by an LLM judge, depending on the test. The judge rubric: Grade the agent's response against EACH criterion in expected.criteria independently (PASS/FAIL per criterion, using each criterion's match_criteria). The case passes only if EVERY criterion passes (all-pass) — partial completion fails. For negative criteria (is_negative=true), PASS means the agent did NOT exhibit the described behavior.

How many test cases does this eval pack include?

The Retrieval Augmented Generation Pipeline Five Stage Prompt Checking pack for LexisNexis Lexis+ AI and Protégé — conversational legal research, drafting, summarization, and document analysis grounded in LexisNexis authoritative content and Shepard's Citations contains 47 test cases. 3 sample cases are shown free on this page; the full set runs in a Corsac workspace.

How do I run this eval?

Sign up for Corsac, connect your model or agent endpoint, and run the Retrieval Augmented Generation Pipeline Five Stage Prompt Checking pack as-is or after customizing thresholds. Results land in your workspace with per-case scores, and you can gate releases on the pack in CI via the REST API.
