
Workflow Orchestration Agent Loop Control
Agent-first legal operations platform — matter intake, workflow orchestration, agentic review loops for legal teams · Manifest OS
16 graded scenarios covering edge cases, failure modes, and quality checks.
About Manifest OS
Manifest OS is an AI platform serving legal professionals, helping law firms and legal departments automate research, drafting, and review workflows with greater accuracy and speed than manual processes.
Sample tests · showing 3 of 16
Pass/fail checks, each adjudicated by an LLM judge.
| # | Input | Expected behavior | Check |
|---|---|---|---|
| 01 | Matter M-8821 has a petition-drafting task (task-C-draft) blocked on two prerequisites: (A) evidence package upload and (B) intake form completion. Both prerequisites complete on separate workers within a 50ms window. The orchest… | The orchestrator enqueues task-C-draft exactly once. The dispatch event log for M-8821 contains a single DISPATCHED record for task-C-draft. The agent queue contains exactly one message for task-C-draft. The idempotency mechanism (e.g., conditional write on a dispatched flag) prevents a second enqu… | Pass / Fail · Workflow · critical |
| 02 | Evidence collection task task-evidence-M5503 for matter M-5503 was marked COMPLETE in the in-memory completion cache at T=0ms. At T=5ms, the worker rolled back the task due to an external evidence-API failure and updated the DB r… | The dispatcher performs a synchronous read of task-evidence-M5503 status from the authoritative database (not the in-memory cache) immediately before evaluating dispatch eligibility. It reads status=FAILED, does not enqueue task-draft-petition-M5503, emits a BLOCKED_ON_FAILURE event for task-draft-… | Pass / Fail · Policy · critical |
| 03 | Matter M-7712 requires an H-1B petition draft. The task type key is ai_drafting:petition:H1B. A capability registry update pushed 2 hours ago reassigned that key from the AI Case Evaluator endpoint to the AI Drafter endpoint. The… | Before enqueuing, the dispatcher calls the live agent capability registry to resolve the current canonical agent endpoint for task type ai_drafting:petition:H1B. It receives the AI Drafter endpoint, confirms it differs from the cached entry, invalidates the stale cache entry, and dispatches to the … | Pass / Fail · Tool use · critical |
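
The three expected behaviors above share a pattern: check the authoritative source right before dispatch, then commit the dispatch with a conditional write. The following is a minimal Python sketch of that pattern, not Manifest OS's implementation. The `Dispatcher` class, its field names, and the in-memory dictionaries standing in for the database, registry, and queue are all hypothetical. Only the status values (`COMPLETE`, `FAILED`) and event names (`DISPATCHED`, `BLOCKED_ON_FAILURE`) come from the scenarios.

```python
import threading
from dataclasses import dataclass, field


@dataclass
class Dispatcher:
    """Hypothetical in-memory stand-in for an orchestrator's dispatch step."""
    db_status: dict   # assumed authoritative task status, keyed by task id
    registry: dict    # assumed live capability registry: task type -> endpoint
    endpoint_cache: dict = field(default_factory=dict)
    dispatched: set = field(default_factory=set)
    queue: list = field(default_factory=list)
    events: list = field(default_factory=list)
    _lock: threading.Lock = field(default_factory=threading.Lock)

    def try_dispatch(self, task_id, task_type, prerequisites):
        # Guard 1 (scenario 02): read prerequisite status from the
        # authoritative store, not from a completion cache that may be stale.
        statuses = [self.db_status.get(dep) for dep in prerequisites]
        if "FAILED" in statuses:
            self.events.append(("BLOCKED_ON_FAILURE", task_id))
            return False
        if any(status != "COMPLETE" for status in statuses):
            return False  # still waiting on at least one prerequisite

        # Guard 2 (scenario 03): resolve the endpoint from the live registry
        # and overwrite the cached entry if it no longer matches.
        endpoint = self.registry[task_type]
        if self.endpoint_cache.get(task_type) != endpoint:
            self.endpoint_cache[task_type] = endpoint

        # Guard 3 (scenario 01): conditional write on a dispatched flag, so
        # concurrent callers enqueue the task at most once.
        with self._lock:
            if task_id in self.dispatched:
                return False
            self.dispatched.add(task_id)
            self.queue.append((task_id, endpoint))
            self.events.append(("DISPATCHED", task_id))
        return True
```

In a real deployment the lock-protected check would more likely be a conditional update in the database itself (for example, an update that succeeds only when the dispatched flag is still unset), since the workers in scenario 01 run in separate processes. The lock here only illustrates the check-and-set step within a single process.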
Rubric criteria
- Manifest OS
- Legal
- Agentic
- Generated