For Reducto · AI Platform · Answer Relevance

Extract Schema And Citations

Reducto

Document Ingestion & Parsing for AI — Reducto

Reducto evals — Extract (Schema-driven + Citations) (relift v3 InfraRed)

About Reducto

Reducto is a document ingestion platform for AI pipelines that turns complex documents (PDFs, scans, spreadsheets) into clean, structured, layout-aware data. Its API parses documents into Markdown and typed content blocks, extracts structured fields against a user-defined schema with source citations, and splits bundled files into their constituent documents — feeding retrieval-augmented generation and document-automation workflows.

Employees

~50 (approx — verify)

Industry

Document AI / Data Ingestion

Headquarters

San Francisco, CA (verify)

Website

reducto.ai

Sample tests · showing 3 of 9

# · Input · Expected behavior · Check
01

Integrator calls /extract with a JSON schema describing the fields to pull from an invoice (invoice_number, total, line_items[]). They make every field a loosely typed string and add no field descriptions.

Define a precise schema: correct types (numbers as numbers, dates as dates), arrays for repeated structures (line_items[]), and per-field descriptions that disambiguate (e.g., 'total' = grand total incl. tax). A well-specified schema is the primary lever on extraction accuracy. Validate returned va…

Pass / Fail · AI Platform · high
02

Extract returns each field value alongside a citation pointing to the source location in the document (page + region/block). The integrator stores only the values and discards the citations.

Persist the citation alongside each extracted value so the field is auditable and a human can verify it against the source. Source-grounded extraction is the core trust feature — discarding citations turns a verifiable extraction into an unverifiable guess. Surface citations in any human-review UI.…

Pass / Fail · AI Platform · critical
03

Integrator already parses every document, then sends the full parsed text to a general LLM with an ad-hoc 'pull these fields' prompt instead of using /extract.

Prefer the purpose-built /extract path (schema-constrained, source-cited) over an ad-hoc LLM prompt for structured field extraction: it returns typed, validated, grounded output without bespoke prompt engineering or a separate hallucination guard. Reserve the parse-then-LLM path for open-ended summ…

Pass / Fail · AI Platform · medium
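
The behaviors expected in tests 01 and 02 can be sketched in Python as follows. This is an illustration under stated assumptions, not Reducto's documented API: the schema uses standard JSON Schema keywords, but the per-field response shape (a `value` plus a `citation` carrying `page` and `bbox`), the `invoice_date` field, and the names `INVOICE_SCHEMA`, `EXAMPLE_FIELDS`, `check_types`, and `to_audit_rows` are all hypothetical. Check the Reducto API reference for the real request and response formats.

```python
from typing import Any

# Hypothetical invoice schema: real types, an array for repeated rows,
# and a description on every field to disambiguate what is wanted.
INVOICE_SCHEMA: dict[str, Any] = {
    "type": "object",
    "properties": {
        "invoice_number": {
            "type": "string",
            "description": "Invoice identifier as printed on the document.",
        },
        "invoice_date": {  # assumed extra field, shown to illustrate dates
            "type": "string",
            "format": "date",
            "description": "Issue date in ISO 8601 form (YYYY-MM-DD).",
        },
        "total": {
            "type": "number",
            "description": "Grand total including tax, not the subtotal.",
        },
        "line_items": {
            "type": "array",
            "description": "One entry per billed line on the invoice.",
            "items": {
                "type": "object",
                "properties": {
                    "description": {"type": "string", "description": "Item text."},
                    "quantity": {"type": "number", "description": "Units billed."},
                    "amount": {"type": "number", "description": "Line amount."},
                },
            },
        },
    },
    "required": ["invoice_number", "total", "line_items"],
}

_PY_TYPES = {"string": str, "number": (int, float), "array": list, "object": dict}


def check_types(values: dict[str, Any], schema: dict[str, Any]) -> list[str]:
    """Return top-level field names that are required but missing, or mistyped."""
    problems = [name for name in schema.get("required", []) if name not in values]
    for name, spec in schema["properties"].items():
        if name not in values:
            continue
        value = values[name]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, _PY_TYPES[spec["type"]]):
            problems.append(name)
    return problems


def to_audit_rows(fields: dict[str, dict[str, Any]]) -> list[dict[str, Any]]:
    """Flatten fields into rows that keep each value next to its citation."""
    rows = []
    for name, field in fields.items():
        citation = field.get("citation")
        rows.append({
            "field": name,
            "value": field["value"],
            "page": citation["page"] if citation else None,
            "bbox": citation["bbox"] if citation else None,
            "needs_review": citation is None,  # uncited values go to a human
        })
    return rows


# Hypothetical response payload, in the assumed shape described above.
EXAMPLE_FIELDS = {
    "invoice_number": {"value": "INV-001",
                       "citation": {"page": 1, "bbox": [0.1, 0.1, 0.3, 0.15]}},
    "total": {"value": "118.00",  # a string where the schema asks for a number
              "citation": {"page": 2, "bbox": [0.6, 0.8, 0.9, 0.85]}},
    "line_items": {"value": [{"description": "Widget", "quantity": 2, "amount": 100.0}],
                   "citation": None},
}
```

With this sample payload, `check_types({k: f["value"] for k, f in EXAMPLE_FIELDS.items()}, INVOICE_SCHEMA)` flags only `total`, because the value arrived as a string. `to_audit_rows(EXAMPLE_FIELDS)` yields one row per field, and the `line_items` row is marked `needs_review` because it has no citation.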

How this eval is graded

Grade each response against expected.ideal_behavior and expected.rubric. A test passes when the mean of its per-criterion scores is at least 4.0 and no single criterion scores below 3.
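
The pass rule can be written as a short Python check. This is a sketch under assumptions: the function name `passes`, the criterion names in the usage note, and the reading that the mean is taken across all criterion scores for one test are illustrative, and the scoring scale is not stated on this page.

```python
from statistics import mean


def passes(scores: dict[str, float],
           mean_threshold: float = 4.0,
           floor: float = 3.0) -> bool:
    """Pass when the mean score meets the threshold and no criterion is below the floor."""
    if not scores:
        raise ValueError("at least one criterion score is required")
    values = list(scores.values())
    return mean(values) >= mean_threshold and min(values) >= floor
```

For example, `passes({"schema_precision": 5, "citation_handling": 4, "validation": 3})` is true (mean 4.0, lowest score 3), while `passes({"schema_precision": 5, "citation_handling": 5, "validation": 2})` is false: the mean is still 4.0, but one criterion falls below 3.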

Rubric criteria

  • Reducto
  • AI Platform
  • Extract Schema And Citations

Recommended for

Reducto · Reducto customers
