
Parse Document To Structured


Document Ingestion & Parsing for AI — Reducto

Reducto evals — Parse (Document to Structured) (relift v3 InfraRed)

About Reducto

Reducto is a document ingestion platform for AI pipelines that turns complex documents (PDFs, scans, spreadsheets) into clean, structured, layout-aware data. Its API parses documents into Markdown and typed content blocks, extracts structured fields against a user-defined schema with source citations, and splits bundled files into their constituent documents — feeding retrieval-augmented generation and document-automation workflows.

Employees

~50 (approx — verify)

Industry

Document AI / Data Ingestion

Headquarters

San Francisco, CA (verify)

Website

reducto.ai

Sample tests · showing 3 of 9

# · Input · Expected behavior · Check
01

Client POSTs a PDF to the /parse endpoint and receives a result containing the document rendered as layout-aware Markdown plus a structured list of content blocks (each with a type and bounding-box position). The integrator assum…

Consume both the Markdown rendering AND the per-block structure: each block carries a type (heading, paragraph, table, figure, list, etc.) and page-relative position. Downstream RAG indexing should preserve block boundaries rather than collapsing to one string, so retrieval can cite a specific bloc…

Pass / Fail · AI Platform · high
02

Integrator needs clean Markdown for an LLM prompt but configures the parse request to return the most verbose structured JSON, then re-serializes it to Markdown themselves in application code.

Request the output representation that matches the downstream consumer: a Markdown/text rendering for LLM prompting, the structured block JSON for programmatic indexing. Do not hand-roll a Markdown serializer over the JSON — that reintroduces the reading-order and layout bugs Reducto already solved…

Pass / Fail · AI Platform · medium
03

A 40-page report has two-column layouts on some pages and single-column on others. The integrator concatenates parsed text in raw top-to-bottom pixel order without trusting Reducto's reading-order reconstruction.

Rely on Reducto's reading-order reconstruction (which sequences multi-column and interleaved layouts into logical reading order) rather than naive top-to-bottom sort. Verify on a sample of mixed-layout pages that columns are not interleaved mid-sentence before trusting the order in bulk indexing.

Pass / Fail · AI Platform · high
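
The behavior expected in tests 01 and 03 can be sketched as follows. This is a minimal illustration under stated assumptions: the `Block` dataclass and its fields (`type`, `page`, `top`, `left`, `text`) are hypothetical stand-ins for the parsed blocks, not Reducto's documented response schema, and the blocks are assumed to arrive already in the parser's reading order.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    # Hypothetical block shape; field names are assumptions, not Reducto's schema.
    type: str
    page: int
    top: float   # page-relative position, 0.0 = top of page
    left: float  # page-relative position, 0.0 = left edge
    text: str


def to_index_records(blocks):
    """Keep one record per block so retrieval can cite a specific block."""
    return [
        {"id": f"p{b.page}-b{i}", "type": b.type, "page": b.page, "text": b.text}
        for i, b in enumerate(blocks)
    ]


def naive_pixel_order(blocks):
    """The anti-pattern from test 03: sort top-to-bottom and ignore columns."""
    return sorted(blocks, key=lambda b: (b.page, b.top, b.left))


def order_differs(blocks):
    """True when a naive sort would reorder the parser's sequence."""
    return [b.text for b in blocks] != [b.text for b in naive_pixel_order(blocks)]


# A two-column page: the parser's order reads the left column, then the right.
parsed = [
    Block("paragraph", 1, 0.10, 0.05, "Left column, first paragraph."),
    Block("paragraph", 1, 0.50, 0.05, "Left column, second paragraph."),
    Block("paragraph", 1, 0.10, 0.55, "Right column, first paragraph."),
    Block("paragraph", 1, 0.50, 0.55, "Right column, second paragraph."),
]

records = to_index_records(parsed)    # one citable record per block
interleaved = order_differs(parsed)   # naive sorting would mix the columns
```

On this sample, the naive sort places the first right-column paragraph between the two left-column paragraphs. That is the mid-column interleaving a spot check on mixed-layout pages is meant to catch before bulk indexing.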

How this eval is graded

Grade against expected.ideal_behavior and expected.rubric. Per-criterion pass requires mean >= 4.0 and no criterion below 3.
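
The pass rule can be sketched as a small check. This assumes one reading of the rule: scores are numeric, the mean is taken across criteria for a single test, and the floor applies to each criterion individually. The criterion names below are hypothetical.

```python
from statistics import mean


def passes(scores, mean_threshold=4.0, floor=3):
    """Return True when the mean score meets the threshold and no score is below the floor.

    `scores` maps a criterion name to its numeric score. The names are
    illustrative; the source does not list the actual rubric criteria.
    """
    if not scores:
        return False  # nothing graded, so nothing can pass
    values = list(scores.values())
    return mean(values) >= mean_threshold and min(values) >= floor


ok = passes({"uses_blocks": 5, "preserves_order": 4, "cites_source": 3})
low_floor = passes({"uses_blocks": 5, "preserves_order": 5, "cites_source": 2})
low_mean = passes({"uses_blocks": 4, "preserves_order": 4, "cites_source": 3})
```

The second case shows why both conditions matter: its mean is exactly 4.0, but one criterion falls below 3, so it fails.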

Rubric criteria

  • Reducto
  • AI Platform
  • Parse Document To Structured

Recommended for

Reducto customers
