Eval Library
For Unstructured · AI Platform

Partition Document To Elements

Unstructured (API + Platform) · Unstructured

Document ETL for LLMs — Unstructured (API + Platform)

Unstructured evals — Partition (document to elements) (relift v3 InfraRed)

About Unstructured

Unstructured turns unstructured documents (PDFs, Office files, HTML, images, email) into clean, structured, LLM-ready data — partitioning into typed elements, table/layout extraction, chunking, embedding, and a Platform with source/destination connectors. Developers use the Unstructured API and Platform to build the document ETL layer for RAG and agent pipelines.

Employees

~75

Industry

Document ETL

Headquarters

San Francisco, CA

Sample tests · showing 3 of 9

# · Input · Expected behavior · Check
01

Agent POSTs a contract PDF to /general/v0/general and receives a JSON array of elements. Downstream code only reads element.text and discards element.type.

Preserve each element's `type` (Title, NarrativeText, ListItem, Table, Image, Header, Footer, PageBreak, Address) alongside text. Downstream RAG/indexing must branch on type — e.g. keep Titles as section anchors, route Table elements to table handling — not flatten everything to a text blob.

Pass / Fail · AI Platform · high
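The branching that test 01 expects can be sketched as below. This is a minimal illustration, not Unstructured's own code: it assumes each element in the response is a JSON object with `type` and `text` keys, and the sample elements, the `route_by_type` helper, and the bucket names are all hypothetical.

```python
from collections import defaultdict

# Hypothetical sample of a partition response; real responses carry more fields.
elements = [
    {"type": "Title", "text": "1. Term"},
    {"type": "NarrativeText", "text": "This agreement begins on the effective date."},
    {"type": "Table", "text": "Fee Amount Setup 100"},
    {"type": "Footer", "text": "Page 1 of 3"},
]

def route_by_type(elements):
    """Group elements into handling buckets while keeping each element's type."""
    routes = defaultdict(list)
    for el in elements:
        kind = el["type"]
        if kind == "Title":
            routes["anchors"].append(el)      # section anchors
        elif kind == "Table":
            routes["tables"].append(el)       # table-specific handling
        elif kind in {"Header", "Footer", "PageBreak"}:
            routes["furniture"].append(el)    # page furniture, usually not indexed
        else:
            routes["body"].append(el)         # NarrativeText, ListItem, Address, ...
    return dict(routes)

routed = route_by_type(elements)
```

Every element stays intact inside its bucket, so `type` is still available to any later stage that needs it.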
02

An .eml email with sender, subject, quoted reply chain, and an attachment is partitioned. Agent indexes the whole thing as one NarrativeText.

Partition separates email metadata (sender/recipient/subject via element metadata) from body elements and quoted-reply blocks. Preserve the body element types and keep subject/sender available in metadata for downstream filtering, rather than collapsing the message into a single blob.

Pass / Fail · AI Platform · medium
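One way to keep email metadata available for filtering, as test 02 expects, is to index one record per element and copy the relevant metadata onto each record. The sketch below is illustrative only: the metadata key names (`subject`, `sent_from`), the sample data, and the `to_index_records` helper are assumptions, so check them against the actual response before relying on them.

```python
# Hypothetical partition output for one .eml file. The metadata keys
# ("subject", "sent_from") are assumptions made for illustration.
email_elements = [
    {"type": "NarrativeText", "text": "Please review the attached draft.",
     "metadata": {"subject": "Contract draft", "sent_from": ["ana@example.com"]}},
    {"type": "NarrativeText", "text": "> Can you send the latest version?",
     "metadata": {"subject": "Contract draft", "sent_from": ["ana@example.com"]}},
]

def to_index_records(elements):
    """Build one index record per element, carrying type and email metadata."""
    records = []
    for el in elements:
        meta = el.get("metadata", {})
        records.append({
            "text": el["text"],
            "type": el["type"],
            "subject": meta.get("subject"),
            "sent_from": meta.get("sent_from", []),
        })
    return records

records = to_index_records(email_elements)

# Downstream filtering by sender works because the metadata was not discarded.
from_ana = [r for r in records if "ana@example.com" in r["sent_from"]]
```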
03

A multi-section report is partitioned. The agent re-sorts elements alphabetically by text before chunking 'to dedupe.'

Preserve the reading-order sequence returned by partition — element order encodes document flow (Title precedes its NarrativeText, list items follow their lead-in). Do not re-sort; ordering is load-bearing for by_title chunking and parent_id hierarchy.

Pass / Fail · AI Platform · high
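If deduplication is the real goal in test 03, it can be done without sorting. The following is a minimal sketch of an order-preserving dedupe over element dictionaries; the `dedupe_preserving_order` helper and the sample data are hypothetical, and treating two elements as duplicates when their type and stripped text match is an assumption that may not suit every document.

```python
def dedupe_preserving_order(elements):
    """Drop repeated elements but keep the first occurrence in reading order."""
    seen = set()
    kept = []
    for el in elements:
        key = (el["type"], el["text"].strip())
        if key in seen:
            continue  # later duplicate: skip it, leave the order untouched
        seen.add(key)
        kept.append(el)
    return kept

# Hypothetical report fragment with a repeated footer.
report = [
    {"type": "Title", "text": "Summary"},
    {"type": "NarrativeText", "text": "Revenue grew in Q3."},
    {"type": "Footer", "text": "Confidential"},
    {"type": "Title", "text": "Appendix"},
    {"type": "Footer", "text": "Confidential"},
]

deduped = dedupe_preserving_order(report)
```

The surviving elements come out in their original sequence, so a Title still precedes the text that belongs to it.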

How this eval is graded

Grade against expected.ideal_behavior and expected.rubric. Per-criterion pass requires mean >= 4.0 and no criterion below 3.
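The threshold can be read as a two-part check, sketched below. This is one interpretation rather than the grader's actual implementation: it assumes each rubric criterion receives a single numeric score, and the `passes` function and the criterion names are hypothetical.

```python
from statistics import mean

def passes(scores, mean_threshold=4.0, floor=3):
    """Return True when the mean score meets the threshold and no score is below the floor."""
    values = list(scores.values())
    return mean(values) >= mean_threshold and min(values) >= floor

# Hypothetical criterion names and scores.
ok = passes({"preserves_type": 5, "routes_tables": 4, "keeps_titles": 3})
too_low = passes({"preserves_type": 5, "routes_tables": 5, "keeps_titles": 2})
```

In the second call the mean is still 4.0, but one criterion falls below 3, so the check fails.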

Rubric criteria

  • Unstructured
  • AI Platform
  • Partition Document To Elements

Recommended for

Unstructured (API + Platform) · Unstructured customers
