Metadata And Element Schema
Document ETL for LLMs — Unstructured (API + Platform)
Unstructured evals — Metadata & Element Schema (relift v3 InfraRed)
About Unstructured
Unstructured turns unstructured documents (PDFs, Office files, HTML, images, email) into clean, structured, LLM-ready data — partitioning into typed elements, table/layout extraction, chunking, embedding, and a Platform with source/destination connectors. Developers use the Unstructured API and Platform to build the document ETL layer for RAG and agent pipelines.
Sample tests (showing 3 of 9)
| # | Input | Expected behavior | Check | Category | Severity |
|---|---|---|---|---|---|
| 01 | Citations must deep-link to the source page, but the agent's index stores only element.text and drops metadata.page_number. | Carry metadata.page_number (and filename) through chunking and into the vector store so retrieved chunks can deep-link to the exact source page. Verify page_number survives the chunking step (on the chunk or via orig_elements). | Pass / Fail | AI Platform | high |
| 02 | A mixed-source index needs per-document filtering, but the agent never persists metadata.filename / metadata.filetype. | Persist metadata.filename and metadata.filetype on every indexed element/chunk so retrieval can filter by document and type. Use a stable source identifier (filename + a content hash) so re-ingested versions are reconcilable. | Pass / Fail | AI Platform | medium |
| 03 | An HTML document's hyperlinks must be preserved for citation, but the agent keeps only visible text and drops metadata.link_urls. | Preserve metadata.link_urls and link_texts so hyperlinks survive into the index for citation and navigation. Keep the text↔url association intact rather than storing a bare list of URLs with no anchor text. | Pass / Fail | AI Platform | low |
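The three sample tests share one pattern: metadata has to travel with the text into the index. A minimal sketch of that pattern follows, assuming elements arrive as plain dicts shaped like Unstructured's JSON output (a `text` field plus a `metadata` dict holding `filename`, `filetype`, `page_number`, `link_urls`, and `link_texts`). The helper names `source_id`, `to_index_record`, and `missing_fields`, the record layout, the 12-character hash prefix, and the sample data are all hypothetical. It also assumes `link_texts` and `link_urls` are parallel lists.

```python
import hashlib


def source_id(filename: str, content: bytes) -> str:
    """Hypothetical stable identifier: filename plus a short content hash."""
    digest = hashlib.sha256(content).hexdigest()[:12]
    return f"{filename}#{digest}"


def to_index_record(element: dict, content: bytes) -> dict:
    """Flatten one element dict into an index record without dropping metadata."""
    meta = element.get("metadata", {})
    urls = meta.get("link_urls") or []
    texts = meta.get("link_texts") or []
    return {
        "text": element.get("text", ""),
        "source_id": source_id(meta.get("filename", ""), content),
        "filename": meta.get("filename"),
        "filetype": meta.get("filetype"),
        "page_number": meta.get("page_number"),
        # Assumption: texts and urls are parallel lists, so zip keeps each
        # anchor text paired with its URL instead of storing a bare URL list.
        "links": [{"text": t, "url": u} for t, u in zip(texts, urls)],
    }


def missing_fields(record: dict, required=("filename", "filetype", "page_number")) -> list:
    """Return the required metadata keys that did not survive into the record."""
    return [key for key in required if record.get(key) is None]


# Made-up sample element for illustration only.
element = {
    "type": "NarrativeText",
    "text": "See the install guide for details.",
    "metadata": {
        "filename": "guide.html",
        "filetype": "text/html",
        "page_number": 1,
        "link_texts": ["install guide"],
        "link_urls": ["https://example.com/install"],
    },
}

record = to_index_record(element, b"<html>...</html>")
dropped = missing_fields(record)
```

A check such as `missing_fields` can run after the chunking step as well as before indexing, since test 01 asks that `page_number` be verified on the chunk itself (or recovered via `orig_elements`).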
How this eval is graded
Grade against expected.ideal_behavior and expected.rubric. Per-criterion pass requires mean >= 4.0 and no criterion below 3.
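One reading of the pass rule above, sketched in Python: the mean is taken across the rubric criteria, and every individual criterion must also clear the floor of 3. The function name `passes`, the score dictionary layout, and the criterion names in the example are assumptions for illustration; the page does not state the score scale or how the mean is aggregated.

```python
from statistics import mean


def passes(scores: dict[str, float], min_mean: float = 4.0, floor: float = 3.0) -> bool:
    """Pass only if the mean meets min_mean and no criterion falls below floor."""
    values = list(scores.values())
    if not values:
        return False  # nothing scored, so nothing to pass
    return mean(values) >= min_mean and min(values) >= floor


# Hypothetical criterion names and scores.
ok = passes({"page_number_kept": 5, "filename_kept": 4, "links_kept": 3})
low_criterion = passes({"page_number_kept": 5, "filename_kept": 5, "links_kept": 2})
low_mean = passes({"page_number_kept": 4, "filename_kept": 4, "links_kept": 3})
```

The second case fails even though its mean is 4.0, because one criterion sits below the floor; the third fails on the mean alone.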
Rubric criteria
- Unstructured
- AI Platform
- Metadata And Element Schema