Unstructured evals — Metadata & Element Schema (relift v3 InfraRed)

About Unstructured

Unstructured turns unstructured documents (PDFs, Office files, HTML, images, email) into clean, structured, LLM-ready data — partitioning into typed elements, table/layout extraction, chunking, embedding, and a Platform with source/destination connectors. Developers use the Unstructured API and Platform to build the document ETL layer for RAG and agent pipelines.

Employees

~75

Industry

Document ETL

Headquarters

San Francisco, CA

Sample tests · showing 3 of 9

# · Input · Expected behavior · Check
01

Citations must deep-link to the source page, but the agent's index stores only element.text and drops metadata.page_number.

Carry metadata.page_number (and filename) through chunking and into the vector store so retrieved chunks can deep-link to the exact source page. Verify page_number survives the chunking step (on the chunk or via orig_elements).

Pass / Fail · AI Platform · high
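A minimal sketch of the behavior test 01 expects, assuming chunks arrive as plain dicts with "text" and "metadata" keys and that orig_elements, when present, is an already-decoded list of element dicts. The function name to_index_record and the record layout are hypothetical, not part of the Unstructured API.

```python
from typing import Any


def to_index_record(chunk: dict[str, Any]) -> dict[str, Any]:
    """Build a vector-store record that keeps page and file provenance."""
    meta = chunk.get("metadata", {})
    page = meta.get("page_number")
    if page is None:
        # Assumed shape: orig_elements is a decoded list of element dicts.
        # Fall back to the first original element that still has a page number.
        for orig in meta.get("orig_elements", []):
            page = orig.get("metadata", {}).get("page_number")
            if page is not None:
                break
    if page is None:
        # For paginated sources this means the field was dropped upstream.
        raise ValueError("page_number did not survive chunking")
    return {
        "text": chunk["text"],
        "filename": meta.get("filename"),
        "page_number": page,
    }
```

Raising here, rather than indexing a record with no page, makes the "verify page_number survives" step fail at ingestion time instead of at citation time.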
02

A mixed-source index needs per-document filtering, but the agent never persists metadata.filename / metadata.filetype.

Persist metadata.filename and metadata.filetype on every indexed element/chunk so retrieval can filter by document and type. Use a stable source identifier (filename + a content hash) so re-ingested versions are reconcilable.

Pass / Fail · AI Platform · medium
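One way to build the stable source identifier that test 02 describes is sketched below. It assumes the raw file bytes are available at ingestion time. The helper names source_id and with_source_fields, the truncated SHA-256 digest, and the "filename:digest" layout are illustrative choices, not something the eval prescribes.

```python
import hashlib
from typing import Any


def source_id(filename: str, content: bytes) -> str:
    """Combine the filename with a short content hash (hypothetical format)."""
    digest = hashlib.sha256(content).hexdigest()[:16]
    return f"{filename}:{digest}"


def with_source_fields(chunk: dict[str, Any], content: bytes) -> dict[str, Any]:
    """Persist filename, filetype, and a source id on an index record."""
    meta = chunk["metadata"]
    return {
        "text": chunk["text"],
        "filename": meta["filename"],
        "filetype": meta["filetype"],
        "source_id": source_id(meta["filename"], content),
    }
```

Because the filename is kept as a prefix, two versions of the same document share it while the hash suffix differs, which is what lets re-ingested versions be reconciled.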
03

An HTML document's hyperlinks must be preserved for citation, but the agent keeps only visible text and drops metadata.link_urls.

Preserve metadata.link_urls and link_texts so hyperlinks survive into the index for citation and navigation. Keep the text↔url association intact rather than storing a bare list of URLs with no anchor text.

Pass / Fail · AI Platform · low
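For test 03, keeping the text↔url association could look like the sketch below. It assumes metadata.link_texts and metadata.link_urls are index-aligned lists on the element's metadata dict. The function name paired_links and the output shape are hypothetical.

```python
from typing import Any


def paired_links(metadata: dict[str, Any]) -> list[dict[str, str]]:
    """Zip anchor texts with their URLs so neither is stored alone."""
    texts = metadata.get("link_texts") or []
    urls = metadata.get("link_urls") or []
    if len(texts) != len(urls):
        # Assumption: the two lists are parallel; a mismatch means the
        # association cannot be recovered safely.
        raise ValueError("link_texts and link_urls are not aligned")
    return [{"text": t, "url": u} for t, u in zip(texts, urls)]
```

Storing the pairs on the index record, rather than a bare list of URLs, lets a citation render the original anchor text next to its link.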

How this eval is graded

Grade each response against expected.ideal_behavior and expected.rubric. A test passes when the mean of its per-criterion scores is >= 4.0 and no criterion scores below 3.
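The pass rule above can be sketched as a small check. This assumes each rubric criterion receives a numeric score; the page does not state the scale, and the function name passes and the criterion names in the usage example are hypothetical.

```python
from statistics import mean


def passes(scores: dict[str, float],
           mean_threshold: float = 4.0,
           floor: float = 3.0) -> bool:
    """Return True when the mean meets the threshold and no score is below the floor."""
    if not scores:
        return False  # nothing was graded, so the test cannot pass
    values = list(scores.values())
    return mean(values) >= mean_threshold and min(values) >= floor
```

For example, passes({"page_number_kept": 5, "filename_kept": 3}) is True (mean 4.0, minimum 3), while a set of scores averaging 4.0 that includes a 2 fails on the floor.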

Rubric criteria

  • Unstructured
  • AI Platform
  • Metadata And Element Schema

Recommended for

Unstructured (API + Platform) · Unstructured customers
