Eval Library
For LlamaIndex · AI Platform

LlamaParse and LlamaCloud

LlamaIndex (+ LlamaCloud) · LlamaIndex

RAG / Data Framework — LlamaIndex

LlamaIndex evals — LlamaParse / LlamaCloud (relift v3 InfraRed)

About LlamaIndex

LlamaIndex is a data framework for building RAG and agent applications over private data — documents/nodes, indexes (VectorStoreIndex), retrievers and query engines, the IngestionPipeline, plus LlamaParse and LlamaCloud for managed document parsing and retrieval.

Employees: ~50

Industry: RAG Framework

Headquarters: San Francisco, CA

Sample tests · showing 3 of 9

# · Input · Expected behavior · Check
01

An integrator uses LlamaParse fast mode on complex financial PDFs with nested tables, then complains tables are flattened and misaligned in the parsed markdown.

Select the LlamaParse parse mode for the document complexity — higher-fidelity modes (e.g. accurate / premium / multimodal) for complex tables and layouts, fast mode for simple text — trading cost/latency for fidelity. Validate parsed output on representative documents before bulk processing.

Pass / Fail · AI Platform · medium
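
The expected behavior for test 01 could be sketched roughly as follows. This is a minimal illustration, not the LlamaParse API: `DocTraits`, `choose_parse_mode`, and `table_rows_aligned` are hypothetical helpers, and the returned strings are illustrative labels that would need mapping to whatever mode option the SDK version in use actually exposes.

```python
from dataclasses import dataclass


@dataclass
class DocTraits:
    """Hypothetical summary of a document, filled in by the integrator."""
    has_nested_tables: bool = False
    has_complex_layout: bool = False


def choose_parse_mode(traits: DocTraits) -> str:
    """Return an illustrative mode label (not a real SDK parameter value)."""
    if traits.has_nested_tables or traits.has_complex_layout:
        return "premium"  # higher fidelity, at higher cost and latency
    return "fast"  # cheapest option, adequate for simple text


def table_rows_aligned(markdown: str) -> bool:
    """Check that every contiguous run of pipe-table rows has one column count."""
    expected = None
    for line in markdown.splitlines():
        stripped = line.strip()
        if not stripped.startswith("|"):
            expected = None  # a non-table line ends the current table
            continue
        cells = len(stripped.strip("|").split("|"))
        if expected is None:
            expected = cells
        elif cells != expected:
            return False
    return True
```

Running a check like `table_rows_aligned` over a handful of representative parsed documents is one cheap way to catch flattened or misaligned tables before committing to a bulk run in a given mode.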
02

LlamaParse parsing is asynchronous (submit then poll). The integrator reads the result immediately after submit and treats a not-yet-ready job as a parse failure.

Treat parsing as an async job: submit, then poll job status with backoff until it reaches a terminal SUCCESS/ERROR state (or use the SDK's blocking helper) before reading results. Distinguish 'still processing' from 'failed' and handle ERROR jobs explicitly.

Pass / Fail · AI Platform · high
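
The submit-then-poll pattern in test 02 might look something like the sketch below. It is deliberately SDK-agnostic: `get_status` is a hypothetical callable the integrator would wire to the real job-status call, and the only status strings assumed are the terminal `SUCCESS` and `ERROR` named above. Any other status is treated as "still processing".

```python
import time
from typing import Callable


class ParseJobFailed(Exception):
    """Raised when the job reaches the terminal ERROR state."""


def wait_for_job(
    get_status: Callable[[], str],
    *,
    initial_delay: float = 1.0,
    max_delay: float = 30.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Poll with exponential backoff until SUCCESS; raise on ERROR or timeout."""
    delay = initial_delay
    waited = 0.0
    while True:
        status = get_status()
        if status == "SUCCESS":
            return  # results are now safe to read
        if status == "ERROR":
            raise ParseJobFailed("parse job ended in ERROR")
        if waited >= timeout:
            # Not a parse failure: the job simply has not finished yet.
            raise TimeoutError(f"job still {status} after {waited:.0f}s")
        sleep(delay)
        waited += delay
        delay = min(delay * 2, max_delay)  # back off, capped at max_delay
```

Raising distinct exceptions for `ERROR` and for a timeout keeps "failed" and "still processing" separate for the caller, which is the distinction the test is checking for.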
03

The LlamaCloud API key is embedded in client-side code / a notebook shared publicly so LlamaParse calls work from the browser.

Keep LLAMA_CLOUD_API_KEY server-side, loaded from an environment variable or secret manager, and never in client code or shared notebooks. Proxy LlamaParse/LlamaCloud calls through a backend. Rotate the key if it is exposed; a leaked key allows unauthorized, billable parsing.

Pass / Fail · AI Platform · critical
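
For test 03, a backend might load the key along these lines. This is a sketch under assumptions: `load_llama_cloud_key` and `redact` are hypothetical helper names, and only the `LLAMA_CLOUD_API_KEY` variable name comes from the expected behavior above.

```python
import os
from typing import Mapping, Optional


def load_llama_cloud_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Read LLAMA_CLOUD_API_KEY from the server environment; fail fast if absent."""
    env = os.environ if env is None else env
    key = env.get("LLAMA_CLOUD_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "LLAMA_CLOUD_API_KEY is not set; configure it in the server "
            "environment or secret manager"
        )
    return key


def redact(key: str) -> str:
    """Return a log-safe form showing at most the last four characters."""
    if len(key) <= 4:
        return "****"  # too short to reveal any part safely
    return "*" * (len(key) - 4) + key[-4:]
```

The backend would call `load_llama_cloud_key()` once at startup and make the LlamaParse/LlamaCloud requests itself, so the browser or notebook only ever talks to the proxy and never sees the key.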

How this eval is graded

Grade each response against expected.ideal_behavior and expected.rubric. Criteria are scored individually; a pass requires a mean score of at least 4.0 with no criterion below 3.
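
One reading of the pass rule above, written out as code. This is a sketch rather than the grader's actual implementation: the `passes` function is hypothetical, and it assumes each rubric criterion receives a single numeric score.

```python
from statistics import mean


def passes(
    scores: dict[str, float],
    mean_threshold: float = 4.0,
    floor: float = 3.0,
) -> bool:
    """Pass when the mean meets the threshold and no criterion is below the floor."""
    if not scores:
        return False  # nothing graded, so nothing can pass
    values = list(scores.values())
    return mean(values) >= mean_threshold and min(values) >= floor


# Example with the three rubric criteria listed on this page (scores are made up).
example = {"LlamaIndex": 5, "AI Platform": 4, "LlamaParse and LlamaCloud": 3}
print(passes(example))
```

Under this reading the two conditions are independent: a high mean does not rescue a response that scores below 3 on any one criterion.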

Rubric criteria

  • LlamaIndex
  • AI Platform
  • LlamaParse and LlamaCloud

Recommended for

LlamaIndex (+ LlamaCloud) · LlamaIndex customers
