Eval Library
For LlamaIndex · AI Platform

Embeddings And Vector Stores

LlamaIndex (+ LlamaCloud) · LlamaIndex

RAG / Data Framework — LlamaIndex

LlamaIndex evals — Embeddings & Vector Stores (relift v3 InfraRed)

About LlamaIndex

LlamaIndex is a data framework for building RAG and agent applications over private data. Its core pieces are documents and nodes, indexes (such as VectorStoreIndex), retrievers and query engines, and the IngestionPipeline. LlamaParse and LlamaCloud add managed document parsing and retrieval.

Employees

~50

Industry

RAG Framework

Headquarters

San Francisco, CA

Sample tests · showing 3 of 9

01
Input: A Pinecone index is created with dimension=1536 (for an older embedder), but the configured embed model now outputs 3072-dim vectors; upserts fail or are silently rejected.
Expected behavior: Ensure the vector store's configured dimension exactly matches the embedding model's output dimension. When an embedder change alters the dimension, create a new collection/index at the right dimension and re-embed; vectors of different dimensions cannot share one space. Verify the dimension before bulk upsert.
Check: Pass / Fail · AI Platform · critical

02
Input: An integrator uses an instruction-tuned/asymmetric embedding model but embeds queries with get_text_embedding instead of get_query_embedding, hurting retrieval quality.
Expected behavior: Use the query-side embedding path (get_query_embedding, or the retriever's built-in query embedding) so that any query prefix or instruction the model expects is applied. Asymmetric models produce query and document embeddings differently; LlamaIndex's retriever handles this when used correctly.
Check: Pass / Fail · AI Platform · medium

03
Input: Bulk-embedding 500k nodes with a hosted embedding API hits provider rate limits; the integrator retries the whole job on any 429, repeatedly re-embedding nodes that were already done.
Expected behavior: Tune embed_batch_size and concurrency to stay within provider limits, retry with backoff at the batch level, and checkpoint progress (via the ingestion docstore/cache) so that a 429 does not re-embed completed nodes. Track embedding spend against a budget.
Check: Pass / Fail · AI Platform · high
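For test 01, the "verify dimension before bulk upsert" step could look roughly like the sketch below. It is only an illustration: check_embedding_dimension, DimensionMismatchError, and fake_embed_3072 are hypothetical names, and the embedder is a stand-in rather than a real model call.

```python
from typing import Callable, List, Sequence


class DimensionMismatchError(ValueError):
    """Raised when the embedder output size differs from the index dimension."""


def check_embedding_dimension(
    embed_fn: Callable[[str], Sequence[float]],
    index_dimension: int,
    probe_text: str = "dimension probe",
) -> int:
    """Embed one probe string and compare its length to the index dimension."""
    actual = len(embed_fn(probe_text))
    if actual != index_dimension:
        raise DimensionMismatchError(
            f"embedder outputs {actual} dims but the index expects "
            f"{index_dimension}; create a new index and re-embed"
        )
    return actual


def fake_embed_3072(text: str) -> List[float]:
    """Hypothetical stand-in for a 3072-dim embedder (no real model call)."""
    return [0.0] * 3072
```

With a LlamaIndex embed model, embed_fn would presumably be the model's get_text_embedding method, and index_dimension the value the vector index was created with. Running the check once before ingestion turns a silent upsert rejection into an early, explicit error.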
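Test 02 depends on the difference between the query-side and document-side embedding paths. The toy class below mirrors the two method names from the test description to show why they are not interchangeable. It is a sketch under assumptions: ToyAsymmetricEmbedder, its prefixes, and its bucket-count "embedding" are invented for illustration and do not reflect any real model.

```python
from typing import List


class ToyAsymmetricEmbedder:
    """Hypothetical stand-in for an instruction-tuned embedding model."""

    # Assumed prefixes; a real model documents its own query/document format.
    QUERY_PREFIX = "query: "
    TEXT_PREFIX = "passage: "

    def _encode(self, s: str) -> List[float]:
        # Toy deterministic "embedding": character counts over 8 buckets.
        vec = [0.0] * 8
        for ch in s:
            vec[ord(ch) % 8] += 1.0
        return vec

    def get_text_embedding(self, text: str) -> List[float]:
        # Document-side path: applies the document prefix.
        return self._encode(self.TEXT_PREFIX + text)

    def get_query_embedding(self, query: str) -> List[float]:
        # Query-side path: applies the query prefix the model expects.
        return self._encode(self.QUERY_PREFIX + query)


embedder = ToyAsymmetricEmbedder()
question = "how do I rotate API keys?"
right = embedder.get_query_embedding(question)  # what retrieval should use
wrong = embedder.get_text_embedding(question)   # the mistake in test 02
```

The same string yields two different vectors, so a query embedded through the document path lands in the wrong part of the space, which is the retrieval-quality loss the test targets.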
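The batch-level retry and checkpointing expected in test 03 can be sketched without any provider SDK. In this sketch, embed_with_checkpoint and RateLimitError are hypothetical names, the done dictionary stands in for the ingestion docstore/cache, and embed_batch is whatever callable actually sends a batch of texts to the embedding API.

```python
import time
from typing import Callable, Dict, List, Sequence, Tuple


class RateLimitError(Exception):
    """Hypothetical stand-in for a provider's HTTP 429 error."""


def embed_with_checkpoint(
    nodes: Sequence[Tuple[str, str]],  # (node_id, text) pairs
    embed_batch: Callable[[List[str]], List[List[float]]],
    done: Dict[str, List[float]],      # checkpoint: node_id -> embedding
    batch_size: int = 100,
    max_retries: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, List[float]]:
    """Embed only nodes missing from `done`, retrying each batch on its own."""
    pending = [(nid, text) for nid, text in nodes if nid not in done]
    for start in range(0, len(pending), batch_size):
        batch = pending[start:start + batch_size]
        for attempt in range(max_retries + 1):
            try:
                vectors = embed_batch([text for _, text in batch])
                break
            except RateLimitError:
                if attempt == max_retries:
                    raise
                # Exponential backoff: base_delay, 2x, 4x, ...
                sleep(base_delay * 2 ** attempt)
        # Record results right away so a later failure keeps this progress.
        for (nid, _), vec in zip(batch, vectors):
            done[nid] = vec
    return done
```

A 429 here costs one retry of one batch, and calling the function again with the same done mapping skips everything already embedded. In a real pipeline the mapping would need to be persisted, and batch_size would presumably track the embed model's embed_batch_size.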

How this eval is graded

Grade against expected.ideal_behavior and expected.rubric. Per-criterion pass requires mean >= 4.0 and no criterion below 3.
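One reading of the pass rule above, written as a small function. The assumptions are that each rubric criterion receives a numeric score and that the mean is taken across criteria; the scoring scale is not stated here, and the function and parameter names are hypothetical.

```python
from statistics import mean
from typing import Mapping


def passes(
    scores: Mapping[str, float],
    min_mean: float = 4.0,
    min_each: float = 3.0,
) -> bool:
    """Pass when the mean is >= min_mean and no criterion is below min_each."""
    values = list(scores.values())
    if not values:
        return False  # nothing graded, so nothing to pass
    return mean(values) >= min_mean and min(values) >= min_each
```

Under this reading, scores of 5, 5, and 2 fail despite a mean of 4.0, because one criterion falls below the floor of 3.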

Rubric criteria

  • LlamaIndex
  • AI Platform
  • Embeddings And Vector Stores

Recommended for

LlamaIndex (+ LlamaCloud) · LlamaIndex customers
