Embeddings and Vector Stores
LlamaIndex evals — Embeddings & Vector Stores (relift v3 InfraRed)
About LlamaIndex
LlamaIndex is a data framework for building RAG and agent applications over private data. Its core pieces are documents and nodes, indexes (such as VectorStoreIndex), retrievers and query engines, and the IngestionPipeline, with LlamaParse and LlamaCloud providing managed document parsing and retrieval.
Sample tests · showing 3 of 9
| # | Input | Expected behavior | Check | Category | Severity |
|---|---|---|---|---|---|
| 01 | A Pinecone index is created with dimension=1536 (for an older embedder) but the configured embed model now outputs 3072-dim vectors; upserts fail or are silently rejected. | Ensure the vector store's configured dimension exactly matches the embedding model's output dimension. On an embedder change that alters dimension, create a new collection/index at the right dimension and re-embed; you cannot mix dimensions in one space. Verify dimension before bulk upsert. | Pass / Fail | AI Platform | critical |
| 02 | An integrator uses an instruction-tuned/asymmetric embedding model but embeds queries with get_text_embedding instead of get_query_embedding, hurting retrieval quality. | Use the query-side embedding path (get_query_embedding / the retriever's built-in query embedding) so any query prefix/instruction the model expects is applied. For asymmetric models, query and document embeddings are produced differently; LlamaIndex's retriever handles this when used correctly. | Pass / Fail | AI Platform | medium |
| 03 | Bulk-embedding 500k nodes with a hosted embedding API hits provider rate limits; the integrator retries the whole batch on any 429, repeatedly re-embedding already-done nodes. | Tune embed_batch_size and concurrency to stay within provider limits, retry with backoff at the batch level, and checkpoint progress (via the ingestion docstore/cache) so a 429 does not re-embed completed nodes. Track embedding spend against a budget. | Pass / Fail | AI Platform | high |
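The expected behaviors in tests 01 and 03 can be sketched without any vendor SDK. The following is a minimal, library-free illustration, not LlamaIndex's or Pinecone's API: `RateLimitError`, `check_dimension`, `embed_with_checkpoint`, `embed_one`, and `embed_batch` are all hypothetical names, and the plain `done` dictionary stands in for whatever docstore or cache the real pipeline uses. It probes the embedder's output dimension before any bulk upsert, then embeds in batches, retrying only the failing batch with exponential backoff and skipping nodes that are already recorded.

```python
import time
from typing import Callable, Dict, List, Sequence

Vector = List[float]


class RateLimitError(Exception):
    """Hypothetical stand-in for a provider's HTTP 429 error."""


def check_dimension(embed_one: Callable[[str], Vector], index_dimension: int) -> None:
    """Embed one probe string and compare its length to the index dimension."""
    produced = len(embed_one("dimension probe"))
    if produced != index_dimension:
        raise ValueError(
            f"embedder outputs {produced} dims but index expects {index_dimension}; "
            "create a new index at the right dimension and re-embed"
        )


def embed_with_checkpoint(
    nodes: Dict[str, str],
    embed_batch: Callable[[Sequence[str]], List[Vector]],
    done: Dict[str, Vector],
    batch_size: int = 100,
    max_retries: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Vector]:
    """Embed only nodes missing from `done`, retrying per batch on rate limits."""
    # Checkpoint: anything already in `done` is never sent to the API again.
    pending = [node_id for node_id in nodes if node_id not in done]
    for start in range(0, len(pending), batch_size):
        batch_ids = pending[start:start + batch_size]
        texts = [nodes[node_id] for node_id in batch_ids]
        for attempt in range(max_retries + 1):
            try:
                vectors = embed_batch(texts)
                break
            except RateLimitError:
                if attempt == max_retries:
                    raise
                # Exponential backoff: base_delay, 2x, 4x, ...
                sleep(base_delay * 2 ** attempt)
        # Record progress as soon as the batch succeeds.
        for node_id, vector in zip(batch_ids, vectors):
            done[node_id] = vector
    return done
```

If `done` is persisted between runs, a crash or an exhausted retry budget resumes from the last completed batch rather than from node zero, which is the property test 03 checks for.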
How this eval is graded
Grade against expected.ideal_behavior and expected.rubric. Per-criterion pass requires mean >= 4.0 and no criterion below 3.
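One possible reading of the pass rule, sketched in Python: an eval passes when the mean of its criterion scores is at least 4.0 and no single criterion scores below 3. The function name `eval_passes` and the dictionary input shape are assumptions for illustration, and the page does not state the scoring scale or exactly how the mean is taken, so treat this as an interpretation rather than the grader's actual logic.

```python
from typing import Dict


def eval_passes(
    scores: Dict[str, float],
    mean_threshold: float = 4.0,
    floor: float = 3.0,
) -> bool:
    """Return True when the mean meets the threshold and no score is under the floor."""
    if not scores:
        return False  # nothing graded, so nothing can pass
    values = list(scores.values())
    mean = sum(values) / len(values)
    return mean >= mean_threshold and min(values) >= floor
```

Under this reading, scores of 5, 4, and 3 pass (mean 4.0, minimum 3), while 5, 5, and 2 fail despite the same mean because one criterion falls below the floor.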
Rubric criteria
- LlamaIndex
- AI Platform
- Embeddings and Vector Stores