
Embeddings and Retrieval
Foundation Model & API — OpenAI (GPT)
OpenAI evals — Embeddings & Retrieval (relift v3 InfraRed)
About OpenAI
OpenAI builds the GPT model family and the OpenAI API — Responses and Chat Completions, function calling, Structured Outputs, embeddings, fine-tuning, the Batch API, moderation, the Realtime API, and the Agents SDK — used by developers to build AI products at scale.
Sample tests · showing 3 of 9
| # | Input | Expected behavior | Check | Category | Severity |
|---|---|---|---|---|---|
| 01 | Team uses text-embedding-3-large but sets dimensions=256 to save vector-store cost, then compares to vectors stored at full dimension. | All vectors in an index must share the same model and dimensions; re-embed the whole corpus when changing dimensions. Mixing dimensions makes cosine similarity meaningless. | Pass / Fail | AI Platform | critical |
| 02 | Retrieval ranks by dot product over un-normalized embeddings and gets inconsistent ordering. | OpenAI embeddings are normalized to length 1, so cosine == dot product; if the vector store re-scales or you mix sources, normalize consistently. Verify the distance metric matches the store config. | Pass / Fail | AI Platform | medium |
| 03 | Org migrates embedding model; old and new vectors coexist during backfill. | Version the index by model; serve queries only against the matching-model partition until backfill completes, then cut over atomically. Never compare cross-model vectors. | Pass / Fail | AI Platform | high |
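
The three expected behaviors above can be illustrated with a minimal sketch. It assumes a hypothetical in-memory store; `IndexSpec`, `VersionedIndex`, `normalize`, and `cut_over` are illustrative names and not part of the OpenAI API or of any particular vector database. The sketch keeps one partition per (model, dimensions) pair, rejects vectors whose length does not match the partition, normalizes on write and on query so that dot product equals cosine similarity, and serves queries only from the partition that has been cut over.

```python
import math
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class IndexSpec:
    """Identifies one partition; vectors are comparable only within a spec."""
    model: str
    dimensions: int


def normalize(vec):
    """Scale a vector to unit length so dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors, computed from scratch."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


@dataclass
class VersionedIndex:
    """Hypothetical store with one partition per (model, dimensions)."""
    partitions: dict = field(default_factory=dict)
    active: Optional[IndexSpec] = None

    def add(self, spec, doc_id, vec):
        # Test 01: refuse vectors whose length differs from the partition's.
        if len(vec) != spec.dimensions:
            raise ValueError(
                f"expected {spec.dimensions} dimensions, got {len(vec)}"
            )
        # Test 02: normalize on write so every stored vector has length 1.
        self.partitions.setdefault(spec, {})[doc_id] = normalize(vec)

    def cut_over(self, spec, expected_count):
        # Test 03: switch serving only after the backfill is complete.
        # A single attribute assignment stands in for the atomic cut-over.
        if len(self.partitions.get(spec, {})) < expected_count:
            raise RuntimeError("backfill incomplete; keeping current partition")
        self.active = spec

    def search(self, spec, query_vec, k=3):
        # Never compare a query against a partition built with another spec.
        if spec != self.active:
            raise ValueError("query spec does not match the active partition")
        if len(query_vec) != spec.dimensions:
            raise ValueError("query dimensions do not match the partition")
        q = normalize(query_vec)
        scored = [
            (sum(x * y for x, y in zip(q, v)), doc_id)
            for doc_id, v in self.partitions[spec].items()
        ]
        scored.sort(reverse=True)
        return [doc_id for _, doc_id in scored[:k]]
```

In this sketch the caller embeds the query with the same model and dimensions as the active partition and passes that `IndexSpec` to `search`; during a migration, new vectors are written to their own partition with `add` and become visible only after `cut_over` succeeds.
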
How this eval is graded
Grade against expected.ideal_behavior and expected.rubric. Per-criterion pass requires mean >= 4.0 and no criterion below 3.
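
One way to read the threshold above is sketched here. It assumes each rubric criterion receives a single numeric score and that a pass needs both a mean of at least 4.0 across criteria and no individual score below 3. The score scale is not stated in the source, and `passes` and its parameter names are hypothetical.

```python
from statistics import mean


def passes(scores, mean_threshold=4.0, floor=3):
    """Return True if the criterion scores clear both thresholds.

    scores: mapping of criterion name -> numeric score (assumed shape).
    """
    if not scores:
        raise ValueError("no criterion scores to grade")
    values = list(scores.values())
    # Both conditions must hold: a high average and no weak criterion.
    return mean(values) >= mean_threshold and min(values) >= floor
```

Under this reading, scores of 5, 5, and 2 fail even though their mean is 4.0, because one criterion falls below the floor of 3.
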
Rubric criteria
- OpenAI
- AI Platform
- Embeddings and Retrieval