Integrated Inference Embed And Rerank
Pinecone evals — Integrated Inference / Embed & Rerank (relift v3 InfraRed)
About Pinecone
Pinecone is a managed vector database for AI applications — serverless and pod-based indexes, namespaces for multi-tenant isolation, hybrid sparse-dense search, integrated inference (embed + rerank), and Pinecone Assistant for retrieval-augmented generation with citations.
Sample tests · showing 3 of 9
| # | Input | Expected behavior | Check | Category | Severity |
|---|---|---|---|---|---|
| 01 | Operator embeds with multilingual-e5-large (1024 dim) via /embed and upserts into an index created with dimension=1536. | Match index dimension to embedder output. multilingual-e5-large=1024, llama-text-embed-v2=1024 (configurable), pinecone-sparse-english-v0=sparse. Recreate the index at the correct dimension; do not pad or truncate. Document the embedder choice with the index. | Pass / Fail | AI Platform | critical |
| 02 | Operator calls /embed with input_type=passage for both corpus and queries on an asymmetric model (e.g. multilingual-e5-large). | Asymmetric embedders require input_type=passage for documents being indexed and input_type=query for search queries: the prefix changes the embedding. Mixing degrades retrieval. Verify per-model defaults [REQUIRES-VERIFICATION for the exact list of asymmetric models]. | Pass / Fail | AI Platform | high |
| 03 | Operator queries top_k=100 from ANN, then reranks all 100 with /rerank, then takes top 5. | Set rerank top_n=5 explicitly. /rerank scores all input pairs and returns the top_n sorted descending. Trimming at the rerank layer avoids extra client-side sort and keeps the contract obvious in logs. | Pass / Fail | AI Platform | medium |
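
The three sample tests can be turned into pre-flight checks that run before any request reaches the API. The following is a minimal sketch in plain Python, not Pinecone SDK code: the helper names (`check_dimension`, `input_type_for`, `rerank_request`) and the role labels are hypothetical, and the dimension table only copies the values quoted in test 01, which should be confirmed against the current model documentation.

```python
from __future__ import annotations

# Dense output dimensions as quoted in test 01. These are assumptions taken
# from the table above, not values read from the service.
EMBEDDER_DIMENSIONS = {
    "multilingual-e5-large": 1024,
    "llama-text-embed-v2": 1024,  # listed as configurable; 1024 assumed here
}


def check_dimension(model: str, index_dimension: int) -> None:
    """Raise if the index dimension does not match the embedder output (test 01)."""
    expected = EMBEDDER_DIMENSIONS[model]
    if expected != index_dimension:
        raise ValueError(
            f"{model} emits {expected}-dim vectors but the index expects "
            f"{index_dimension}; recreate the index, do not pad or truncate"
        )


def input_type_for(role: str) -> str:
    """Map a hypothetical role label to the input_type named in test 02."""
    mapping = {"document": "passage", "search": "query"}
    return mapping[role]


def rerank_request(query: str, documents: list[str], top_n: int = 5) -> dict:
    """Build rerank parameters with top_n set explicitly (test 03)."""
    if not 1 <= top_n <= len(documents):
        raise ValueError("top_n must be between 1 and the number of documents")
    return {"query": query, "documents": documents, "top_n": top_n}
```

The dictionary returned by `rerank_request` is meant to be handed to whichever rerank client is in use. Keeping `top_n` in the request, rather than slicing the response afterwards, is the behavior test 03 expects.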
How this eval is graded
Grade against expected.ideal_behavior and expected.rubric. Per-criterion pass requires mean >= 4.0 and no criterion below 3.
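
One way to read the pass rule is sketched below, assuming each rubric criterion receives a single numeric score and that "mean" is taken across those criteria. The scoring scale is not stated in the source, and the function name and criterion labels are hypothetical.

```python
from __future__ import annotations

from statistics import mean


def eval_passes(
    scores: dict[str, float],
    mean_threshold: float = 4.0,
    floor: float = 3.0,
) -> bool:
    """Return True when the mean score meets the threshold and no
    individual criterion falls below the floor."""
    values = list(scores.values())
    return mean(values) >= mean_threshold and min(values) >= floor


# A high mean does not rescue a criterion that scores below the floor.
print(eval_passes({"dimension": 5, "input_type": 4, "rerank": 3}))
print(eval_passes({"dimension": 5, "input_type": 5, "rerank": 2}))
```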
Rubric criteria
- Pinecone
- AI Platform
- Integrated Inference Embed And Rerank