Index Management
Pinecone evals — Index Management (relift v3 InfraRed)
About Pinecone
Pinecone is a managed vector database for AI applications. It offers serverless and pod-based indexes, namespaces for multi-tenant isolation, hybrid sparse-dense search, integrated inference (embed + rerank), and Pinecone Assistant for retrieval-augmented generation with citations.
Sample tests · showing 3 of 9
| # | Input | Expected behavior | Check |
|---|---|---|---|
| 01 | Operator must choose between serverless and pod-based indexes for a workload with spiky read traffic, ~10M vectors at 1536 dimensions, and an unpredictable namespace count. | Pick serverless via spec.serverless{cloud, region}: it auto-scales, bills per read/write/storage unit, and supports unbounded namespaces. Pod-based (p1/p2/s1) fits steady QPS with predictable size; do not default to pods just because they look 'production'. Document the dimensional and metric choice (immu… | Pass / Fail · Ai Platform · high |
| 02 | Operator creates an index with metric=euclidean for vectors produced by an OpenAI text-embedding-3-small model (normalized, so cosine is appropriate). Recall is poor. | A Pinecone index's metric is set at create time and is immutable. The fix is to create a NEW index with metric=cosine (or dotproduct on normalized vectors) and re-upsert the vectors, not to mutate the existing index. Check the embedder's documentation for the expected metric before creating the index. | Pass / Fail · Ai Platform · critical |
| 03 | Operator creates a pod-based index, then immediately issues an upsert. The upsert returns a 404. | After create_index, poll describe_index until status.ready=true (and status.state=Ready) before issuing data-plane calls. Pod-based indexes can take minutes to provision; serverless is faster but still not immediate. Use exponential backoff between polls. | Pass / Fail · Ai Platform · medium |
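
The expected behaviors in tests 01 to 03 can be sketched in Python without committing to a particular client version. This is a minimal sketch under stated assumptions, not Pinecone's own implementation: `serverless_index_body` and `wait_until_ready` are hypothetical helper names, the request-body shape mirrors the `spec.serverless{cloud, region}` form named in test 01, and `describe` is assumed to be any callable (for example a thin wrapper around the client's `describe_index` call) that returns a mapping with a `status` entry holding `ready` and `state`.

```python
import time
from typing import Any, Callable, Mapping

# Metrics named in the sample tests; validated up front because the
# metric cannot be changed after the index is created.
VALID_METRICS = {"cosine", "dotproduct", "euclidean"}


def serverless_index_body(name: str, dimension: int, metric: str,
                          cloud: str, region: str) -> dict[str, Any]:
    """Build a create-index request body with a serverless spec (hypothetical helper)."""
    if metric not in VALID_METRICS:
        raise ValueError(f"unsupported metric: {metric!r}")
    if dimension <= 0:
        raise ValueError("dimension must be positive")
    return {
        "name": name,
        "dimension": dimension,
        "metric": metric,
        "spec": {"serverless": {"cloud": cloud, "region": region}},
    }


def wait_until_ready(describe: Callable[[str], Mapping[str, Any]], name: str,
                     timeout_s: float = 600.0, base_delay_s: float = 1.0,
                     max_delay_s: float = 30.0,
                     sleep: Callable[[float], None] = time.sleep) -> int:
    """Poll describe(name) with exponential backoff until the index is ready.

    Returns the number of describe calls made. Raises TimeoutError once the
    total time spent sleeping reaches timeout_s without a ready status.
    """
    waited, delay, calls = 0.0, base_delay_s, 0
    while True:
        status = describe(name).get("status", {})
        calls += 1
        if status.get("ready") and status.get("state") == "Ready":
            return calls
        if waited >= timeout_s:
            raise TimeoutError(f"index {name!r} not ready after {waited:.0f}s")
        sleep(delay)
        waited += delay
        delay = min(delay * 2, max_delay_s)  # double the wait, capped
```

For test 02, the same body builder would be called again with a new index name and `metric="cosine"`, followed by a re-upsert into the new index once `wait_until_ready` returns. The cloud and region values passed in are deployment-specific and are not given by the source.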
How this eval is graded
Grade each response against expected.ideal_behavior and expected.rubric. To pass, the mean criterion score must be >= 4.0 and no criterion may score below 3.
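
One way to read the pass rule is as a check over a set of per-criterion scores. The sketch below assumes that reading (the mean is taken across criteria, and the floor applies to each criterion individually); the function name `rubric_passes` and the score mapping are hypothetical, and the scoring scale itself is not stated in the source.

```python
from statistics import mean
from typing import Mapping


def rubric_passes(scores: Mapping[str, float],
                  mean_threshold: float = 4.0,
                  floor: float = 3.0) -> bool:
    """Return True if the mean score meets the threshold and no score is below the floor."""
    if not scores:
        raise ValueError("at least one criterion score is required")
    values = list(scores.values())
    return mean(values) >= mean_threshold and min(values) >= floor
```

Under this reading, scores of 5, 5, and 2 fail despite a mean of 4.0, because one criterion falls below the floor.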
Rubric criteria
- Pinecone
- Ai Platform
- Index Management