Pinecone Assistant

Evaluates Pinecone's Pinecone Assistant across 9 scenario-based test cases, each graded against an expected-behavior rubric by an LLM judge, from Corsac's Vector Database eval coverage.

About Pinecone

Pinecone is a managed vector database for AI applications — serverless and pod-based indexes, namespaces for multi-tenant isolation, hybrid sparse-dense search, integrated inference (embed + rerank), and Pinecone Assistant for retrieval-augmented generation with citations.

Employees

~150

Industry

Vector Database

Headquarters

New York, NY

Website

www.pinecone.io

Sample tests · showing 3 of 9

#	Input	Expected behavior	Check	Category	Severity
01	Operator uploads a 40 MB PDF to assistant 'support-bot' via /assistant/{name}/files.	Upload in line with the documented per-file size limit [REQUIRES-VERIFICATION for current cap]. Files persist within the assistant until deleted; track file_id mappings in the operator's own store for lifecycle management. Files are private to the assistant; they are not shared with other assistants or indexes.	Pass / Fail	AI Platform	medium
02	User asks the Assistant 'what does our refund policy say about chargebacks?' The response cites refund_policy.pdf, pages 12-14.	Render citations as inline links keyed to (file name, page range). Preserve cited_text verbatim. Show 'no citation' answers as unverified; do not surface ungrounded claims as policy. Log the citation/no-citation ratio for telemetry.	Pass / Fail	AI Platform	critical
03	Operator wants streaming responses from /assistant/{name}/chat for low TTFB.	Per docs, Assistant chat supports streaming responses. Set stream=true (or use the streaming endpoint variant) and process chunks incrementally. Citations arrive as a structured block alongside the streaming content; buffer until the block closes before rendering anchors.	Pass / Fail	AI Platform	critical
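
The expected behavior in test 02 can be sketched roughly as follows. This is a minimal illustration, not Pinecone's response schema: the `Citation` class, its field names, the markdown-style link format, and the `[UNVERIFIED: no citation]` marker are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Citation:
    """Hypothetical citation record; field names are illustrative only."""
    file_name: str
    page_start: int
    page_end: int
    cited_text: str
    url: Optional[str] = None


def render_citation(c: Citation) -> str:
    """Render one citation as an inline link keyed to (file name, page range)."""
    if c.page_start == c.page_end:
        pages = f"p. {c.page_start}"
    else:
        pages = f"pp. {c.page_start}-{c.page_end}"
    label = f"{c.file_name}, {pages}"
    anchor = f"[{label}]({c.url})" if c.url else f"[{label}]"
    # cited_text is passed through untouched so the quote stays verbatim.
    return f'{anchor}: "{c.cited_text}"'


def render_answer(answer: str, citations: List[Citation]) -> str:
    """Append citations, or flag the answer as unverified when there are none."""
    if not citations:
        return f"[UNVERIFIED: no citation] {answer}"
    return answer + "\n" + "\n".join(render_citation(c) for c in citations)


def citation_ratio(citation_counts: List[int]) -> float:
    """Share of answers that carried at least one citation (for telemetry)."""
    if not citation_counts:
        return 0.0
    return sum(1 for n in citation_counts if n > 0) / len(citation_counts)
```

Keeping the unverified marker in the rendering layer, rather than relying on the model to self-report, means an answer without citations cannot reach the user looking like grounded policy.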
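
For test 03, the buffering rule (stream text immediately, hold citations until their block closes) might look like the sketch below. The event shapes (`content`, `citation_start`, `citation`, `citation_end`) are hypothetical stand-ins and should not be read as Pinecone's wire format; the HTTP call with stream=true is deliberately left out.

```python
from typing import Dict, Iterable, Iterator, List, Tuple


def consume_stream(events: Iterable[Dict]) -> Iterator[Tuple[str, object]]:
    """Yield ("text", str) as content arrives and ("citations", list)
    only once a citation block has fully closed.

    Assumed (hypothetical) event shapes:
      {"type": "content", "text": "..."}
      {"type": "citation_start"}
      {"type": "citation", "data": {...}}
      {"type": "citation_end"}
    """
    pending: List[Dict] = []
    in_block = False
    for event in events:
        kind = event.get("type")
        if kind == "content":
            # Text is forwarded right away to keep time-to-first-byte low.
            yield ("text", event["text"])
        elif kind == "citation_start":
            in_block, pending = True, []
        elif kind == "citation" and in_block:
            # Buffer citation entries; nothing is rendered yet.
            pending.append(event["data"])
        elif kind == "citation_end" and in_block:
            in_block = False
            yield ("citations", pending)
            pending = []
```

If the stream ends while a block is still open, this sketch emits no citations for it, so the caller never renders anchors from a partial block.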

How this eval is graded

Grade against expected.ideal_behavior and expected.rubric. Each rubric criterion is scored individually; a pass requires a mean score of at least 4.0 across criteria, with no single criterion below 3.
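
One way to read the pass rule is as two checks over the per-criterion scores: the mean must reach 4.0 and the lowest score must be at least 3. The sketch below assumes that reading; the function name, parameter names, and criterion names are hypothetical, and the scoring scale is not stated on this page.

```python
from statistics import mean
from typing import Mapping


def passes(scores: Mapping[str, float],
           mean_threshold: float = 4.0,
           floor: float = 3.0) -> bool:
    """Return True when the mean score meets the threshold and no criterion falls below the floor."""
    if not scores:
        # With no scores there is nothing to grade, so treat it as a fail.
        return False
    values = list(scores.values())
    return mean(values) >= mean_threshold and min(values) >= floor


# Hypothetical criterion names, for illustration only.
print(passes({"grounding": 5, "citation_format": 4, "safety": 3}))  # mean 4.0, min 3
print(passes({"grounding": 5, "citation_format": 5, "safety": 2}))  # mean 4.0, min 2
```

The floor matters because a high mean can hide a single failing criterion, as in the second call.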

Rubric criteria

Pinecone
AI Platform
Pinecone Assistant

Recommended for

Pinecone customers

Works with

Pinecone

Related evals

AI Platform

Claude API

Evaluates Anthropic's Batch API across 9 scenario-based test cases, each graded against an expected-behavior rubric by an LLM judge, from Corsac's Foundation Model & API eval coverage.

AI Platform

Claude API

Evaluates Anthropic's Extended Thinking across 9 scenario-based test cases, each graded against an expected-behavior rubric by an LLM judge, from Corsac's Foundation Model & API eval coverage.

AI Platform

Claude API

Evaluates Anthropic's Files API & Citations across 9 scenario-based test cases, each graded against an expected-behavior rubric by an LLM judge, from Corsac's Foundation Model & API eval coverage.

Frequently asked questions

What does the Pinecone Assistant eval for Pinecone test?

Evaluates Pinecone's Pinecone Assistant across 9 scenario-based test cases, each graded against an expected-behavior rubric by an LLM judge, from Corsac's Vector Database eval coverage.

How is the Pinecone Assistant eval scored?

The judge grades each case against expected.ideal_behavior and expected.rubric. Each rubric criterion is scored individually; a pass requires a mean score of at least 4.0 across criteria, with no single criterion below 3.

How many test cases does this eval pack include?

The Pinecone Assistant pack for Pinecone contains 9 test cases. Three sample cases are shown free on this page; the full set runs in a Corsac workspace.

How do I run this eval?

Sign up for Corsac, connect your model or agent endpoint, and run the Pinecone Assistant pack as-is or after customizing thresholds. Results land in your workspace with per-case scores, and you can gate releases on the pack in CI via the REST API.
