For PerplexityAI PlatformAnswer Relevance

Citations And Source Grounding

Perplexity Sonar API · Perplexity

Grounded Answer API — Perplexity Sonar

Evaluates Perplexity's Citations & Source Grounding across 9 scenario-based test cases, each graded against an expected-behavior rubric by an LLM judge, from Corsac's Grounded Answer API eval coverage.

About Perplexity

Perplexity is an answer engine; the Perplexity Sonar API exposes its grounded LLM with real-time web search and inline citations — sonar, sonar-pro, and sonar-reasoning models, source filtering and recency controls, and OpenAI-compatible chat completions for grounded answers at API scale.

Employees

~200

Industry

Search / Answer API

Headquarters

San Francisco, CA

Website

www.perplexity.ai

Sample tests· showing 3 of 9

#	Input	Expected behavior	Check
01	Operator omits return_citations on POST /chat/completions and expects the response.citations[] array to populate by default.	Set return_citations=true explicitly on every request where source attribution is required downstream. The default behavior should not be assumed [REQUIRES-VERIFICATION] across model versions — opt in by setting the flag and verify response.citations is present and non-empty before rendering.	Pass / FailAi Platformhigh
02	Assistant content contains markers like 'The deal closed in 2025 [1] for $40M [2].' citations=['https://nyt.com/...', 'https://wsj.com/...'].	[n] is 1-indexed and maps to citations[n-1]. Render [1] as a link to citations[0]. Preserve the bracketed marker if it is part of the model's anchoring; rewrite as a hyperlink only after confirming the index→URL mapping per request.	Pass / FailAi Platformcritical
03	Response claims 'Company X raised $40M Series B in March 2025' with citations=['https://techcrunch.com/2024/01/x-seed/']. The cited page actually describes a $5M seed in January 2024.	Run a claim-by-claim faithfulness check: each numeric/date/entity assertion must be supported by at least one cited source. On mismatch, route the claim to a verification queue and either down-weight the answer, re-prompt with stricter grounding, or flag to the user. Do not surface unverified-by-ci…	Pass / FailAi Platformcritical
Unlock full benchmark 6 more test cases Use this benchmark

How this eval is graded

Grade against expected.ideal_behavior and expected.rubric. Per-criterion pass requires mean >= 4.0 and no criterion below 3.

Rubric criteria

Perplexity
Ai Platform
Citations And Source Grounding

Recommended for

Perplexity Sonar APIPerplexity customers

Works with

Perplexity

Related evals

AI Platform

Claude API

Evaluates Anthropic's Batch API across 9 scenario-based test cases, each graded against an expected-behavior rubric by an LLM judge, from Corsac's Foundation Model & API eval coverage.

View AI Platform

Claude API

Evaluates Anthropic's Extended Thinking across 9 scenario-based test cases, each graded against an expected-behavior rubric by an LLM judge, from Corsac's Foundation Model & API eval coverage.

View AI Platform

Claude API

Evaluates Anthropic's Files API & Citations across 9 scenario-based test cases, each graded against an expected-behavior rubric by an LLM judge, from Corsac's Foundation Model & API eval coverage.

View

Frequently asked questions

What does the Citations And Source Grounding eval for Perplexity Perplexity Sonar API test?+

How is the Citations And Source Grounding eval scored?+

The judge rubric: Grade against expected.ideal_behavior and expected.rubric. Per-criterion pass requires mean >= 4.0 and no criterion below 3.

How many test cases does this eval pack include?+

The Citations And Source Grounding pack for Perplexity Perplexity Sonar API contains 9 test cases. 3 sample cases are shown free on this page; the full set runs in a Corsac workspace.

How do I run this eval?+

Sign up for Corsac, connect your model or agent endpoint, and run the Citations And Source Grounding pack as-is or after customizing thresholds. Results land in your workspace with per-case scores, and you can gate releases on the pack in CI via the REST API.

Run this eval in your workspace

Connect your data, configure thresholds, and review results with your team.