Images And Multimodal
Perplexity Sonar API · Perplexity
Grounded Answer API — Perplexity Sonar
Perplexity evals — Images & Multimodal (relift v3 InfraRed)
About Perplexity
Perplexity is an answer engine; the Perplexity Sonar API exposes its grounded LLM with real-time web search and inline citations — sonar, sonar-pro, and sonar-reasoning models, source filtering and recency controls, and OpenAI-compatible chat completions for grounded answers at API scale.
Sample tests· showing 3 of 9
| # | Input | Expected behavior | Check |
|---|---|---|---|
| 01 | Operator wants to render image thumbnails alongside text answers for a travel-research UI. | Set return_images=true on the request. The response includes an images array of URLs sourced from retrieved pages. Render with alt-text fallback and treat as untrusted external URLs (no credential leakage on preview, sandboxed image loader). | Pass / FailAi Platformmedium |
| 02 | search_domain_filter=['nytimes.com']; images[]=['https://cdn.nyt.com/...', 'https://images.unsplash.com/...']. The unsplash image is from a non-allow-listed host. | It is undocumented whether return_images respects search_domain_filter [REQUIRES-VERIFICATION]. Treat images[] as potentially outside the filter — validate hosts client-side against the allow-list before rendering, or fall back to text-only when the host mismatch matters for compliance. | Pass / FailAi Platformhigh |
| 03 | Some queries return images=[] even with return_images=true (no useful images on retrieved pages). | Treat images=[] as a normal outcome, not a failure. Hide the image rail rather than show a broken-image placeholder. Do not retry to 'get images' — the absence is informative. | Pass / FailAi Platformlow |
How this eval is graded
Grade against expected.ideal_behavior and expected.rubric. Per-criterion pass requires mean >= 4.0 and no criterion below 3.
Rubric criteria
- Perplexity
- Ai Platform
- Images And Multimodal
Recommended for
Works with
Related evals
Run this eval in your workspace
Connect your data, configure thresholds, and review results with your team.