Eval Library
XA
For xAIAI Platform

Live Search And Deepsearch

xAI API (Grok) · xAI

Foundation Model & API — xAI (Grok)

xAI evals — Live Search / DeepSearch (relift v3 InfraRed)

About xAI

xAI builds the Grok foundation-model family and the xAI API — OpenAI-compatible chat completions, function calling, Live Search / DeepSearch real-time web grounding, Grok Vision multimodal inputs, reasoning models with a thinking-effort budget, and Grok / Aurora image generation.

Employees

~1,000

Industry

Foundation Model

Headquarters

Palo Alto, CA

Website

x.ai

Sample tests· showing 3 of 9

#InputExpected behaviorCheck
01

Agent enables real-time web grounding via search_parameters={mode:'on', sources:[...]} on /v1/chat/completions. Operator wants Grok to ground its answer in live search results.

Pass search_parameters at the top level (not inside messages) with mode set to 'on' (force), 'auto' (Grok decides), or 'off' (disable). [REQUIRES-VERIFICATION] for exact field names/values against docs.x.ai/api. Verify the response includes citation metadata before relying on grounding.

Pass / FailAi Platformhigh
02

Enterprise operator wants Grok grounded only in approved domains (company docs, regulator sites) and explicitly excludes social media.

Use search_parameters.sources with an allow-list and/or excluded_websites deny-list per docs.x.ai/api. [REQUIRES-VERIFICATION] on per-source-type (web/news/x/rss) shape. Test that disallowed domains never appear in returned citations.

Pass / FailAi Platformcritical
03

Response includes citations (URLs) in a citations[] field. Agent's UI shows only the answer text and hides the source list.

Render each citation as a clickable link adjacent to the supported claim. If the answer is presented without sources, mark it 'ungrounded' or warn. Preserve citation order as returned — it conveys relevance signal.

Pass / FailAi Platformhigh

How this eval is graded

Grade against expected.ideal_behavior and expected.rubric. Per-criterion pass requires mean >= 4.0 and no criterion below 3.

Rubric criteria

  • Xai
  • Ai Platform
  • Live Search And Deepsearch

Recommended for

xAI API (Grok)xAI customers

Works with

Related evals

Run this eval in your workspace

Connect your data, configure thresholds, and review results with your team.