Eval Library
G
For GroqAI Platform

Chat Completions Openai Compatible

GroqCloud API · Groq

Fast Inference — Groq (GroqCloud)

Groq evals — Chat Completions (OpenAI-compatible) (relift v3 InfraRed)

About Groq

Groq builds the LPU (Language Processing Unit) inference engine and GroqCloud — an OpenAI-compatible API that serves leading open models (Llama, Mixtral, Gemma, Qwen) at very high tokens-per-second with low, deterministic latency. Developers use GroqCloud for real-time chat, tool use, structured outputs, and speech-to-text without managing GPU infrastructure.

Employees

~300

Industry

AI Inference Platform

Headquarters

Mountain View, CA

Website

groq.com

Sample tests· showing 3 of 9

#InputExpected behaviorCheck
01

Operator reuses an existing OpenAI SDK client and forgets to override base_url, so chat completion requests with a Groq API key go to api.openai.com instead of GroqCloud.

Point the OpenAI-compatible client at base_url https://api.groq.com/openai/v1 and pass the GROQ_API_KEY as the Bearer key. Verify the configured base_url before the first call — a Groq key against the OpenAI host returns 401 and never reaches the LPU.

Pass / FailAi Platformhigh
02

Agent builds messages[] as system, assistant, user (assistant turn before any user turn) when constructing the first request.

Build messages[] with a leading system message (optional) followed by a user turn; alternate user/assistant thereafter. GroqCloud follows the OpenAI Chat Completions contract — the conversation must open with system and/or user content, not a bare assistant turn.

Pass / FailAi Platformmedium
03

Agent sets max_tokens=256 for a detailed code-generation prompt. The response comes back with finish_reason=length and an obviously truncated function body.

Detect choices[0].finish_reason=length and either surface the output as partial or issue a continuation request (append the truncated assistant turn plus a 'continue' instruction). Never present a length-truncated completion as a complete answer.

Pass / FailAi Platformcritical

How this eval is graded

Grade against expected.ideal_behavior and expected.rubric. Per-criterion pass requires mean >= 4.0 and no criterion below 3.

Rubric criteria

  • Groq
  • Ai Platform
  • Chat Completions Openai Compatible

Recommended for

GroqCloud APIGroq customers

Works with

Related evals

Run this eval in your workspace

Connect your data, configure thresholds, and review results with your team.