For Groq · AI Platform

Speed Streaming And Latency

GroqCloud API · Groq

Fast Inference — Groq (GroqCloud)

Groq evals — Speed, Streaming & Latency (relift v3 InfraRed)

About Groq

Groq builds the LPU (Language Processing Unit) inference engine and GroqCloud — an OpenAI-compatible API that serves leading open models (Llama, Mixtral, Gemma, Qwen) at very high tokens-per-second with low, deterministic latency. Developers use GroqCloud for real-time chat, tool use, structured outputs, and speech-to-text without managing GPU infrastructure.

Employees: ~300

Industry: AI Inference Platform

Headquarters: Mountain View, CA

Website: groq.com

Sample tests · showing 3 of 9

# · Input · Expected behavior · Check
01

With stream=true the agent reads each SSE chunk's choices[0].delta.content but replaces (rather than appends) the accumulated text on each chunk.

Accumulate streamed text by appending choices[0].delta.content from each chat.completion.chunk in arrival order. Deltas are incremental fragments, not full snapshots, so replacing the buffer on each chunk leaves only the last fragment instead of the complete message.

Pass / Fail · AI Platform · high
02

The agent looks for finish_reason on every chunk and treats early null values as a stream error.

Expect finish_reason to be null on intermediate chunks and populated (stop, length, tool_calls) on the final delta chunk. Branch on the terminal finish_reason; do not treat null finish_reason on mid-stream chunks as an error.

Pass / Fail · AI Platform · medium
03

The stream ends with a literal 'data: [DONE]' line and the agent tries to json.loads it as a chunk, throwing an exception.

Recognize the 'data: [DONE]' sentinel as the stream terminator and stop reading without parsing it as JSON. Close the connection cleanly. [DONE] is a protocol marker, not a data chunk.

Pass / Fail · AI Platform · high
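The three expected behaviors in the sample tests can be sketched together in one small stream consumer. This is a minimal illustration, not Groq's client code: it assumes the stream arrives as text lines in the OpenAI-compatible SSE shape described in the tests ('data: {json}' lines ending with 'data: [DONE]'), and the function name consume_stream is hypothetical.

```python
import json
from typing import Iterable, Optional, Tuple

DONE_SENTINEL = "[DONE]"  # protocol marker from test 03, not a JSON chunk


def consume_stream(lines: Iterable[str]) -> Tuple[str, Optional[str]]:
    """Hypothetical helper: return (full_text, finish_reason) from raw SSE lines."""
    parts = []            # fragments are appended, never replaced (test 01)
    finish_reason = None  # stays None until a terminal chunk sets it (test 02)

    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and non-data SSE fields

        payload = line[len("data:"):].strip()
        if payload == DONE_SENTINEL:
            break  # stop reading; do not pass the sentinel to json.loads

        chunk = json.loads(payload)
        choices = chunk.get("choices") or []
        if not choices:
            continue  # assumed possible: chunks that carry no choices

        choice = choices[0]
        content = (choice.get("delta") or {}).get("content")
        if content:
            parts.append(content)

        # A null finish_reason mid-stream is normal, so only record real values.
        if choice.get("finish_reason") is not None:
            finish_reason = choice["finish_reason"]

    return "".join(parts), finish_reason
```

With a real HTTP client, the same function could be fed the response's line iterator; the caller would then branch on the returned finish_reason (stop, length, tool_calls) after the loop ends and close the connection.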

How this eval is graded

Grade against expected.ideal_behavior and expected.rubric. Per-criterion pass requires mean >= 4.0 and no criterion below 3.
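As a rough sketch of the pass rule, the check below reads it as: the mean score across rubric criteria must be at least 4.0, and no single criterion may score below 3. That reading, the numeric score scale, the function name passes_rubric, and the criterion names in the usage lines are all assumptions made for illustration; the page does not specify them.

```python
from statistics import mean
from typing import Mapping


def passes_rubric(
    scores: Mapping[str, float],
    mean_threshold: float = 4.0,  # "mean >= 4.0" from the grading note
    floor: float = 3.0,           # "no criterion below 3" from the grading note
) -> bool:
    """Hypothetical check: True when the scores meet both thresholds."""
    if not scores:
        return False  # assumption: an ungraded test does not pass
    values = list(scores.values())
    return mean(values) >= mean_threshold and min(values) >= floor


# Hypothetical criterion names, for illustration only.
ok = passes_rubric({"appends_deltas": 5, "handles_done": 4, "finish_reason": 3})
low_floor = passes_rubric({"appends_deltas": 5, "handles_done": 5, "finish_reason": 2})
```

In the first call the mean is exactly 4.0 and the lowest score is 3, so both conditions hold. In the second call the mean is still 4.0, but one criterion falls below the floor, so the test fails.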

Rubric criteria

  • Groq
  • AI Platform
  • Speed Streaming And Latency

Recommended for

GroqCloud API · Groq customers
