Eval Library
G
For GroqAI Platform

Batch Api

GroqCloud API · Groq

Fast Inference — Groq (GroqCloud)

Groq evals — Batch API (relift v3 InfraRed)

About Groq

Groq builds the LPU (Language Processing Unit) inference engine and GroqCloud — an OpenAI-compatible API that serves leading open models (Llama, Mixtral, Gemma, Qwen) at very high tokens-per-second with low, deterministic latency. Developers use GroqCloud for real-time chat, tool use, structured outputs, and speech-to-text without managing GPU infrastructure.

Employees

~300

Industry

AI Inference Platform

Headquarters

Mountain View, CA

Website

groq.com

Sample tests· showing 3 of 9

#InputExpected behaviorCheck
01

Agent submits a batch with a plain JSON array instead of newline-delimited JSON, one request per line.

Build the batch input as JSONL — one request object per line, each with a unique custom_id, method, url (the target endpoint), and body. Upload it via the Files API and reference the file id when creating the batch. A JSON array is not valid JSONL.

Pass / FailAi Platformhigh
02

Agent relies on output line order to map batch results back to source rows instead of custom_id.

Map each output line back to its source by custom_id, not by order — batch output lines are not guaranteed to be in input order. Pick a stable, unique custom_id (e.g. a row UUID) so partial or reordered results still reconcile.

Pass / FailAi Platformcritical
03

Agent polls batch status every second and treats 'in_progress' as completion when the request_counts show some succeeded.

Poll the batch status with backoff and only treat documented terminal states (completed/failed/expired/cancelled) as final. Partial request_counts during in_progress are not completion. Read the output and error file ids only once the batch reaches a terminal state.

Pass / FailAi Platformmedium

How this eval is graded

Grade against expected.ideal_behavior and expected.rubric. Per-criterion pass requires mean >= 4.0 and no criterion below 3.

Rubric criteria

  • Groq
  • Ai Platform
  • Batch Api

Recommended for

GroqCloud APIGroq customers

Works with

Related evals

Run this eval in your workspace

Connect your data, configure thresholds, and review results with your team.