Function Calling And Tool Use
GroqCloud API · Groq
Fast Inference — Groq (GroqCloud)
Groq evals — Function Calling & Tool Use (relift v3 InfraRed)
About Groq
Groq builds the LPU (Language Processing Unit) inference engine and GroqCloud — an OpenAI-compatible API that serves leading open models (Llama, Mixtral, Gemma, Qwen) at very high tokens-per-second with low, deterministic latency. Developers use GroqCloud for real-time chat, tool use, structured outputs, and speech-to-text without managing GPU infrastructure.
Sample tests· showing 3 of 9
| # | Input | Expected behavior | Check |
|---|---|---|---|
| 01 | Agent declares a get_weather tool but puts the JSON Schema directly at the top level of the tool object instead of under function.parameters. | Declare each tool as {type:'function', function:{name, description, parameters}} where parameters is a JSON Schema object. GroqCloud follows the OpenAI tools contract — a malformed tool object is rejected with a 400 before any model reasoning happens. | Pass / FailAi Platformhigh |
| 02 | Assistant turn returns tool_calls with id call_A and call_B. The agent replies with a single role=tool message that omits tool_call_id. | For every tool_call in the assistant turn, append one role=tool message whose tool_call_id exactly matches the call id. Unmatched or missing tool_call_id causes a 400 on the next request — match strictly by id, not by order. | Pass / FailAi Platformcritical |
| 03 | Two tools are defined — get_weather (current conditions) and get_forecast (multi-day) — with near-identical descriptions, and the model routes 'will it rain tomorrow?' to the wrong one. | Write distinct, action-grounded tool descriptions so the model can route correctly (get_weather: 'current conditions only; for future dates use get_forecast'). Verify the descriptions encode the disambiguation rather than relying on tool order. | Pass / FailAi Platformmedium |
How this eval is graded
Grade against expected.ideal_behavior and expected.rubric. Per-criterion pass requires mean >= 4.0 and no criterion below 3.
Rubric criteria
- Groq
- Ai Platform
- Function Calling And Tool Use
Recommended for
Works with
Related evals
Run this eval in your workspace
Connect your data, configure thresholds, and review results with your team.