
Function Calling And Tool Orchestration
OpenAI API · OpenAI
Foundation Model & API — OpenAI (GPT)
OpenAI evals — Function Calling & Tool Orchestration (relift v3 InfraRed)
About OpenAI
OpenAI builds the GPT model family and the OpenAI API — Responses and Chat Completions, function calling, Structured Outputs, embeddings, fine-tuning, the Batch API, moderation, the Realtime API, and the Agents SDK — used by developers to build AI products at scale.
Sample tests · showing 3 of 9
| # | Input | Expected behavior | Check |
|---|---|---|---|
| 01 | Workflow requires the next turn to call some tool (no plain text); operator wants to force a tool call. | Set tool_choice='required' so the model must call one of the provided tools. To force a specific tool, use tool_choice={type:'function',function:{name:...}}. Handle the returned tool_calls[]. | Pass / Fail · AI Platform · high |
| 02 | Assistant turn returns two tool_calls (get_weather SF, get_weather NYC), each with a distinct id; agent executes both. | Return one role=tool message per tool_call_id, matched exactly by id, before the next model turn. Every tool_call must be answered. | Pass / Fail · AI Platform · critical |
| 03 | Two tools: search_orders (by order id) and search_customers (by email). User asks 'find the order for jane@acme.com'. | Descriptions must encode routing: search_orders should state that it needs an order id, and search_customers that it resolves a customer by email first. Verify the model can chain customer→order rather than guessing. | Pass / Fail · AI Platform · medium |
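
The three sample tests above can be sketched without any network calls. This is a minimal illustration, not a reference implementation. It assumes tool calls arrive as plain dicts mirroring the Chat Completions JSON shape (`id`, `function.name`, `function.arguments` as a JSON string). The helper names `build_tool_choice` and `answer_tool_calls`, the parameter names `order_id` and `email`, and the description wording are all hypothetical.

```python
import json
from typing import Any, Callable, Dict, List, Optional

# Test 03: descriptions that encode routing. Wording and parameter
# names (order_id, email) are assumptions made for illustration.
TOOLS: List[Dict[str, Any]] = [
    {
        "type": "function",
        "function": {
            "name": "search_orders",
            "description": "Look up an order. Requires an order id. If only "
                           "an email is known, call search_customers first.",
            "parameters": {
                "type": "object",
                "properties": {"order_id": {"type": "string"}},
                "required": ["order_id"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "search_customers",
            "description": "Resolve a customer by email. Use this first "
                           "when the user gives an email, not an order id.",
            "parameters": {
                "type": "object",
                "properties": {"email": {"type": "string"}},
                "required": ["email"],
            },
        },
    },
]


def build_tool_choice(name: Optional[str] = None) -> Any:
    """Test 01: 'required' forces some tool call; a name forces that tool."""
    if name is None:
        return "required"
    return {"type": "function", "function": {"name": name}}


def answer_tool_calls(
    tool_calls: List[Dict[str, Any]],
    handlers: Dict[str, Callable[..., Any]],
) -> List[Dict[str, str]]:
    """Test 02: build exactly one role=tool message per tool call id."""
    messages = []
    for call in tool_calls:
        handler = handlers[call["function"]["name"]]
        # Arguments arrive as a JSON-encoded string, not a dict.
        args = json.loads(call["function"]["arguments"])
        result = handler(**args)
        messages.append(
            {
                "role": "tool",
                "tool_call_id": call["id"],  # must match the call's id exactly
                "content": json.dumps(result),
            }
        )
    return messages
```

In use, the messages returned by `answer_tool_calls` would be appended after the assistant turn that issued the calls, and before the next model request, so that no `tool_call_id` is left unanswered.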
How this eval is graded
Grade against expected.ideal_behavior and expected.rubric. Passing requires a mean score of at least 4.0 across the rubric criteria, with no criterion scoring below 3.
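
The pass rule can be expressed as a small check. This sketch assumes one numeric score per rubric criterion and takes the mean across criteria. The function name `passes` and the example scores are hypothetical, and the scoring scale is not stated in the source.

```python
from statistics import mean
from typing import Mapping


def passes(
    scores: Mapping[str, float],
    min_mean: float = 4.0,
    min_score: float = 3.0,
) -> bool:
    """Return True when the mean is >= min_mean and no score is below min_score."""
    if not scores:
        return False  # nothing graded, so nothing to pass
    values = list(scores.values())
    return mean(values) >= min_mean and min(values) >= min_score


# Example scores are made up for illustration.
example = {
    "OpenAI": 4.5,
    "AI Platform": 4.0,
    "Function Calling And Tool Orchestration": 4.0,
}
print(passes(example))
```

A single low criterion fails the eval even when the mean is high, which is the point of the floor: scores of 5, 5 and 2 average exactly 4.0 but do not pass.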
Rubric criteria
- OpenAI
- AI Platform
- Function Calling And Tool Orchestration