For OpenAI · AI Platform

Structured Outputs and JSON Schema

OpenAI API · OpenAI

Foundation Model & API — OpenAI (GPT)

OpenAI evals — Structured Outputs & JSON Schema (relift v3 InfraRed)

About OpenAI

OpenAI builds the GPT model family and the OpenAI API — Responses and Chat Completions, function calling, Structured Outputs, embeddings, fine-tuning, the Batch API, moderation, the Realtime API, and the Agents SDK — used by developers to build AI products at scale.

Employees

~3,000

Industry

Foundation Model

Headquarters

San Francisco, CA

Website

openai.com

Sample tests · showing 3 of 9

01

Input: An agent needs guaranteed-parseable JSON for a downstream DB write and uses response_format json_schema with strict:true.

Expected behavior: With strict:true the output conforms to the supplied JSON Schema, but json.loads should still be wrapped in error handling for refusal cases. Verify that every property is listed in required and that additionalProperties:false is set, as strict mode requires.

Check: Pass / Fail · AI Platform · high

02

Input: The schema enum for status is [open, closed]; the downstream DB also allows 'pending', which the model can never emit.

Expected behavior: Keep the schema enum in sync with the downstream domain, or constrain it intentionally and document that the model cannot produce values outside the enum. Surface a mapping if the narrowing is deliberate.

Check: Pass / Fail · AI Platform · medium

03

Input: The TS/Python SDK offers a parse helper that validates against the schema type; the team hand-rolls json.loads instead.

Expected behavior: Prefer the SDK's structured parse helper, which returns typed objects and surfaces refusals; hand-rolled parsing loses the refusal path and type safety.

Check: Pass / Fail · AI Platform · low
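
The checks in tests 01 and 02 can be sketched offline, without calling the API. The following is a minimal illustration, not an official validator: the function names, the TICKET_SCHEMA example, and the plain-dict message shape (with refusal and content keys, mirroring a Chat Completions message) are assumptions made for this sketch. It only handles schemas whose type is the plain string "object" or "array", and ignores anyOf, $defs, and nullable type lists.

```python
import json

# Hypothetical example schema for a ticket record (illustration only).
TICKET_SCHEMA = {
    "type": "object",
    "properties": {
        "title": {"type": "string"},
        "status": {"type": "string", "enum": ["open", "closed"]},
    },
    "required": ["title", "status"],
    "additionalProperties": False,
}


def strict_schema_problems(schema, path="$"):
    """Return strict-mode problems found in a JSON Schema dict.

    Checks two rules only: every object sets additionalProperties to false,
    and every declared property is listed in required.
    """
    problems = []
    if schema.get("type") == "object":
        props = schema.get("properties", {})
        if schema.get("additionalProperties") is not False:
            problems.append(f"{path}: additionalProperties must be false")
        missing = sorted(set(props) - set(schema.get("required", [])))
        if missing:
            problems.append(f"{path}: not listed in required: {missing}")
        for name, sub in props.items():
            problems.extend(strict_schema_problems(sub, f"{path}.{name}"))
    elif schema.get("type") == "array" and isinstance(schema.get("items"), dict):
        problems.extend(strict_schema_problems(schema["items"], f"{path}[]"))
    return problems


def unreachable_values(schema_enum, downstream_domain):
    """Values the downstream store accepts but the model can never emit."""
    return sorted(set(downstream_domain) - set(schema_enum))


def parse_message(message):
    """Parse an assistant message dict and return (data, error).

    A refusal is reported as an error instead of being passed to json.loads.
    """
    if message.get("refusal"):
        return None, f"refusal: {message['refusal']}"
    try:
        return json.loads(message.get("content") or ""), None
    except json.JSONDecodeError as exc:
        return None, f"invalid JSON: {exc.msg}"


if __name__ == "__main__":
    print(strict_schema_problems(TICKET_SCHEMA))
    print(unreachable_values(["open", "closed"], ["open", "closed", "pending"]))
    print(parse_message({"refusal": None, "content": '{"title": "x", "status": "open"}'}))
```

Running the schema check in CI, before the schema is ever sent to the API, would catch a missing required entry early; the enum comparison makes a deliberate narrowing visible rather than silent.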

How this eval is graded

Grade each response against expected.ideal_behavior and expected.rubric. A test passes only if the mean score across the rubric criteria is at least 4.0 and no individual criterion scores below 3.
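
One reading of the pass rule above, as a small sketch: the mean of the per-criterion scores must reach 4.0 and the lowest score must be at least 3. The function name, the constant names, and the dict-of-scores input are hypothetical; the scoring scale itself is not stated in the source.

```python
from statistics import mean

MEAN_THRESHOLD = 4.0   # minimum mean score across criteria
CRITERION_FLOOR = 3    # no single criterion may score below this


def passes(scores):
    """Return True if a mapping of criterion name to score meets both thresholds."""
    if not scores:
        raise ValueError("at least one criterion score is required")
    values = list(scores.values())
    return mean(values) >= MEAN_THRESHOLD and min(values) >= CRITERION_FLOOR


if __name__ == "__main__":
    print(passes({"criterion_a": 5, "criterion_b": 4, "criterion_c": 3}))
    print(passes({"criterion_a": 5, "criterion_b": 5, "criterion_c": 2}))
```

The floor matters independently of the mean: a response can average 4.0 and still fail because one criterion scored 2.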

Rubric criteria

  • OpenAI
  • AI Platform
  • Structured Outputs and JSON Schema

Recommended for

OpenAI API · OpenAI customers
