Replicate evals — Models, Versions & Schema (relift v3 InfraRed)

About Replicate

Replicate is an AI model-hosting platform that runs thousands of community and custom Cog-packaged models (FLUX, SDXL, Llama, Whisper, custom fine-tunes) through a simple HTTP API with predictions, webhooks, streaming, deployments, and per-second billing.

Employees

~80

Industry

AI Inference Platform

Headquarters

San Francisco, CA

Sample tests · showing 3 of 9

# · Input · Expected behavior · Check
01

Integrator constructs GET /v1/models/stability-ai/sdxl from a user-typed string 'Stability-AI / SDXL '.

Model slugs are case-sensitive in API URLs, so normalize the slug to lowercase owner/name with no whitespace and validate it against the documented charset before sending. Reject ambiguous inputs client-side instead of forwarding them and letting the API return a 404.

Pass / Fail · AI Platform · medium
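The normalization in test 01 could be sketched as follows. This is a minimal illustration, not Replicate's own validation: the character set in `_SLUG_PART` is an assumption standing in for the documented charset, and the helper names `normalize_model_slug` and `model_path` are hypothetical.

```python
import re

# Assumed charset for each slug part: lowercase letters, digits, '.', '_' and '-'.
# This pattern is an illustration; confirm it against the documented charset.
_SLUG_PART = re.compile(r"[a-z0-9][a-z0-9._-]*")


def normalize_model_slug(raw: str) -> str:
    """Return 'owner/name' in lowercase, or raise ValueError if the input is ambiguous."""
    parts = [part.strip().lower() for part in raw.split("/")]
    if len(parts) != 2:
        raise ValueError(f"expected exactly one '/', got {raw!r}")
    for label, value in zip(("owner", "name"), parts):
        # Internal whitespace or stray characters fail here instead of reaching the API.
        if not _SLUG_PART.fullmatch(value):
            raise ValueError(f"invalid {label} {value!r} in {raw!r}")
    return "/".join(parts)


def model_path(raw: str) -> str:
    """Build the request path for GET /v1/models/{owner}/{name}."""
    return f"/v1/models/{normalize_model_slug(raw)}"


print(model_path("Stability-AI / SDXL "))  # /v1/models/stability-ai/sdxl
```

Inputs such as 'stability ai/sdxl' or 'a/b/c' raise ValueError rather than being guessed at, which keeps ambiguous strings from ever becoming a request.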
02

FLUX-pro version 'a1b2c3...' is pinned in production. Replicate publishes version 'd4e5f6...' as latest_version on the model.

Pinned version id continues to serve identically — version ids are immutable. Surface 'newer version available' to the operator out-of-band (e.g., a weekly diff job), but do not auto-upgrade. To roll forward, fetch the new version's schema, run regression evals, then update the pin.

Pass / Fail · AI Platform · critical
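The "surface, but do not auto-upgrade" behavior in test 02 might look like the sketch below. It assumes the model response is already parsed into a dict and that the latest version id sits under `latest_version` → `id`; `PinReport` and `check_pin` are hypothetical names, and the version ids are placeholders.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PinReport:
    pinned_version: str            # the id production keeps using
    latest_version: Optional[str]  # what the model currently advertises
    newer_available: bool          # True only signals the operator; it changes nothing


def check_pin(model: dict, pinned_version: str) -> PinReport:
    """Compare the production pin with the model's latest_version without touching the pin."""
    # Assumed payload shape: {"latest_version": {"id": "..."}}.
    latest = (model.get("latest_version") or {}).get("id")
    newer = latest is not None and latest != pinned_version
    return PinReport(pinned_version, latest, newer)


# Placeholder ids for illustration only.
report = check_pin({"latest_version": {"id": "newer-id"}}, pinned_version="pinned-id")
if report.newer_available:
    print(f"newer version available: {report.latest_version} (still pinned to {report.pinned_version})")
```

The function only reports. Rolling forward stays a separate, deliberate step: fetch the new schema, run regression evals, then update the pin.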
03

GET /v1/models/{owner}/{name}/versions/{version_id} returns OpenAPI input schema with prompt:string, num_inference_steps:integer (1-50), guidance_scale:number (1.0-20.0).

Generate the operator's UI form fields from the OpenAPI schema (min/max/enum, defaults). Re-introspect on every version pin change. Treat the schema as source of truth; do not hardcode field lists in frontend code.

Pass / Fail · AI Platform · high
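For test 03, deriving form fields from the schema rather than hardcoding them could be sketched like this. The sketch assumes the input schema has already been extracted from the version response as a JSON Schema object with a `properties` map. The widget names and the `form_fields` helper are hypothetical, and the example schema only mirrors the three fields named in the test input.

```python
from typing import Any, Dict, List

# Hypothetical mapping from JSON Schema types to UI widget kinds.
_WIDGETS = {"string": "text", "integer": "number", "number": "number", "boolean": "checkbox"}


def form_fields(input_schema: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Derive one form-field description per schema property."""
    required = set(input_schema.get("required", []))
    fields = []
    for name, spec in input_schema.get("properties", {}).items():
        field: Dict[str, Any] = {
            "name": name,
            # An enum becomes a select; otherwise fall back on the declared type.
            "widget": "select" if "enum" in spec else _WIDGETS.get(spec.get("type"), "text"),
            "required": name in required,
        }
        # Copy constraints and defaults straight from the schema.
        for key in ("minimum", "maximum", "default", "enum"):
            if key in spec:
                field[key] = spec[key]
        fields.append(field)
    return fields


EXAMPLE_INPUT_SCHEMA = {
    "type": "object",
    "required": ["prompt"],
    "properties": {
        "prompt": {"type": "string"},
        "num_inference_steps": {"type": "integer", "minimum": 1, "maximum": 50},
        "guidance_scale": {"type": "number", "minimum": 1.0, "maximum": 20.0},
    },
}

for field in form_fields(EXAMPLE_INPUT_SCHEMA):
    print(field)
```

Calling `form_fields` again whenever the version pin changes keeps the form in step with the schema, which is the re-introspection the expected behavior asks for.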

How this eval is graded

Grade against expected.ideal_behavior and expected.rubric. A test passes when the mean score across rubric criteria is >= 4.0 and no single criterion scores below 3.
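One reading of the pass rule above, sketched under assumptions: each rubric criterion receives a numeric score (the scale is not stated here), and a test passes only if the mean is at least 4.0 and the lowest score is at least 3. The function name `eval_passes` and the criterion names are hypothetical.

```python
from statistics import mean
from typing import Dict


def eval_passes(scores: Dict[str, float], mean_threshold: float = 4.0, floor: float = 3.0) -> bool:
    """Pass only if the mean meets the threshold and no criterion falls below the floor."""
    if not scores:
        return False  # nothing graded, so nothing to pass
    values = list(scores.values())
    return mean(values) >= mean_threshold and min(values) >= floor


print(eval_passes({"accuracy": 5, "safety": 4, "clarity": 3}))  # True: mean 4.0, lowest 3
print(eval_passes({"accuracy": 5, "safety": 5, "clarity": 2}))  # False: one criterion below 3
```

The second case shows why the floor matters: the mean still reaches 4.0, yet the single low score fails the test.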

Rubric criteria

  • Replicate
  • AI Platform
  • Models, Versions & Schema

Recommended for

Replicate · Replicate customers
