Replicate evals — Predictions API (relift v3 InfraRed)

About Replicate

Replicate is an AI model-hosting platform that runs thousands of community and custom Cog-packaged models (FLUX, SDXL, Llama, Whisper, custom fine-tunes) through a simple HTTP API with predictions, webhooks, streaming, deployments, and per-second billing.

Employees: ~80

Industry: AI Inference Platform

Headquarters: San Francisco, CA

Sample tests · showing 3 of 9

# · Input · Expected behavior · Check
01

Integrator calls POST /v1/predictions with body {version: 'a1b2c3...', input: {...}} for an SDXL prediction. A teammate proposes switching to the model-slug form POST /v1/models/stability-ai/sdxl/predictions to 'always get the la…

Pin the immutable version ID for production traffic so behavior is reproducible. Use the model-slug form only for exploratory work or when you have an explicit auto-upgrade policy: latest_version moves on every upstream push and can silently change input schema, output shape, or safety behavior.

Pass / Fail · AI Platform · critical
02

Integrator wants a synchronous response from a fast FLUX-schnell prediction and sends header Prefer: wait=30.

Use Prefer: wait=<seconds> (up to ~60s) for short predictions where a single HTTP response is simpler than polling. If the prediction does not finish within the wait, Replicate returns the prediction in a non-terminal state and the client must fall back to polling /v1/predictions/{id} or webhooks —…

Pass / Fail · AI Platform · high
03

Integrator submits an image-to-image SDXL prediction with a 4 MB input image base64-encoded as a data: URI in input.image.

For inputs above ~256 KB, upload via POST /v1/files and pass the returned URL in input.image. Inline data: URIs balloon the request body, slow upload, and may exceed per-request size caps. Small images (<256 KB) can stay inline.

Pass / Fail · AI Platform · medium
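The three expected behaviors above can be sketched as small request-building helpers. This is a minimal illustration, not Replicate's client library: the helper names are hypothetical, the API host and the terminal status names are assumptions, the 60-second cap mirrors the approximate "up to ~60s" figure in test 02, and treating exactly 256 KB as still inline is an arbitrary choice. Nothing here sends a network request.

```python
import base64

# Assumed API host; the /v1/... paths come from the sample tests above.
API_BASE = "https://api.replicate.com/v1"
MAX_WAIT_SECONDS = 60             # approximate cap from test 02
INLINE_LIMIT_BYTES = 256 * 1024   # approximate threshold from test 03
# Assumed names for terminal prediction states.
TERMINAL_STATUSES = {"succeeded", "failed", "canceled"}


def prediction_request(input_payload, version=None, model_slug=None, wait_seconds=None):
    """Build (url, headers, body) for a create-prediction call without sending it."""
    if (version is None) == (model_slug is None):
        raise ValueError("pass exactly one of version or model_slug")
    headers = {"Content-Type": "application/json"}
    if wait_seconds is not None:
        # Test 02: ask for a synchronous response, clamped to the cap.
        headers["Prefer"] = f"wait={min(wait_seconds, MAX_WAIT_SECONDS)}"
    if version is not None:
        # Test 01: pinned version id, reproducible for production traffic.
        return f"{API_BASE}/predictions", headers, {"version": version, "input": input_payload}
    # Test 01: model-slug form, which follows latest_version.
    return f"{API_BASE}/models/{model_slug}/predictions", headers, {"input": input_payload}


def needs_polling(prediction):
    """True when a response is non-terminal, so the client must poll or use webhooks."""
    return prediction.get("status") not in TERMINAL_STATUSES


def image_input_plan(image_bytes, mime="image/png"):
    """Test 03: choose between an inline data: URI and a separate file upload."""
    if len(image_bytes) > INLINE_LIMIT_BYTES:
        # Caller should upload via POST /v1/files and pass the returned URL.
        return "upload", None
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return "inline", f"data:{mime};base64,{encoded}"
```

A caller would check needs_polling on the response to a Prefer: wait request and fall back to polling /v1/predictions/{id} when it returns True.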

How this eval is graded

Grade each response against expected.ideal_behavior and expected.rubric. A test passes when the mean criterion score is at least 4.0 and no single criterion scores below 3.
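One reading of the pass rule above can be written as a short check. This is a sketch under assumptions: the scores are taken to be numeric ratings per rubric criterion (the scale is not stated in the source), the mean is taken across criteria, and the function and criterion names are hypothetical.

```python
from statistics import mean


def passes(scores, mean_threshold=4.0, floor=3):
    """Return True when the mean score meets the threshold and no criterion is below the floor.

    `scores` maps a criterion name to its numeric score (assumed layout).
    """
    if not scores:
        raise ValueError("at least one criterion score is required")
    values = list(scores.values())
    return mean(values) >= mean_threshold and min(values) >= floor
```

For example, scores of 5, 4, and 3 pass (mean 4.0, minimum 3), while 5, 5, and 2 fail on the floor even though the mean is also 4.0.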

Rubric criteria

  • Replicate
  • AI Platform
  • Predictions API

Recommended for

Replicate customers
