Replicate evals — Fine-tuning (Replicate Train) (relift v3 InfraRed)

About Replicate

Replicate is an AI model-hosting platform for running thousands of community and custom Cog-packaged models (FLUX, SDXL, Llama, Whisper, custom fine-tunes) through an HTTP API with predictions, webhooks, streaming, deployments, and per-second billing.

Employees: ~80

Industry: AI Inference Platform

Headquarters: San Francisco, CA

Sample tests · showing 3 of 9

# | Input | Expected behavior | Check
01

Integrator submits POST /v1/trainings with version=<sdxl-trainer-version>, input={training_data: <zip-url>, ...}, destination='myorg/sdxl-brand-lora'.

Specify destination as an existing model, in owner/name form, that the API token has write access to. On success, the training writes a new, immutable version into that model. Retain the training ID for audit. Choose the destination model before training, not after: the model must already exist.

Pass / Fail · AI Platform · high
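The destination rule in test 01 can be checked on the client before anything is submitted. The following is a minimal sketch, not Replicate's own validation: `build_training_request` and `audit_record` are hypothetical helpers, the owner/name pattern is an assumed approximation of the real naming rules, and the payload simply mirrors the fields named in the test input. Sending the request and confirming that the model exists are left out.

```python
import re

# Assumed shape of an "owner/name" destination; the platform's real
# naming rules may be stricter or looser than this pattern.
_DESTINATION_RE = re.compile(
    r"^[A-Za-z0-9][A-Za-z0-9._-]*/[A-Za-z0-9][A-Za-z0-9._-]*$"
)


def build_training_request(version: str, training_data: str, destination: str) -> dict:
    """Build a request body from the fields named in test 01.

    Raises ValueError when destination is not in owner/name form.
    This only checks the format; it cannot confirm that the model
    exists or that the token has write access to it.
    """
    if not _DESTINATION_RE.fullmatch(destination):
        raise ValueError(
            f"destination must be an existing model in owner/name form, got {destination!r}"
        )
    return {
        "version": version,
        "input": {"training_data": training_data},
        "destination": destination,
    }


def audit_record(training_response: dict, destination: str) -> dict:
    """Keep the training ID next to its destination for later audit.

    Assumes the create response carries the ID under the key "id".
    """
    return {"training_id": training_response["id"], "destination": destination}
```

Failing fast on a malformed destination keeps the error close to the integrator's code instead of surfacing later as a rejected training.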
02

Integrator passes training_data='https://my-private-bucket.s3.amazonaws.com/dataset.zip' with no auth in the URL.

training_data URLs must be fetchable by Replicate workers without extra credentials: either publicly readable or presigned. Put any authentication in the URL itself, not in headers, because the training worker cannot send custom headers. Do not host the dataset on internal IPs. For sensitive datasets, prefer uploading through the Replicate-hosted Files API.

Pass / Fail · AI Platform · critical
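A preflight check along the lines of test 02 might look like the sketch below. It is a heuristic under stated assumptions, not part of any Replicate client: `check_training_data_url` is a hypothetical helper, and it treats an S3 URL with no signature query parameter as a warning because the client cannot tell whether the object is public.

```python
import ipaddress
from urllib.parse import parse_qs, urlparse

# Query parameters that carry the signature in AWS presigned URLs
# (SigV4 and the older SigV2 form).
_PRESIGN_KEYS = {"X-Amz-Signature", "Signature"}


def check_training_data_url(url: str) -> list:
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    parts = urlparse(url)
    host = parts.hostname or ""

    if parts.scheme not in ("http", "https"):
        problems.append("URL must use http or https")
    if not host:
        problems.append("URL has no host")

    # Literal internal addresses cannot be reached by a remote worker.
    try:
        ip = ipaddress.ip_address(host)
        if ip.is_private or ip.is_loopback or ip.is_link_local:
            problems.append("host is an internal IP address")
    except ValueError:
        if host == "localhost":
            problems.append("host is localhost")

    # Heuristic: an S3 URL without a signature only works for public objects.
    if host.endswith("amazonaws.com"):
        if not _PRESIGN_KEYS & set(parse_qs(parts.query)):
            problems.append("S3 URL is not presigned; fetch will fail unless the object is public")

    return problems
```

Running it on the URL from the test input returns the single presigning warning, which is the condition the eval marks as critical.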
03

Training is created with status=starting. The integrator polls every 5 s.

Poll GET /v1/trainings/{id} with exponential backoff (start at 30 s, cap at 5 min), since trainings run for minutes to hours. Prefer webhooks for delivery of the terminal status. Do not poll more often than every 30 s, because training progress changes slowly.

Pass / Fail · AI Platform · medium
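The backoff schedule in test 03 (start 30 s, cap 5 min) can be sketched as a generator plus a small polling loop. This is an illustration under assumptions: the doubling factor is a choice the source does not specify, `fetch_status` is a hypothetical callable standing in for the GET request, and the set of terminal statuses is assumed.

```python
import itertools
import time
from typing import Callable, Iterator

# Assumed terminal statuses for a training.
TERMINAL_STATUSES = {"succeeded", "failed", "canceled"}


def backoff_delays(start: float = 30.0, cap: float = 300.0, factor: float = 2.0) -> Iterator[float]:
    """Yield delays in seconds that grow by `factor` and never exceed `cap`."""
    delay = start
    while True:
        yield min(delay, cap)
        delay = min(delay * factor, cap)


def poll_until_terminal(
    fetch_status: Callable[[], str],
    sleep: Callable[[float], None] = time.sleep,
    max_polls: int = 100,
) -> str:
    """Call fetch_status until it returns a terminal status.

    Sleeps between polls following backoff_delays(); raises TimeoutError
    after max_polls attempts without a terminal status.
    """
    for delay in itertools.islice(backoff_delays(), max_polls):
        status = fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        sleep(delay)
    raise TimeoutError("training did not reach a terminal status")
```

With the defaults the delays run 30, 60, 120, 240, 300, 300, and so on. Passing `sleep` as a parameter lets tests run without waiting; a webhook remains the preferred way to learn the terminal status, with polling as a fallback.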

How this eval is graded

Grade against expected.ideal_behavior and expected.rubric. A pass requires a mean score of at least 4.0 across criteria, with no individual criterion scoring below 3.
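The pass rule above can be written as a short function. This is a sketch of one reading of the rule, in which the mean is taken across all criterion scores; `passes_eval` is a hypothetical name, and the score scale is not stated in the source, so the example values are illustrative only.

```python
from statistics import fmean


def passes_eval(scores: dict, mean_threshold: float = 4.0, floor: float = 3.0) -> bool:
    """Return True when the mean score meets the threshold and no score is below the floor.

    `scores` maps a criterion name to its numeric score.
    """
    if not scores:
        raise ValueError("at least one criterion score is required")
    values = list(scores.values())
    return fmean(values) >= mean_threshold and min(values) >= floor
```

For example, scores of 5, 4, and 3 pass (mean 4.0, minimum 3), while 5, 5, and 2 fail on the floor even though the mean is still 4.0.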

Rubric criteria

  • Replicate
  • AI Platform
  • Fine Tuning Replicate Train

Recommended for

Replicate customers
