Fine Tuning Replicate Train
Replicate evals — Fine-tuning (Replicate Train) (relift v3 InfraRed)
About Replicate
Replicate is an AI model-hosting platform for running thousands of community and custom Cog-packaged models (FLUX, SDXL, Llama, Whisper, custom fine-tunes) through a simple HTTP API with predictions, webhooks, streaming, deployments, and per-second billing.
Sample tests · showing 3 of 9
| # | Input | Expected behavior | Check | Category | Severity |
|---|---|---|---|---|---|
| 01 | Integrator submits POST /v1/trainings with version=<sdxl-trainer-version>, input={training_data: <zip-url>, ...}, destination='myorg/sdxl-brand-lora'. | Specify destination as an existing model (owner/name) that the token has write access to. The model must exist before training starts; it cannot be chosen afterward. On success, the training writes a new, immutable version into that model. Retain the training_id for audit. | Pass / Fail | AI Platform | high |
| 02 | Integrator passes training_data='https://my-private-bucket.s3.amazonaws.com/dataset.zip' with no auth in the URL. | training_data URLs must be fetchable from Replicate workers: either public or presigned. Bake authentication into the URL, not headers, because the training worker cannot send custom headers. Do not host on internal IPs. For sensitive datasets, prefer Replicate-hosted Files API uploads. | Pass / Fail | AI Platform | critical |
| 03 | Training created with status=starting. Polling cadence 5 s. | Poll GET /v1/trainings/{id} with exponential backoff (start at 30 s, cap at 5 min), since trainings run for minutes to hours and progress changes slowly. Do not poll more often than every 30 s. Prefer webhooks for terminal delivery. | Pass / Fail | AI Platform | medium |
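The three sample tests can be sketched as small client-side helpers. This is a minimal illustration and not Replicate's client library: the function names are hypothetical, the request body simply mirrors the fields named in test 01, the URL check is a static pre-flight that cannot tell whether a bucket is actually readable, and the set of terminal statuses is an assumption (the table only mentions `starting`).

```python
import ipaddress
from itertools import islice
from urllib.parse import urlparse

# Assumption: terminal status names; the table above only mentions "starting".
TERMINAL_STATUSES = {"succeeded", "failed", "canceled"}


def build_training_body(version: str, training_data_url: str, destination: str) -> dict:
    """Build a JSON body shaped like the request in test 01 (hypothetical helper)."""
    owner, sep, name = destination.partition("/")
    if not (owner and sep and name) or "/" in name:
        raise ValueError("destination must be an existing model written as 'owner/name'")
    return {
        "version": version,
        "input": {"training_data": training_data_url},
        "destination": destination,
    }


def url_is_plausibly_fetchable(url: str) -> bool:
    """Static pre-check for test 02: an http(s) URL that is not an internal address.

    A True result does not prove the file is readable; only a public or
    presigned URL will actually download from the training worker.
    """
    parsed = urlparse(url)
    host = parsed.hostname
    if parsed.scheme not in ("http", "https") or not host:
        return False
    if host == "localhost":
        return False
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        return True  # a DNS name; reachability is not checked here
    return not (ip.is_private or ip.is_loopback or ip.is_link_local)


def poll_delays(start: float = 30.0, cap: float = 300.0, factor: float = 2.0):
    """Yield polling delays in seconds for test 03: start at 30 s, cap at 5 min."""
    delay = start
    while True:
        yield min(delay, cap)
        delay = min(delay * factor, cap)


def first_delays(n: int) -> list:
    """Return the first n delays, handy for inspecting the schedule."""
    return list(islice(poll_delays(), n))


def is_terminal(status: str) -> bool:
    """True once polling can stop (status names are an assumption, see above)."""
    return status in TERMINAL_STATUSES
```

A caller would sleep for each value from `poll_delays()` between GET /v1/trainings/{id} requests and stop once `is_terminal` returns True. Where webhooks are configured, the polling loop serves only as a fallback.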
How this eval is graded
Grade each response against expected.ideal_behavior and expected.rubric. A pass requires a mean score of at least 4.0 across criteria and no individual criterion below 3.
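The pass rule can be written as a short check. This is one reading of the thresholds, assuming each criterion receives a single numeric score; the function name and the criterion keys in the usage note are hypothetical, and the scoring scale is not stated in the source.

```python
from statistics import mean


def eval_passes(scores: dict, mean_threshold: float = 4.0, floor: float = 3.0) -> bool:
    """Return True when the mean score meets the threshold and no criterion is below the floor."""
    if not scores:
        raise ValueError("at least one criterion score is required")
    values = list(scores.values())
    return mean(values) >= mean_threshold and min(values) >= floor
```

For example, `eval_passes({"destination": 5, "data_url": 4, "polling": 3})` meets both conditions, while raising one score to 5 and dropping another to 2 keeps the same mean but fails the floor.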
Rubric criteria
- Replicate
- AI Platform
- Fine Tuning Replicate Train