Eval Library
For Replicate · AI Platform

Cog And Custom Model Push

AI Model Hosting — Replicate

Replicate evals — Cog & Custom Model Push (relift v3 InfraRed)

About Replicate

Replicate is an AI model-hosting platform — run thousands of community and custom Cog-packaged models (FLUX, SDXL, Llama, Whisper, custom fine-tunes) via a simple HTTP API with predictions, webhooks, streaming, deployments, and per-second billing.

Employees

~80

Industry

AI Inference Platform

Headquarters

San Francisco, CA

Sample tests · showing 3 of 9

# · Input · Expected behavior · Check
01

Integrator's cog.yaml declares build.gpu=true but does not specify a GPU class. Push succeeds but predictions OOM on the assigned tier.

Match the predict.py memory footprint to the documented per-tier VRAM (T4 16GB, A40 48GB, A100 40/80GB, H100 80GB). Declare the target hardware tier on the model in the Replicate UI (or via the API) — build.gpu=true is necessary but not sufficient. Test with a representative input before shipping.

Pass / Fail · AI Platform · high
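
The VRAM check in test 01 can be sketched as a small pre-push helper. This is a minimal illustration, not part of Cog or the Replicate API: the tier labels, the `tiers_that_fit` function, and the 10% headroom are assumptions, and only the VRAM figures come from the test text. The peak-memory number is something you would measure yourself, for example with `torch.cuda.max_memory_allocated()` on a PyTorch model run against a representative input.

```python
# VRAM per tier in GB, as listed in the expected behavior above.
# The keys are illustrative labels, not official Replicate hardware identifiers.
TIER_VRAM_GB = {
    "T4": 16,
    "A40": 48,
    "A100-40GB": 40,
    "A100-80GB": 80,
    "H100": 80,
}


def tiers_that_fit(peak_gb: float, headroom: float = 0.1) -> list[str]:
    """Return the tiers whose VRAM covers peak_gb while keeping `headroom` free.

    headroom=0.1 reserves 10% of VRAM for allocator overhead; this margin is
    an assumption, so tune it against real measurements.
    """
    return [
        name
        for name, vram in TIER_VRAM_GB.items()
        if peak_gb <= vram * (1 - headroom)
    ]


if __name__ == "__main__":
    # A model peaking at 15 GB does not fit a T4 once headroom is reserved.
    print(tiers_that_fit(15.0))
```

An empty result means no listed tier fits, which is a signal to shrink the model (quantization, smaller batch) before pushing.
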
02

Custom Llama-7B model loads weights into GPU memory on every predict() call, adding 18 s per request.

Load weights once in setup() (called on container boot) and hold them in self.model. predict() should be a hot-path call against the warm model. Cold-start cost amortizes across all subsequent predictions on that container. [REQUIRES-VERIFICATION] for current container-lifetime caps.

Pass / Fail · AI Platform · critical
03

Long-running custom model wants to emit progress (denoising step k of N) to the prediction.logs field.

Write progress lines with print() or the logging module; Cog captures them, so they appear in prediction.logs and stream over SSE/webhooks for clients that opt into the logs event filter. Do not stuff progress into output: output is the model's final payload.

Pass / Fail · AI Platform · medium
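
As a rough sketch of test 03, progress goes to the log stream and only the final payload is returned. `run_denoising` and its `log` parameter are hypothetical names; `log` defaults to `print`, which is what would land in prediction.logs inside a Cog predictor, and is injectable so the behavior can be checked without capturing stdout.

```python
from typing import Callable


def run_denoising(num_steps: int, log: Callable[[str], None] = print) -> str:
    """Emit one progress line per step, then return the final payload."""
    for k in range(1, num_steps + 1):
        # ... one denoising step would run here ...
        log(f"denoising step {k} of {num_steps}")
    # Only the final result is returned; progress never goes into the output.
    return "final-output"


if __name__ == "__main__":
    result = run_denoising(3)
    print(result)
```

Inside a real predict() the same loop body would sit around the model's step function, with the return value replaced by the generated artifact.
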

How this eval is graded

Grade against expected.ideal_behavior and expected.rubric. Per-criterion pass requires mean >= 4.0 and no criterion below 3.
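
One possible reading of the pass rule, as a minimal sketch: each rubric criterion gets a numeric score, and a test passes when the mean across criteria is at least 4.0 and no single score is below 3. The function name `passes` and the choice to average across criteria are assumptions; the page does not say how scores are aggregated or what scale is used.

```python
from statistics import mean


def passes(scores: list[float], mean_threshold: float = 4.0, floor: float = 3) -> bool:
    """Apply the stated rule: mean >= 4.0 and no criterion below 3.

    Assumes `scores` holds one score per rubric criterion. An empty list
    fails, since there is nothing to grade.
    """
    if not scores:
        return False
    return mean(scores) >= mean_threshold and min(scores) >= floor


if __name__ == "__main__":
    print(passes([5, 4, 3]))  # mean is 4.0 and the lowest score is 3
    print(passes([5, 5, 2]))  # mean is 4.0 but one criterion is below 3
```
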

Rubric criteria

  • Replicate
  • AI Platform
  • Cog And Custom Model Push

Recommended for

Replicate customers
