
Predict Sync And Async
AI Model Serving — Baseten
Baseten evals — Predict (Sync + Async) (relift v3 InfraRed)
About Baseten
Baseten is a model serving platform that lets ML teams deploy, scale, and monitor any model — including custom fine-tunes and private weights — with production-grade autoscaling and GPU infrastructure. It supports both synchronous and asynchronous inference patterns.
Sample tests · showing 3 of 9
| # | Input | Expected behavior | Check |
|---|---|---|---|
| 01 | Client POSTs to https://model-<id>.api.baseten.co/production/predict with the model's documented input JSON. The deployment is warm. Response is the model's raw output JSON (not wrapped in {data:...}). | Parse the response directly as the model's output schema; do not assume a Baseten-injected envelope. Status 200 means inference succeeded. A non-2xx status means the request failed at the platform layer (queueing, autoscaler, replica crash), and the response body carries an error object with code and message. | Pass / Fail · AI Platform · high |
| 02 | Client POSTs to /production/async_predict with a body containing the model input and webhook_endpoint. Response is immediate, with request_id and status=QUEUED. | Persist request_id under the operator's job-tracking key before returning 200 to the caller, so that a crash between the POST and webhook delivery does not orphan the request. Use either webhook delivery or GET /async_request/{request_id} polling as the result channel; pick one and document it. | Pass / Fail · AI Platform · critical |
| 03 | Operator calls POST /async_request/{request_id}/cancel on a QUEUED request. Status transitions to CANCELED. | Cancel is best-effort: requests still in QUEUED are stopped before any compute runs; requests already IN_PROGRESS may still complete and bill for elapsed compute. Poll until the request reaches the terminal CANCELED status, then read any partial cost attribution from the usage record. | Pass / Fail · AI Platform · medium |
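The three sample tests above can be sketched as a small client. This is a minimal illustration under stated assumptions, not Baseten's SDK: the `Transport` callable, `PlatformError`, the `store` mapping, the `model_input` body key, the `TERMINAL` status set, and the two accepted error-body shapes are all hypothetical stand-ins. The endpoint paths are the ones quoted in the table. Authentication and HTTP details are left to whatever `transport` wraps.

```python
import json
import time
from typing import Any, Callable, MutableMapping, Optional, Tuple

# Hypothetical transport: (method, url, json_body) -> (status_code, body_text).
Transport = Callable[[str, str, Optional[dict]], Tuple[int, str]]

# Assumed terminal statuses; the tests above only name QUEUED, IN_PROGRESS, CANCELED.
TERMINAL = {"CANCELED", "SUCCEEDED", "FAILED"}


class PlatformError(Exception):
    """Raised for non-2xx responses; exposes the error object's code and message."""

    def __init__(self, status: int, code: Any, message: Any) -> None:
        super().__init__(f"HTTP {status}: {code}: {message}")
        self.status, self.code, self.message = status, code, message


def parse_response(status: int, body: str) -> Any:
    """Return the parsed JSON body on 2xx, with no envelope unwrapping."""
    if 200 <= status < 300:
        return json.loads(body) if body else None
    try:
        payload = json.loads(body)
    except ValueError:
        payload = {}
    # Assumption: the error object is the body itself or nested under "error".
    err = payload.get("error", payload) if isinstance(payload, dict) else {}
    if not isinstance(err, dict):
        err = {}
    raise PlatformError(status, err.get("code"), err.get("message"))


def submit_async(transport: Transport, base_url: str, model_input: Any,
                 webhook_endpoint: str, job_key: str,
                 store: MutableMapping[str, str]) -> str:
    """POST to async_predict and persist request_id before returning."""
    status, body = transport(
        "POST",
        f"{base_url}/production/async_predict",
        # "model_input" is an assumed key name for the model input.
        {"model_input": model_input, "webhook_endpoint": webhook_endpoint},
    )
    ack = parse_response(status, body)
    # Persist first, so a crash after this point cannot orphan the request.
    store[job_key] = ack["request_id"]
    return ack["request_id"]


def cancel_and_wait(transport: Transport, base_url: str, request_id: str,
                    max_polls: int = 10, interval: float = 1.0,
                    sleep: Callable[[float], None] = time.sleep) -> str:
    """Request cancellation, then poll until a terminal status is observed."""
    status, body = transport(
        "POST", f"{base_url}/async_request/{request_id}/cancel", None)
    parse_response(status, body)
    for _ in range(max_polls):
        status, body = transport(
            "GET", f"{base_url}/async_request/{request_id}", None)
        state = parse_response(status, body)["status"]
        if state in TERMINAL:
            # Cancel is best-effort, so this may not be CANCELED.
            return state
        sleep(interval)
    raise TimeoutError(f"{request_id} did not reach a terminal status")
```

In a real deployment, `store` would be a durable job table rather than an in-memory dict, and the caller of `cancel_and_wait` would check whether the returned status is actually CANCELED before reading the usage record.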
How this eval is graded
Grade each response against expected.ideal_behavior and expected.rubric. To pass, the mean score across criteria must be at least 4.0, and no single criterion may score below 3.
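The pass rule can be sketched as a small function. This is one reading of the rule, assuming each criterion receives a single numeric score and the mean is taken across criteria. The function name and the criterion names in the usage note are hypothetical.

```python
from statistics import mean
from typing import Mapping


def eval_passes(scores: Mapping[str, float],
                mean_threshold: float = 4.0,
                floor: float = 3.0) -> bool:
    """Pass when the mean score meets the threshold and no criterion is below the floor."""
    if not scores:
        return False  # nothing was graded, so nothing can pass
    values = list(scores.values())
    return mean(values) >= mean_threshold and min(values) >= floor
```

For example, `eval_passes({"parsing": 5, "error_handling": 3})` passes (mean 4.0, minimum 3), while `eval_passes({"parsing": 5, "error_handling": 5, "persistence": 2})` fails because one criterion falls below 3 even though the mean is 4.0.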
Rubric criteria
- Baseten
- AI Platform
- Predict Sync And Async