Baseten evals — Predict (Sync + Async) (relift v3 InfraRed)

About Baseten

Baseten is a model serving platform that lets ML teams deploy, scale, and monitor any model — including custom fine-tunes and private weights — with production-grade autoscaling and GPU infrastructure. It supports both synchronous and asynchronous inference patterns.

Employees: ~100

Industry: Model Serving

Headquarters: San Francisco, CA

Website: baseten.co

Sample tests · showing 3 of 9

Columns: # | Input | Expected behavior | Check
01

Input: Client POSTs to https://model-<id>.api.baseten.co/production/predict with the model's documented input JSON. The deployment is warm. Response is the model's raw output JSON (not wrapped in {data:...}).

Expected behavior: Parse the response as the model's output schema directly; do not assume a Baseten-injected envelope. Status 200 means inference succeeded; non-2xx means the request failed at the platform layer (queueing, autoscaler, replica crash) and the response body carries an error object with code + message.

Check: Pass / Fail · AI Platform · high
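The handling described in test 01 can be sketched as a small parsing function. This is only an illustration: `parse_predict_response` and `PredictError` are hypothetical names, and the exact error shape (top-level `code`/`message`, or the same fields nested under an `error` key) is an assumption based on the description above, not a confirmed schema.

```python
import json
from typing import Any


class PredictError(Exception):
    """Platform-layer failure reported by a non-2xx response (hypothetical type)."""

    def __init__(self, status: int, code: str, message: str) -> None:
        super().__init__(f"HTTP {status} [{code}]: {message}")
        self.status = status
        self.code = code
        self.message = message


def parse_predict_response(status: int, body: str) -> Any:
    """Return the model output on 2xx; raise PredictError otherwise."""
    if 200 <= status < 300:
        # The body is the model's own output schema, so return it as-is
        # instead of reaching for a wrapper key such as "data".
        return json.loads(body)

    # Assumed error shape: {"code": ..., "message": ...}, possibly nested
    # under an "error" key. Fall back to the raw body if it is not JSON.
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        payload = {}
    err = payload.get("error", payload) if isinstance(payload, dict) else {}
    if not isinstance(err, dict):
        err = {"message": str(err)}
    raise PredictError(
        status,
        str(err.get("code", "unknown")),
        str(err.get("message", body)),
    )
```

Keeping the parser separate from the HTTP call makes the two branches (raw model output versus platform error) easy to exercise without a live deployment.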
02

Input: Client POSTs to /production/async_predict with body containing the model input and webhook_endpoint. Response is immediate with request_id and status=QUEUED.

Expected behavior: Persist request_id with the operator's job-tracking key BEFORE 200 is returned to the caller, so a crash between POST and webhook delivery does not orphan the request. Use either webhook delivery or GET /async_request/{request_id} polling as the result channel; pick one and document it.

Check: Pass / Fail · AI Platform · critical
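The ordering requirement in test 02 (persist first, acknowledge second) can be sketched as follows. Everything here is illustrative: `submit_async`, `init_store`, the `async_jobs` table, and the injected `post` transport are hypothetical, and the `model_input` body key is an assumed field name for "the model input". SQLite stands in for whatever job store the operator actually uses.

```python
import sqlite3
from typing import Callable

# Hypothetical transport: takes (path, json_body) and returns the decoded
# JSON response. A real one would POST to the model's Baseten URL.
PostFn = Callable[[str, dict], dict]


def init_store(conn: sqlite3.Connection) -> None:
    """Create the job-tracking table if it does not exist yet."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS async_jobs ("
        "job_key TEXT PRIMARY KEY, "
        "request_id TEXT NOT NULL, "
        "status TEXT NOT NULL)"
    )


def submit_async(
    conn: sqlite3.Connection,
    post: PostFn,
    job_key: str,
    model_input: dict,
    webhook_endpoint: str,
) -> str:
    """Queue an async request and durably record its request_id."""
    # "model_input" is an assumed key name for the model input payload.
    body = {"model_input": model_input, "webhook_endpoint": webhook_endpoint}
    resp = post("/production/async_predict", body)
    request_id = resp["request_id"]

    # Commit the job_key -> request_id mapping before returning, so the
    # caller is only acknowledged once the request can be found again.
    with conn:
        conn.execute(
            "INSERT INTO async_jobs (job_key, request_id, status) VALUES (?, ?, ?)",
            (job_key, request_id, resp.get("status", "QUEUED")),
        )
    return request_id
```

A caller would return its own 200 only after `submit_async` returns; if the insert fails, the exception propagates and no acknowledgement is sent.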
03

Input: Operator calls POST /async_request/{request_id}/cancel on a QUEUED request. Status transitions to CANCELED.

Expected behavior: Cancel is best-effort: requests still in QUEUED are stopped pre-compute; requests already IN_PROGRESS may still complete and bill for elapsed compute. Poll until CANCELED terminal status, then read any partial cost attribution from the usage record.

Check: Pass / Fail · AI Platform · medium
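The cancel-then-poll flow in test 03 can be sketched as a loop with a bounded number of polls. The helper below is hypothetical, with the two API calls injected as plain functions. Because an IN_PROGRESS request may still complete, the loop also stops on `SUCCEEDED` and `FAILED`; those two status names are assumptions, since only QUEUED, IN_PROGRESS, and CANCELED appear in the test description.

```python
import time
from typing import Callable

# CANCELED comes from the test description; SUCCEEDED and FAILED are
# assumed names for the other terminal states.
TERMINAL_STATUSES = {"CANCELED", "SUCCEEDED", "FAILED"}


def cancel_and_wait(
    request_id: str,
    cancel: Callable[[str], object],
    get_status: Callable[[str], str],
    interval_s: float = 1.0,
    max_polls: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Request cancellation, then poll until the request reaches a terminal status.

    `cancel` stands in for POST /async_request/{request_id}/cancel and
    `get_status` for GET /async_request/{request_id}.
    """
    cancel(request_id)
    for _ in range(max_polls):
        status = get_status(request_id)
        if status in TERMINAL_STATUSES:
            # May be SUCCEEDED rather than CANCELED: cancel is best-effort.
            return status
        sleep(interval_s)
    raise TimeoutError(f"request {request_id} not terminal after {max_polls} polls")
```

The caller should branch on the returned status rather than assume CANCELED, and only then look up cost attribution for the request.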

How this eval is graded

Grade each response against expected.ideal_behavior and expected.rubric. A test passes only when the mean score across rubric criteria is at least 4.0 and no individual criterion scores below 3.
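One reading of the pass rule, as a minimal sketch: the mean of the per-criterion scores must reach 4.0, and no single score may fall below 3. The function name and the criterion keys in the usage note are hypothetical, and the score scale is not stated in the source.

```python
from statistics import mean
from typing import Mapping


def passes_rubric(
    scores: Mapping[str, float],
    min_mean: float = 4.0,
    floor: float = 3.0,
) -> bool:
    """Apply the two-part threshold: mean >= min_mean and every score >= floor."""
    if not scores:
        raise ValueError("at least one criterion score is required")
    values = list(scores.values())
    return mean(values) >= min_mean and min(values) >= floor
```

For example, scores of 5, 4, and 3 pass (mean 4.0, lowest 3), while 5, 5, and 2 fail on the floor even though the mean is also 4.0.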

Rubric criteria

  • Baseten
  • AI Platform
  • Predict Sync And Async

Recommended for

Baseten customers
