
Baseten evals — Training & Fine-tuning (relift v3 InfraRed)

About Baseten

Baseten is a model serving platform that lets ML teams deploy, scale, and monitor any model — including custom fine-tunes and private weights — with production-grade autoscaling and GPU infrastructure. It supports both synchronous and asynchronous inference patterns.

Employees: ~100
Industry: Model Serving
Headquarters: San Francisco, CA
Website: baseten.co

Sample tests · showing 3 of 9

# · Input · Expected behavior · Check
01

Input: Operator POSTs to /v1/training/jobs with base_model, dataset_uri (s3://...), GPU class (H100), and hyperparameters. Response carries job_id.

Expected behavior: Persist job_id immediately with dataset hash + hyperparameter snapshot for reproducibility. Status transitions PENDING → RUNNING → SUCCEEDED|FAILED|CANCELED. On SUCCEEDED, fetch checkpoint URI from the job record and validate it deserializes before deploying.

Check: Pass / Fail · AI Platform · high
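The bookkeeping in test 01 can be sketched without touching any API. The following is a minimal illustration, assuming the operator already has the job_id from the response and a SHA-256 of the dataset; the record layout and the function names (`make_job_record`, `advance`, `snapshot_hash`) are hypothetical and not part of Baseten's API. The transition table encodes only the transitions stated above.

```python
import hashlib
import json

# Transitions exactly as listed in the expected behavior above.
# Assumption: a real service may allow more (for example canceling a
# pending job); this sketch stays with what the test states.
TERMINAL = {"SUCCEEDED", "FAILED", "CANCELED"}
ALLOWED = {"PENDING": {"RUNNING"}, "RUNNING": TERMINAL}


def snapshot_hash(hyperparameters: dict) -> str:
    """Hash a canonical JSON form so key order does not change the digest."""
    canonical = json.dumps(hyperparameters, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def make_job_record(job_id: str, dataset_sha256: str, hyperparameters: dict) -> dict:
    """Build the record to persist as soon as job_id comes back."""
    return {
        "job_id": job_id,
        "dataset_sha256": dataset_sha256,
        "hyperparameters": dict(hyperparameters),  # copy, so later edits do not leak in
        "hyperparameters_sha256": snapshot_hash(hyperparameters),
        "status": "PENDING",
    }


def advance(record: dict, new_status: str) -> dict:
    """Return a copy of the record with the new status, rejecting illegal moves."""
    current = record["status"]
    if new_status not in ALLOWED.get(current, set()):
        raise ValueError(f"illegal transition {current} -> {new_status}")
    return {**record, "status": new_status}
```

Rejecting unexpected transitions locally makes it easier to notice a stale or out-of-order status read before acting on a checkpoint.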
02

Input: Dataset is 12 GB in operator's S3 bucket. Operator passes dataset_uri=s3://my-bucket/data.jsonl in the job spec.

Expected behavior: Baseten reads dataset_uri using the workspace's configured S3 credentials. Verify that the workspace has IAM access to the bucket before submitting the job; otherwise the job will start, fail at dataset load, and waste queue time. Use workspace-attached storage when cross-account IAM is a source of friction.

Check: Pass / Fail · AI Platform · critical
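One way to run the pre-submission check in test 02 is a metadata-only read of the dataset object. The sketch below assumes a client object exposing `head_object(Bucket=..., Key=...)`, which is the shape of boto3's S3 client; the helper names `parse_s3_uri` and `dataset_is_readable` are hypothetical. The check is only meaningful if the client is built from the same credentials the workspace uses, which this sketch cannot verify.

```python
from urllib.parse import urlparse


def parse_s3_uri(uri: str) -> tuple[str, str]:
    """Split s3://bucket/key into (bucket, key), rejecting malformed URIs."""
    parsed = urlparse(uri)
    key = parsed.path.lstrip("/")
    if parsed.scheme != "s3" or not parsed.netloc or not key:
        raise ValueError(f"not a valid s3://bucket/key URI: {uri!r}")
    return parsed.netloc, key


def dataset_is_readable(uri: str, s3_client) -> bool:
    """Return True if a HEAD on the object succeeds with the given client.

    s3_client is assumed to behave like boto3.client("s3"). Any exception
    (access denied, missing object, bad credentials) is treated as "not
    readable"; a production check would inspect the error code instead.
    """
    bucket, key = parse_s3_uri(uri)
    try:
        s3_client.head_object(Bucket=bucket, Key=key)
    except Exception:
        return False
    return True
```

A HEAD request transfers no object data, so the check stays cheap even for a 12 GB file.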
03

Input: Fine-tuning a 13B model with LoRA. Operator picks H100 80GB for 'speed.'

Expected behavior: For LoRA fine-tuning of a 13B model, A100 80GB is typically sufficient and cheaper than H100. Pick H100 only for full-parameter fine-tunes or when it wins on throughput per dollar. Compare $/epoch across SKUs, not $/hour, before committing. [REQUIRES-VERIFICATION] for current GPU-second prices.

Check: Pass / Fail · AI Platform · medium
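The $/epoch comparison in test 03 is simple arithmetic: hourly price multiplied by measured hours per epoch on that SKU. The sketch below uses placeholder prices and timings chosen purely for illustration; they are not Baseten prices or benchmark results, and the function names are hypothetical.

```python
def cost_per_epoch(price_per_hour: float, hours_per_epoch: float) -> float:
    """Dollars per epoch: a faster GPU only wins if it finishes sooner by
    more than its price premium."""
    return price_per_hour * hours_per_epoch


def cheapest_sku(skus: dict) -> str:
    """Pick the SKU name with the lowest cost per epoch.

    skus maps a name to (price_per_hour, hours_per_epoch).
    """
    return min(skus, key=lambda name: cost_per_epoch(*skus[name]))


# Placeholder numbers for illustration only. Replace with current prices
# and hours per epoch measured on a short trial run of the actual job.
example_skus = {
    "A100-80GB": (4.00, 3.0),
    "H100-80GB": (6.50, 2.0),
}
```

With these placeholder inputs the A100 comes out cheaper per epoch despite being slower; a shorter H100 epoch time would flip the result, which is why the measurement matters more than the hourly rate.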

How this eval is graded

Grade each response against expected.ideal_behavior and expected.rubric. A test passes when the mean score across rubric criteria is >= 4.0 and no individual criterion scores below 3.
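The pass rule can be written as a small predicate. This is a sketch under the assumption that the mean is taken across the rubric criteria of a single test; the function name and the criterion keys in the usage note are hypothetical.

```python
from statistics import mean


def passes(scores: dict, mean_threshold: float = 4.0, floor: float = 3) -> bool:
    """Apply the grading rule: mean >= 4.0 and no criterion below 3.

    scores maps a criterion name to its numeric score.
    """
    if not scores:
        raise ValueError("at least one criterion score is required")
    values = list(scores.values())
    return mean(values) >= mean_threshold and min(values) >= floor
```

For example, scores of 5, 5, and 2 average 4.0 but still fail because one criterion falls below the floor of 3.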

Rubric criteria

  • Baseten
  • AI Platform
  • Training and Fine-tuning

Recommended for

Baseten customers
