Truss And Model Packaging

AI Model Serving — Baseten

Baseten evals — Truss & Model Packaging (relift v3 InfraRed)

About Baseten

Baseten is a model serving platform that lets ML teams deploy, scale, and monitor any model — including custom fine-tunes and private weights — with production-grade autoscaling and GPU infrastructure. It supports both synchronous and asynchronous inference patterns.

Employees: ~100

Industry: Model Serving

Headquarters: San Francisco, CA

Website: baseten.co

Sample tests · showing 3 of 9

# | Input | Expected behavior | Check
01

Operator scaffolds a new Truss with `truss init` and edits config.yaml to declare the model. They omit model_metadata.example_model_input.

config.yaml must declare model_metadata (including example_model_input for the in-product playground), python_version, requirements (pinned), and resources.accelerator. Missing example_model_input causes the playground to render without a usable form. Validate config.yaml with `truss config validat…

Pass / Fail · AI Platform · high
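
The fields that test 01 expects in config.yaml can be checked with a small pre-flight script. The sketch below is a hypothetical helper, not Truss's own validator: it assumes the file has already been parsed into a Python dict, uses only the key names quoted above, and treats a requirement as "pinned" when it carries an exact `==` version. The sample values are illustrative placeholders.

```python
# Hypothetical pre-flight check; this is NOT Truss's own validator.
REQUIRED_PATHS = [
    ("model_metadata", "example_model_input"),
    ("python_version",),
    ("requirements",),
    ("resources", "accelerator"),
]


def lookup(config, path):
    """Walk nested dict keys; return None if any key is missing."""
    node = config
    for key in path:
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


def check_config(config):
    """Return a list of human-readable problems (empty list means OK)."""
    problems = []
    for path in REQUIRED_PATHS:
        if lookup(config, path) is None:
            problems.append("missing: " + ".".join(path))
    # Assumption: "pinned" means an exact '==' version specifier.
    for req in config.get("requirements") or []:
        if "==" not in req:
            problems.append("unpinned requirement: " + req)
    return problems


# A config as it might look after parsing config.yaml into a dict
# (values are illustrative placeholders).
config = {
    "python_version": "py311",
    "requirements": ["torch==2.3.0", "transformers"],
    "resources": {"accelerator": "A100"},
    "model_metadata": {"tags": ["llm"]},
}

for problem in check_config(config):
    print(problem)
```

With this sample, the script reports the missing `model_metadata.example_model_input` and the unpinned `transformers` requirement, which are the two failure modes the test is looking for.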
02

Operator deploys a 13B-parameter LLM with resources.accelerator: A10G (24GB).

Choose an accelerator whose VRAM fits the model weights plus KV cache headroom. A 13B-parameter model in FP16 (2 bytes per parameter) needs ~26GB for weights alone, which already exceeds the A10G's 24GB. Select A100 (40/80GB) or H100. Verify VRAM headroom at max_seq_len, not just at steady state. [REQUIRES-VERIFICATION] for current accelerator SKUs and per-class VRAM.

Pass / Fail · AI Platform · critical
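
The sizing arithmetic behind test 02 can be written out as a rough estimate. This is a back-of-the-envelope sketch, not a Baseten tool: the VRAM table is an assumption (the test itself flags SKUs as needing verification), the 40-layer / 5120-hidden shape is an assumed Llama-style 13B architecture, and the KV cache formula assumes standard multi-head attention. Activations and runtime overhead are ignored, so real requirements will be higher.

```python
# Assumed per-device VRAM in GB; verify against current accelerator SKUs.
VRAM_GB = {"A10G": 24, "A100-40GB": 40, "A100-80GB": 80, "H100": 80}


def weights_gb(n_params, bytes_per_param=2):
    """Weight memory in GB (FP16 = 2 bytes per parameter)."""
    return n_params * bytes_per_param / 1e9


def kv_cache_gb(n_layers, hidden_size, max_seq_len, batch_size=1, bytes_per_value=2):
    """KV cache in GB: one key and one value vector per layer per token.

    Assumes standard multi-head attention (no grouped-query sharing).
    """
    per_token = 2 * n_layers * hidden_size * bytes_per_value
    return per_token * max_seq_len * batch_size / 1e9


def fits(accelerator, required_gb):
    """True if the estimate fits in the accelerator's assumed VRAM."""
    return required_gb <= VRAM_GB[accelerator]


# Assumed Llama-style 13B shape: 40 layers, hidden size 5120.
required = weights_gb(13e9) + kv_cache_gb(40, 5120, max_seq_len=4096)
for name in VRAM_GB:
    print(name, "fits" if fits(name, required) else "too small")
```

Sizing against `max_seq_len` rather than a typical prompt length is the point of the check: the KV cache term grows linearly with sequence length and batch size, so a model that fits at steady state can still run out of memory on long requests.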
03

Operator pushes a Truss with model_metadata: {tags: ['llm', 'production']}. Downstream tooling filters by tag.

Treat tags as searchable metadata only; do not encode runtime semantics in tag values. Use deployment environments for prod/dev separation, not a 'production' tag. Tags are useful for ownership, modality, and license.

Pass / Fail · AI Platform · medium
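
The rule in test 03 lends itself to a simple lint over the tag list. The sketch below is hypothetical: the set of environment-like words is an assumption chosen for illustration, not a list published by Baseten, and the function only flags tags rather than rejecting a push.

```python
# Hypothetical lint; the word list is an assumption, not a Baseten list.
ENVIRONMENT_WORDS = {"production", "prod", "staging", "dev", "development"}


def runtime_tags(tags):
    """Return the tags that look like they encode an environment."""
    return [t for t in tags if t.lower() in ENVIRONMENT_WORDS]


flagged = runtime_tags(["llm", "production"])
print(flagged)
```

Tags such as an owning team, a modality, or a license identifier pass through untouched, while anything that reads like an environment name is surfaced so it can be moved into the environment configuration instead.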

How this eval is graded

Grade against expected.ideal_behavior and expected.rubric. Per-criterion pass requires mean >= 4.0 and no criterion below 3.
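
One reading of this pass rule can be sketched as follows. The interpretation is an assumption: the mean is taken across all rubric criteria, and every individual criterion must also clear the floor. The criterion names in the example are hypothetical.

```python
from statistics import mean


def eval_passes(scores, min_mean=4.0, floor=3):
    """scores maps criterion name -> numeric score.

    Passes only if the mean is at least min_mean and no single
    criterion scores below floor.
    """
    values = list(scores.values())
    return mean(values) >= min_mean and min(values) >= floor


# Hypothetical criterion names and scores.
print(eval_passes({"accuracy": 5, "completeness": 4, "safety": 3}))
print(eval_passes({"accuracy": 5, "completeness": 5, "safety": 2}))
```

The second case shows why the floor matters: its mean is still 4.0, but a single criterion at 2 fails the eval.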

Rubric criteria

  • Baseten
  • AI Platform
  • Truss And Model Packaging

Recommended for

Baseten, Baseten customers
