For Modal · AI Platform · Code Assistant

Function Runtime Cold Start

Modal evals — Function Runtime & Cold Start (relift v3)

About Modal

Modal is a serverless cloud platform for running GPU workloads, ML inference, data pipelines, and web apps — all from Python, with no infrastructure to manage. Developers deploy functions to Modal with a single decorator and pay only for what they run.

Employees: ~50
Industry: Serverless AI Infrastructure
Headquarters: New York, NY
Website: modal.com

Sample tests · showing 3 of 11

#01
Input: Training job uses @app.function(gpu='A10G', memory=32768, timeout=3600) on Image.debian_slim().pip_install('torch'). Logs show CUDA OOM at step 400; model needs ~40GB VRAM. Docs list gpu='A100-80GB' for large fine-tunes.
Expected behavior: Agent updates decorator to gpu='A100-80GB' (or documents trade-off with gradient checkpointing), redeploys via modal deploy, and records GPU choice rationale tied to VRAM estimate.
Check: Pass / Fail · Tool use · high

#02
Input: Latency SLO p95<800ms on @app.function(min_containers=0, scaledown_window=60) serving Llama via .remote(). Dashboard shows 4–6s cold starts when scaled to zero. Finance caps idle spend at $200/mo.
Expected behavior: Agent sets min_containers=1 (or buffer_containers=1) with cost note, keeps scaledown_window reasonable, and validates p95 after modal deploy.
Check: Pass / Fail · Tool use · medium

#03
Input: @app.function(retries=3, timeout=120) calls external billing API via httpx without idempotency keys. Modal retries whole invocation on TimeoutError; finance sees triple charges.
Expected behavior: Agent adds idempotency keys or moves billing to idempotent modal.Dict gate, sets retries only for safe steps, or wraps external call with dedupe keyed by input hash.
Check: Pass / Fail · Safety · high
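
As an illustration of the "dedupe keyed by input hash" option in test 03, here is a minimal sketch in plain Python. It is not Modal's API: ChargeGate, idempotency_key, and fake_charge are hypothetical names, and the in-memory dict only stands in for a shared store such as modal.Dict. The idea is that a retried invocation with the same payload derives the same key, so the charge runs at most once per key.

```python
import hashlib
import json


def idempotency_key(payload: dict) -> str:
    """Derive a stable key from the payload: same input, same key."""
    # Sorting keys makes the hash independent of dict ordering.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class ChargeGate:
    """Dedupe gate around a billing call (hypothetical helper).

    `store` is any dict-like object. A plain dict is used here as a
    stand-in for a shared store such as modal.Dict (assumption).
    """

    def __init__(self, charge_fn, store=None):
        self._charge_fn = charge_fn
        self._store = {} if store is None else store

    def charge(self, payload: dict):
        key = idempotency_key(payload)
        if key in self._store:
            # A retry of an invocation that already recorded its result.
            return self._store[key]
        # The key is also handed to the billing call so the remote API
        # can deduplicate on its side if it supports idempotency keys.
        result = self._charge_fn(payload, key)
        self._store[key] = result
        return result


# Demo with a fake billing function that records every real charge.
calls = []


def fake_charge(payload, key):
    calls.append(key)
    return {"status": "charged", "key": key}


gate = ChargeGate(fake_charge)
first = gate.charge({"customer": "c_1", "amount_cents": 500})
# Simulated retry: same payload, different key order.
second = gate.charge({"amount_cents": 500, "customer": "c_1"})
```

The local gate alone does not cover a timeout that fires after the remote charge succeeds but before the result is stored, which is why the sketch also passes the key to the billing call. In that case only server-side deduplication on the key prevents a second charge.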

Rubric criteria

  • Modal
  • Serverless GPU
  • Function Runtime Cold Start

Recommended for

Modal customers
