Eval Library
For Bolt · AI Platform

Prompt To App Generation

Bolt.new · Bolt

AI App Builder — Bolt (StackBlitz)

Bolt evals — Prompt-to-App Generation (relift v3 InfraRed)

About Bolt

Bolt is StackBlitz's AI app builder at bolt.new. It turns a prompt into a working web app, iterates via chat-driven multi-file diffs, and runs the project in an in-browser Node runtime (WebContainer) with no server VM. Bolt wires up Supabase for database and auth, deploys to Netlify from chat, and syncs to GitHub.

Employees: ~50

Industry: AI App Builder

Headquarters: San Francisco, CA

Website: bolt.new

Sample tests · showing 3 of 9

01

Input: User opens bolt.new and types: 'Build a marketing site with a blog and SSR; I want it deployed to Netlify.' Bolt must pick a starter framework that fits SSR + Netlify-friendly output.

Expected behavior: Pick a framework whose default build output is Netlify-deployable without extra config (e.g., Next.js, Astro, SvelteKit, Remix). State the choice in chat with one sentence of rationale before scaffolding so the user can correct on turn 1 without burning more tokens. Do not pick a CSR-only Vite + Re…

Check: Pass / Fail · AI Platform · high

02

Input: First-turn scaffold for a Next.js app. The chat declares it will create app/, components/, lib/, package.json, next.config.js, but the diff only contains app/ and package.json.

Expected behavior: Every file the chat declares it will create must appear in the applied diff. The WebContainer preview must boot; a missing next.config.js, or a missing tailwind.config.js when the prompt asked for Tailwind, is a scaffold failure. If a file is intentionally deferred, say so explicitly in chat.

Check: Pass / Fail · AI Platform · critical

03

Input: User asks for a 'Shopify-style storefront with cart and checkout.' Bolt can either start from a known StackBlitz Next.js commerce template or scaffold from scratch.

Expected behavior: Prefer a known-good StackBlitz/WebContainer-compatible starter when one exists; state the choice and link/name the template. Scaffolding from scratch is more flexible but burns more tokens and risks framework-specific bugs the starter has already fixed.

Check: Pass / Fail · AI Platform · medium
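The completeness rule in test 02 lends itself to a mechanical check. The following is a minimal sketch, not Bolt's actual grader: the function name missing_declared_paths, the example file names under app/, and the convention that a trailing slash marks a directory are all assumptions made for illustration. It compares the paths the chat declared against the paths present in the applied diff.

```python
from typing import Iterable, List


def missing_declared_paths(declared: Iterable[str], diff_paths: Iterable[str]) -> List[str]:
    """Return declared paths that have no counterpart in the applied diff.

    Hypothetical helper. Assumption: a declared path ending in "/" is a
    directory and is satisfied by any diff path underneath it; any other
    declared path must match a diff path exactly.
    """
    applied = list(diff_paths)
    missing = []
    for path in declared:
        if path.endswith("/"):
            # A directory counts as created if at least one file lives in it.
            found = any(p.startswith(path) for p in applied)
        else:
            found = path in applied
        if not found:
            missing.append(path)
    return missing


# Scenario from test 02: five paths declared, but the diff only touches
# app/ and package.json (the file names inside app/ are made up here).
declared = ["app/", "components/", "lib/", "package.json", "next.config.js"]
diff_paths = ["app/page.tsx", "app/layout.tsx", "package.json"]
print(missing_declared_paths(declared, diff_paths))
# prints ['components/', 'lib/', 'next.config.js']
```

A non-empty result would correspond to a scaffold failure unless the chat explicitly stated that those files were deferred.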

How this eval is graded

Grade against expected.ideal_behavior and expected.rubric. To pass, the mean score across rubric criteria must be >= 4.0 and no criterion may score below 3.
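As a rough sketch of the pass rule above, assuming each rubric criterion receives a numeric score and the mean is taken across criteria (the source states neither the scoring scale nor how scores are stored), the check might look like this. The function name eval_passes and the example criterion names are hypothetical.

```python
from statistics import mean
from typing import Mapping


def eval_passes(scores: Mapping[str, float], min_mean: float = 4.0, floor: float = 3.0) -> bool:
    """Apply the two-part threshold: mean >= min_mean and no score below floor.

    Hypothetical helper; `scores` maps a criterion name to its numeric score.
    """
    if not scores:
        raise ValueError("at least one criterion score is required")
    values = list(scores.values())
    return mean(values) >= min_mean and min(values) >= floor


# Illustrative scores only; the criterion names are invented for this example.
strong = {"framework_fit": 5, "rationale_stated": 4, "netlify_ready": 4}
one_low = {"framework_fit": 5, "rationale_stated": 5, "netlify_ready": 2}
low_mean = {"framework_fit": 4, "rationale_stated": 4, "netlify_ready": 3}

print(eval_passes(strong))    # True: mean is about 4.33, lowest score is 4
print(eval_passes(one_low))   # False: mean is 4.0, but one score is below 3
print(eval_passes(low_mean))  # False: no score is below 3, but mean is about 3.67
```

The second case shows why the floor matters: a single weak criterion fails the run even when the mean meets the threshold.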

Rubric criteria

  • Bolt
  • AI Platform
  • Prompt To App Generation

Recommended for

Bolt.new · Bolt customers
