Bolt evals — Iterative Editing & Diff (relift v3 InfraRed)

About Bolt

Bolt is StackBlitz's AI app builder at bolt.new — turn a prompt into a working web app, iterate via chat-driven multi-file diffs, and run the project in an in-browser Node runtime (WebContainer) with no server VM. Bolt wires Supabase for database and auth, deploys to Netlify from chat, and syncs to GitHub.

Employees: ~50
Industry: AI App Builder
Headquarters: San Francisco, CA
Website: bolt.new

Sample tests · showing 3 of 9

01
Input: User asks 'rename Button to PrimaryButton everywhere.' Bolt proposes a diff touching 14 files.
Expected behavior: Show every changed file in the diff panel before applying, with per-file expand/collapse. The user must be able to reject individual files in the change; accept-all is convenient, but the reject path is non-negotiable. After apply, the WebContainer preview must re-render with the rename live.
Check: Pass / Fail · AI Platform · critical

02
Input: Rename 'OrderSummary' to 'CheckoutSummary' across the project. The component is imported in 6 files and referenced in 1 test.
Expected behavior: Update the component definition file, every import, and the test in one diff. Verify that no stale references remain (a grep for the old name should return nothing after apply). Partial renames produce TypeScript errors at build time.
Check: Pass / Fail · AI Platform · high

03
Input: Bolt proposes a 14-file diff. User wants to accept 12 files and reject the changes to the other two (e.g., a custom-tweaked tailwind.config.js and a hand-edited README.md).
Expected behavior: Apply the 12 accepted files and skip the 2 rejected ones, leaving their on-disk contents untouched. Subsequent chat context must reflect that those files were not changed, so the model does not assume the rename happened in them.
Check: Pass / Fail · AI Platform · critical
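
The partial-accept behavior in test 03 and the stale-reference check in test 02 can be illustrated with a small sketch. This is not Bolt's implementation: the `Project` and `FileChange` shapes and the `applyAccepted` and `staleReferences` helpers are hypothetical names chosen for illustration, and the project is modeled as a plain in-memory map of path to contents rather than a real file system.

```typescript
// Hypothetical shapes for illustration only; not Bolt's data model.
type Project = Record<string, string>; // file path -> file contents
type FileChange = { path: string; newContent: string };

interface ApplyResult {
  project: Project;
  applied: string[];
  skipped: string[];
}

// Apply only the changes the user did not reject.
function applyAccepted(
  project: Project,
  changes: FileChange[],
  rejectedPaths: string[],
): ApplyResult {
  const next: Project = { ...project }; // copy, so the input is not mutated
  const applied: string[] = [];
  const skipped: string[] = [];
  for (const change of changes) {
    if (rejectedPaths.indexOf(change.path) !== -1) {
      skipped.push(change.path); // contents stay exactly as they were
    } else {
      next[change.path] = change.newContent;
      applied.push(change.path);
    }
  }
  return { project: next, applied, skipped };
}

// List files that still mention the old name as a whole word.
// Assumes oldName is a plain identifier with no regex metacharacters.
function staleReferences(project: Project, oldName: string): string[] {
  const pattern = new RegExp(`\\b${oldName}\\b`);
  return Object.keys(project)
    .filter((path) => pattern.test(project[path]))
    .sort();
}

// A three-file stand-in for the rename scenario, with README.md rejected.
const before: Project = {
  "src/ui.tsx": "export function Button() {}",
  "src/App.tsx": "import { Button } from './ui';",
  "README.md": "Use the Button component.",
};
const changes: FileChange[] = [
  { path: "src/ui.tsx", newContent: "export function PrimaryButton() {}" },
  { path: "src/App.tsx", newContent: "import { PrimaryButton } from './ui';" },
  { path: "README.md", newContent: "Use the PrimaryButton component." },
];
const result = applyAccepted(before, changes, ["README.md"]);
const stale = staleReferences(result.project, "Button");
```

Here `result.skipped` and `stale` both contain only `README.md`. In this sketch, the skipped list is the piece of information a later chat turn would need in order to avoid assuming the rename landed in the rejected file.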

How this eval is graded

Grade against expected.ideal_behavior and expected.rubric. A test passes when the mean score across rubric criteria is at least 4.0 and no individual criterion scores below 3.
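
The pass rule above can be written as a short function. This is a minimal sketch under two assumptions that the page does not spell out: each rubric criterion receives a single numeric score, and the mean is taken across criteria. The `CriterionScores` type, the `passes` function, and the criterion names in the example are hypothetical.

```typescript
// Hypothetical representation: one numeric score per rubric criterion.
type CriterionScores = Record<string, number>;

// Pass when the mean is at least minMean and no score falls below minEach.
function passes(
  scores: CriterionScores,
  minMean: number = 4.0,
  minEach: number = 3,
): boolean {
  const values = Object.keys(scores).map((name) => scores[name]);
  if (values.length === 0) return false; // nothing graded, nothing to pass
  const mean = values.reduce((sum, v) => sum + v, 0) / values.length;
  return mean >= minMean && values.every((v) => v >= minEach);
}

// Mean is exactly 4.0 and the lowest score is 3, so this passes.
const borderline = passes({ diffPanel: 5, rejectPath: 4, preview: 3 });
// Mean is also 4.0, but one criterion scores 2, so this fails.
const lowOutlier = passes({ diffPanel: 5, rejectPath: 5, preview: 2 });
```

The second condition matters because a strong average can hide a single failing criterion, as the `lowOutlier` case shows.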

Rubric criteria

  • Bolt
  • AI Platform
  • Iterative Editing And Diff

Recommended for

Bolt.new · Bolt customers
