For Bolt · AI Platform

Token Credit Economy

Bolt.new · Bolt

AI App Builder — Bolt (StackBlitz)

Bolt evals — Token / Credit Economy (relift v3 InfraRed)

About Bolt

Bolt is StackBlitz's AI app builder at bolt.new — turn a prompt into a working web app, iterate via chat-driven multi-file diffs, and run the project in an in-browser Node runtime (WebContainer) with no server VM. Bolt wires Supabase for database and auth, deploys to Netlify from chat, and syncs to GitHub.

Employees

~50

Industry

AI App Builder

Headquarters

San Francisco, CA

Website

bolt.new

Sample tests · showing 3 of 9

# · Input · Expected behavior · Check
01

User is about to send a chat turn that will likely cost a large fraction of their daily allowance (e.g., scaffold a 60-file app).

Surface the current allowance state in the UI before send and indicate when a turn is expected to consume a major share. Do not silently exhaust the allowance mid-turn without warning. Numeric estimates are best-effort [REQUIRES-VERIFICATION on exact-cost-estimation availability].

Pass / Fail · AI Platform · high
02

A multi-file edit turn consumes the daily allowance partway through generation. Only 6 of the 14 file edits were emitted.

Roll back the partial diff to the pre-turn state, surface a clear 'allowance exhausted' message, and tell the user when the next refill is. Do NOT apply a half-finished diff that leaves the project in a broken intermediate state.

Pass / Fail · AI Platform · critical
03

Bolt's edit produces invalid code that does not build; the user rolls back via the chat UI.

Document refund / credit-back behavior for failed turns: model errors that produce a non-buildable diff, rolled-back diffs, and explicit Bolt-side failures should follow a published policy (refunded, partially refunded, or non-refundable, with reasoning). [REQUIRES-VERIFICATION on current refund po…

Pass / Fail · AI Platform · medium
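The behavior expected in tests 01 and 02 can be illustrated with a minimal sketch. This is not Bolt's implementation: the function names, the token unit, the 50% "major share" threshold, and the per-edit cost field are all hypothetical, chosen only to show a pre-send warning check and an all-or-nothing application of a multi-file diff. Allowance accounting after a failed turn (the refund question in test 03) is deliberately not modeled.

```python
class AllowanceExhausted(Exception):
    """Raised when the budget runs out before every edit in a turn is emitted."""


def should_warn(remaining, estimated_cost, major_share=0.5):
    """Test 01 (sketch): warn before send if the estimate is a major share of what is left.

    `major_share` is an assumed threshold, not a documented Bolt value.
    """
    if remaining <= 0:
        return True
    return estimated_cost / remaining >= major_share


def apply_turn(files, edits, budget):
    """Test 02 (sketch): apply every edit in the turn, or none of them.

    `files` maps path -> content. `edits` is a list of
    (path, new_content, cost) tuples; `cost` is a hypothetical per-edit price.
    Edits are staged on a copy, so `files` is never left half-updated.
    """
    staged = dict(files)  # work on a copy of the pre-turn state
    for path, new_content, cost in edits:
        if cost > budget:
            # Discard the staged copy; the caller still holds the pre-turn state.
            raise AllowanceExhausted(
                f"allowance exhausted after {len(staged) - len(files)} new files staged"
            )
        budget -= cost
        staged[path] = new_content
    return staged  # commit point: only reached when all edits fit
```

A caller would catch `AllowanceExhausted`, keep the original `files`, and show the user the exhaustion message and the next refill time.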

How this eval is graded

Grade each response against expected.ideal_behavior and expected.rubric. A pass requires a mean score of at least 4.0 across the rubric criteria, with no individual criterion scoring below 3.
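The pass rule above can be written out as a short check. This is a sketch under assumptions: the source does not state the scoring scale (a 1 to 5 scale is assumed here), and the function name and criterion names are hypothetical.

```python
from statistics import mean


def eval_passes(scores, mean_threshold=4.0, floor=3.0):
    """Return True if the mean score meets the threshold and no criterion is below the floor.

    `scores` maps a criterion name to its score (assumed 1-5 scale).
    """
    if not scores:
        return False  # nothing graded, so nothing can pass
    values = list(scores.values())
    return mean(values) >= mean_threshold and min(values) >= floor
```

For example, scores of 5, 5, and 2 have a mean of 4.0 but still fail, because one criterion falls below 3.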

Rubric criteria

  • Bolt
  • AI Platform
  • Token Credit Economy

Recommended for

Bolt.new · Bolt customers
