Eval Library
For: Lovable · Code Assistant · AI Platform

Quality And Errors


AI App Builder — Lovable

Lovable evals — Quality & Errors (relift v3 InfraRed)

About Lovable

Lovable is an AI full-stack engineer that lets anyone build and deploy real web applications by chatting in natural language. Users describe what they want; Lovable writes the code, manages the database, and deploys — no technical background required.

Employees: ~30

Industry: AI App Builder

Headquarters: Stockholm, Sweden

Sample tests · showing 3 of 9

# · Input · Expected behavior · Check
01

Existing app uses React 18. Agent proposes a component library pinned to React 19 and tries to install.

Detect the peer-dependency conflict before install (from npm/yarn/pnpm output), surface it in chat, and choose between (a) finding a React 18-compatible version or (b) proposing the React 19 upgrade explicitly as a separate change. Do not silently pass --force or --legacy-peer-deps.

Pass / Fail · AI Platform · high
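One way to picture the pre-install check in test 01 is the minimal sketch below. It is an illustration only, not Lovable's implementation: `parseVersion`, `satisfiesCaret`, and `checkPeer` are hypothetical helpers, the library name is made up, and the range matcher handles only caret ranges (`^X.Y.Z`) joined by `||` for major versions of 1 or higher. A real agent would more likely rely on the package manager's own resolver output.

```typescript
// Hypothetical helpers for illustration; not part of any real package manager API.
type Version = [number, number, number];

function parseVersion(text: string): Version {
  const match = /^(\d+)(?:\.(\d+))?(?:\.(\d+))?$/.exec(text.trim());
  if (!match) throw new Error(`Unsupported version: ${text}`);
  return [Number(match[1]), Number(match[2] || "0"), Number(match[3] || "0")];
}

// Simplified matcher: supports only "^X.Y.Z" parts joined by "||", major >= 1.
function satisfiesCaret(installed: string, range: string): boolean {
  const have = parseVersion(installed);
  return range.split("||").some((part) => {
    const trimmed = part.trim();
    if (!trimmed.startsWith("^")) return false; // unsupported syntax counts as a mismatch
    const base = parseVersion(trimmed.slice(1));
    if (base[0] < 1 || have[0] !== base[0]) return false;
    // Same major: the installed version must be at or above the base version.
    if (have[1] !== base[1]) return have[1] > base[1];
    return have[2] >= base[2];
  });
}

type PeerCheck =
  | { ok: true }
  | { ok: false; message: string; options: string[] };

// Returns a conflict report to surface in chat instead of forcing the install.
function checkPeer(lib: string, peer: string, peerRange: string, installed: string): PeerCheck {
  if (satisfiesCaret(installed, peerRange)) return { ok: true };
  return {
    ok: false,
    message: `${lib} requires ${peer}@${peerRange}, but the app uses ${peer}@${installed}.`,
    options: [
      `Find a ${lib} release compatible with ${peer}@${installed}`,
      `Propose upgrading ${peer} as a separate change`,
    ],
  };
}
```

Calling `checkPeer("some-ui-lib", "react", "^19.0.0", "18.2.0")` yields a conflict report with the two options from the expected behavior; neither option involves a force flag.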
02

Generated dashboard fetches from Supabase and crashes the whole page on a fetch error.

Add a React error boundary around the route or component, render a friendly fallback with a retry action, and log the error to the documented Lovable error feedback channel. Do not show a blank page or a raw stack trace to the end user.

Pass / Fail · AI Platform · medium
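The behavior expected in test 02 can be sketched without any framework. The code below is not a React error boundary (which would be a class component using React's own lifecycle hooks); it is a plain TypeScript analogue of the same idea, under the assumption that rendering is a function that may throw. `renderWithBoundary`, the `View` type, and the `reportError` callback are hypothetical names; `reportError` stands in for whatever error feedback channel the platform documents.

```typescript
// Framework-free analogue of an error boundary; all names are hypothetical.
type View =
  | { kind: "content"; html: string }
  | { kind: "fallback"; message: string; retry: () => View };

function renderWithBoundary(
  render: () => string,
  reportError: (error: unknown) => void,
): View {
  try {
    return { kind: "content", html: render() };
  } catch (error) {
    // Raw error details go to the logging channel only, never to the end user.
    reportError(error);
    return {
      kind: "fallback",
      message: "Something went wrong loading this dashboard. Please try again.",
      // The retry action simply re-runs the same guarded render.
      retry: () => renderWithBoundary(render, reportError),
    };
  }
}
```

The point of the design is that a failure stays contained: the caller always receives something renderable, and the fallback carries its own retry action rather than leaving a blank page.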
03

User loads the live preview. A null-reference error fires on the dashboard route. The runtime error is captured and routed back into the agent loop.

Read the error, locate the failing component and line, propose a guarded fix (optional chaining plus a skeleton state, or a corrected loading sequence), apply it, and verify by reloading. Bound retries (e.g., max 3 auto-fix attempts) to avoid infinite loops that consume credits.

Pass / Fail · AI Platform · high
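Test 03 combines two ideas: a guarded read and a bounded fix loop. The sketch below shows both under stated assumptions. `Dashboard`, `readTotal`, and `autoFix` are hypothetical names, the callbacks stand in for the agent's real "apply fix" and "reload and check" steps, and the default of 3 attempts comes from the example in the expected behavior.

```typescript
// Hypothetical data shape; fields may be missing while the page is still loading.
type Dashboard = { metrics?: { total?: number } };

// Guarded read: returns null (so the UI can show a skeleton) instead of throwing.
function readTotal(dashboard: Dashboard | null | undefined): number | null {
  return dashboard?.metrics?.total ?? null;
}

type FixOutcome = { fixed: boolean; attempts: number };

// Bounded auto-fix loop: stops after maxAttempts so a stubborn error
// cannot keep the agent looping and spending credits.
function autoFix(
  applyFix: (attempt: number) => void,
  stillFailing: () => boolean,
  maxAttempts = 3,
): FixOutcome {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    applyFix(attempt);
    if (!stillFailing()) return { fixed: true, attempts: attempt };
  }
  return { fixed: false, attempts: maxAttempts };
}
```

When `autoFix` returns `fixed: false`, the natural next step is to hand the error back to the user with what was tried, rather than starting another round automatically.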

How this eval is graded

Grade against expected.ideal_behavior and expected.rubric. A test passes when the mean score across rubric criteria is >= 4.0 and no single criterion scores below 3.
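The pass rule can be written as a short function. This is a sketch of one reading of the rule, assuming each rubric criterion receives a numeric score and the mean is taken across criteria; the score scale is not stated on this page, and `passes` is a hypothetical name.

```typescript
// Pass rule sketch: mean across criteria >= 4.0 and no criterion below 3.
// Assumption: scores are numeric, one per rubric criterion.
function passes(scores: number[], meanThreshold = 4.0, floor = 3): boolean {
  if (scores.length === 0) return false; // nothing graded, nothing passed
  const mean = scores.reduce((sum, score) => sum + score, 0) / scores.length;
  return mean >= meanThreshold && scores.every((score) => score >= floor);
}
```

For example, scores of 5, 5, and 2 have a mean of 4.0 but still fail because one criterion falls below the floor of 3.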

Rubric criteria

  • Lovable
  • AI Platform
  • Quality and Errors

Recommended for

Lovable · Lovable customers
