Eval Library
For Lovable · Code Assistant · AI Platform

Iterative Editing

Lovable · AI App Builder

Lovable evals — Iterative Editing (relift v3 InfraRed)

About Lovable

Lovable is an AI full-stack engineer that lets anyone build and deploy real web applications by chatting in natural language. Users describe what they want; Lovable writes the code, manages the database, and deploys — no technical background required.

Employees

~30

Industry

AI App Builder

Headquarters

Stockholm, Sweden

Sample tests · showing 3 of 9

# | Input | Expected behavior | Check
01

User asks 'change the CTA button color to dark green on the landing page.' Only the landing-page CTA should change.

Apply the diff to the single CTA component on the landing route. Do not also edit unrelated components, restyle the navbar, or refactor adjacent files. Preview shows only the requested change; diff is small and reviewable.

Pass / Fail · AI Platform · high
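
The scope requirement in test 01 can be approximated mechanically by comparing the files an edit touched against the files the request should touch. The following is a minimal sketch, not the grader this eval uses; `check_edit_scope`, `ScopeResult`, and the file paths are hypothetical names chosen for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class ScopeResult:
    passed: bool
    out_of_scope: list[str]


def check_edit_scope(changed_files: list[str], allowed_files: set[str]) -> ScopeResult:
    """Pass only if at least one file changed and every changed file is allowed."""
    out_of_scope = sorted(f for f in changed_files if f not in allowed_files)
    # An empty change set fails too: the requested edit was never applied.
    passed = bool(changed_files) and not out_of_scope
    return ScopeResult(passed=passed, out_of_scope=out_of_scope)


# Hypothetical paths: the landing-page CTA is the only file expected to change.
allowed = {"src/components/LandingCta.tsx"}
result = check_edit_scope(
    ["src/components/LandingCta.tsx", "src/components/Navbar.tsx"], allowed
)
print(result.passed, result.out_of_scope)
```

A file-level check like this only catches edits that spill into other files. Judging whether the diff inside the allowed file is small and reviewable would still need a line-count limit or a human or model reviewer.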
02

User asks to rename the 'Pricing' route to 'Plans.' This requires editing the route file, the nav link, the sitemap, and any internal Link references.

Apply all related edits as one atomic commit so the app stays consistent (no broken links between files). Run a type-check / build inside the same turn to catch missed references. Surface any reference the agent intentionally left (e.g., external marketing link).

Pass / Fail · AI Platform · critical
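
One way to catch the missed references that test 02 describes is to scan the post-edit files for the old route path. The sketch below assumes the project is available as a mapping from file name to file content and that the old route is the literal string "/pricing"; `find_stale_references` and the sample files are hypothetical. It complements a type-check or build rather than replacing one.

```python
from __future__ import annotations

import re


def find_stale_references(
    files: dict[str, str],
    old_path: str,
    allowed: frozenset[str] = frozenset(),
) -> list[tuple[str, int, str]]:
    """Return (file, line number, line) for each line still containing old_path.

    Files named in `allowed` are skipped, for references that were left in
    place on purpose (for example an external marketing link).
    """
    # Do not match longer paths that merely start with old_path, e.g. "/pricing-faq".
    pattern = re.compile(re.escape(old_path) + r"(?![\w-])")
    hits = []
    for name in sorted(files):
        if name in allowed:
            continue
        for lineno, line in enumerate(files[name].splitlines(), start=1):
            if pattern.search(line):
                hits.append((name, lineno, line.strip()))
    return hits


# Hypothetical project state after the rename: the nav link was missed.
project = {
    "src/App.tsx": '<Route path="/plans" element={<Plans />} />',
    "src/Nav.tsx": '<Link to="/pricing">Pricing</Link>',
    "public/sitemap.xml": "<loc>https://example.com/plans</loc>",
}
for name, lineno, line in find_stale_references(project, "/pricing"):
    print(f"{name}:{lineno}: {line}")
```

Any hit outside the allowed set suggests the rename was not applied atomically. Hits inside it are the references the agent is expected to surface to the user.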
03

Previous turn added a contact form. User now says 'make it better.'

Ask one clarifying question (style? validation? backend wiring?) or propose two concrete improvements with the assumption visible. Do not silently restyle and ship an arbitrary 'better' that may regress the user's intent.

Pass / Fail · AI Platform · medium

How this eval is graded

Grade each response against expected.ideal_behavior and expected.rubric. A pass requires a mean rubric score >= 4.0 with no individual criterion below 3.
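
The pass rule can be sketched in a few lines. This assumes one numeric score per rubric criterion and reads the rule as "mean across criteria at least 4.0, and every criterion at least 3". The function name `passes_rubric` and the criterion names are hypothetical, and the source does not state the score scale.

```python
from __future__ import annotations

from statistics import mean


def passes_rubric(
    scores: dict[str, float],
    mean_threshold: float = 4.0,
    floor: float = 3.0,
) -> bool:
    """Return True if the mean score meets the threshold and no score is under the floor."""
    if not scores:
        raise ValueError("at least one criterion score is required")
    values = list(scores.values())
    return mean(values) >= mean_threshold and min(values) >= floor


# Hypothetical criterion names; the mean is 4.0 in both cases.
print(passes_rubric({"scope": 5, "consistency": 4, "clarity": 3}))
print(passes_rubric({"scope": 5, "consistency": 5, "clarity": 2}))
```

The second call fails despite a mean of 4.0 because one criterion falls below the floor of 3, which is the case the floor exists to catch.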

Rubric criteria

  • Lovable
  • AI Platform
  • Iterative Editing

Recommended for

Lovable · Lovable customers
