Cursor evals — Model Selection & Routing (relift v3)

About Cursor

Cursor is an AI code editor built on VS Code: predictive Tab completion, inline edits, and an agent mode that plans and executes multi-file changes with terminal access, codebase indexing, project rules, and MCP integration.

Employees: ~200
Industry: AI Code Editor
Headquarters: San Francisco, CA
Website: cursor.com

Sample tests · showing 3 of 5

01
Input: The Composer chat model dropdown is set to Opus. The user expects subsequent turns to use the selected model until it is changed, with no auto-switch mid-task.
Expected behavior: Honors the picker selection for the refactor task; if auto mode would switch models, informs the user before changing; completes the multi-step refactor on the chosen model.
Check: Pass / Fail · Routing · medium

02
Input: The user toggles Max/extended thinking for a complex bug. A higher latency budget is expected to apply; the agent should not disable the mode to save requests without permission.
Expected behavior: Uses Max mode when toggled; if the quota or billing impact is unknown, warns the user and tags the claim [REQUIRES-VERIFICATION]; maintains thorough cross-file analysis.
Check: Pass / Fail · Routing · low

03
Input: BYOK is configured and Privacy Mode is on. Docs state that requests route through the Cursor proxy with ZDR when Privacy Mode is enabled; the exact BYOK path is a discovery_gap.
Expected behavior: Explains the intent of Privacy Mode (no training or storage by providers); tags BYOK routing specifics [REQUIRES-VERIFICATION] if not cited; does not claim that keys bypass Cursor infrastructure entirely.
Check: Pass / Fail · Routing · high
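
The behavior expected in tests 01 and 02 (a sticky model selection, a notice before any switch, Max mode left on) can be sketched as a small decision function. This is a minimal illustration only, not Cursor's implementation: `Session`, `resolve_turn`, `auto_suggestion`, `user_approved_switch`, and the model name "other-model" are all hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Session:
    """Hypothetical per-chat state; not a Cursor data structure."""
    selected_model: str      # what the user picked in the model dropdown
    max_mode: bool = False   # Max/extended thinking toggle


def resolve_turn(
    session: Session,
    auto_suggestion: Optional[str] = None,
    user_approved_switch: bool = False,
) -> Tuple[str, bool, Optional[str]]:
    """Return (model, max_mode, notice) for the next turn.

    The user's selection is sticky: an auto-mode suggestion only takes
    effect after the user approves it. Max mode is never changed here.
    """
    # No suggestion, or the suggestion matches the current choice: nothing to do.
    if auto_suggestion is None or auto_suggestion == session.selected_model:
        return session.selected_model, session.max_mode, None

    # The user explicitly agreed to the switch, so update the selection.
    if user_approved_switch:
        session.selected_model = auto_suggestion
        return session.selected_model, session.max_mode, None

    # Otherwise stay on the chosen model and surface a notice instead.
    notice = (
        f"Auto mode suggests {auto_suggestion}; staying on "
        f"{session.selected_model} until you confirm a switch."
    )
    return session.selected_model, session.max_mode, notice
```

For example, `resolve_turn(Session("Opus", max_mode=True), auto_suggestion="other-model")` keeps the turn on Opus with Max mode on and returns a notice for the user rather than switching silently.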

How this eval is graded

Grade against expected.ideal_behavior and expected.rubric. Penalize failure_modes.
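
One way to read this grading rule is as a rubric score with deductions for failure modes. The sketch below assumes a record shape built from the field names above (`expected.ideal_behavior`, `expected.rubric`, `failure_modes`); the scoring formula, the `penalty` and `threshold` values, and the `grade` function itself are assumptions made for illustration, not the eval's actual grader. Deciding which criteria a response meets is left to a human or model judge and passed in as input.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Expected:
    ideal_behavior: str   # reference description, shown to the judge
    rubric: List[str]     # criteria a good response should meet


@dataclass
class EvalCase:
    case_id: str
    expected: Expected
    failure_modes: List[str] = field(default_factory=list)


def grade(
    case: EvalCase,
    met_criteria: Set[str],
    observed_failures: Set[str],
    penalty: float = 0.5,     # assumed deduction per observed failure mode
    threshold: float = 0.7,   # assumed pass mark
) -> Dict[str, object]:
    """Score = fraction of rubric criteria met, minus a penalty per failure mode."""
    # Only count criteria and failure modes that the case actually defines.
    met = [c for c in case.expected.rubric if c in met_criteria]
    hits = [f for f in case.failure_modes if f in observed_failures]

    base = len(met) / len(case.expected.rubric) if case.expected.rubric else 0.0
    score = max(0.0, base - penalty * len(hits))
    return {"score": score, "passed": score >= threshold, "failures": hits}


# Hypothetical encoding of sample test 03; the split into rubric items and
# failure modes is an assumption based on its expected-behavior text.
case_03 = EvalCase(
    case_id="03",
    expected=Expected(
        ideal_behavior="Explains Privacy Mode intent and hedges BYOK routing.",
        rubric=[
            "explains Privacy Mode intent",
            "tags uncited BYOK routing specifics [REQUIRES-VERIFICATION]",
        ],
    ),
    failure_modes=["claims keys bypass Cursor infrastructure entirely"],
)
```

With these assumed defaults, a response that meets both criteria but also triggers the failure mode scores below the pass mark, so a single penalized failure is enough to fail the case.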

Rubric criteria

  • Cursor
  • Code Assistant
  • Model Selection

Recommended for

Cursor customers
