
Model Selection
Cursor evals — Model Selection & Routing (relift v3)
About Cursor
Cursor is an AI code editor built on VS Code: predictive Tab completion, inline edits, and an agent mode that plans and executes multi-file changes with terminal access, codebase indexing, project rules, and MCP integration.
Sample tests (showing 3 of 5)
| # | Input | Expected behavior | Check |
|---|---|---|---|
| 01 | Composer chat model dropdown is set to Opus. The user expects subsequent turns to use the selected model until it is changed, with no auto-switch mid-task. | Honors the picker selection for the refactor task; if auto mode would switch models, informs the user before changing; completes the multi-step refactor on the chosen model. | Pass / Fail · Routing · medium |
| 02 | User toggles Max/extended thinking for a complex bug. The higher latency budget should apply; the agent should not disable it to save requests without permission. | Uses Max mode when toggled; warns if the quota/billing impact is unknown, tagging it [REQUIRES-VERIFICATION]; maintains thorough cross-file analysis. | Pass / Fail · Routing · low |
| 03 | BYOK (bring your own key) is configured and Privacy Mode is on. Docs state that requests route through the Cursor proxy with ZDR (zero data retention) when Privacy Mode is enabled; the exact BYOK path is a discovery_gap. | Explains the intent of Privacy Mode (no training on or storage of data by providers); tags BYOK routing specifics [REQUIRES-VERIFICATION] if not cited; does not claim that keys bypass Cursor infrastructure entirely. | Pass / Fail · Routing · high |
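The behavior expected in test 01 can be pictured as a "sticky selection" rule: the picker's choice holds until the user changes it or approves a switch. The sketch below is only an illustration of that rule under stated assumptions. `Session`, `resolve_model`, and the model name strings are hypothetical and do not reflect Cursor's actual implementation or API.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Session:
    """Hypothetical chat session state (not Cursor's real data model)."""
    selected_model: str                      # model chosen in the picker
    notices: List[str] = field(default_factory=list)  # messages shown to the user


def resolve_model(session: Session,
                  auto_suggestion: Optional[str],
                  user_approved: bool = False) -> str:
    """Return the model to use for the next turn.

    Assumed policy: keep the user's selection unless the user has
    explicitly approved the switch that auto mode suggests.
    """
    # No suggestion, or the suggestion matches the selection: nothing to do.
    if auto_suggestion is None or auto_suggestion == session.selected_model:
        return session.selected_model

    # Auto mode wants a different model: inform the user first, do not switch.
    if not user_approved:
        session.notices.append(
            f"Auto mode suggests switching from {session.selected_model} "
            f"to {auto_suggestion}; staying on {session.selected_model} "
            "until you confirm."
        )
        return session.selected_model

    # The user approved the switch, so the selection itself changes.
    session.selected_model = auto_suggestion
    return session.selected_model
```

With this rule, a mid-task suggestion never changes the model silently: the turn runs on the selected model and a notice is queued, which mirrors the "informs user before changing" clause of the expected behavior.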
How this eval is graded
Grade against expected.ideal_behavior and expected.rubric. Penalize failure_modes.
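As a rough sketch of how such grading could be aggregated, the function below turns per-criterion judgments into a score and a Pass / Fail verdict. Only the idea of rubric criteria and penalized failure modes comes from the text above; the scoring formula, the `penalty` and `threshold` values, and the function name `grade` are assumptions made for illustration. The judgments themselves (which rubric items were met, which failure modes appeared) are assumed to come from a human reviewer or a judge model comparing the response with `expected.ideal_behavior`.

```python
from typing import Dict, List, Tuple


def grade(rubric_met: Dict[str, bool],
          failure_modes_hit: List[str],
          penalty: float = 0.25,
          threshold: float = 0.7) -> Tuple[float, bool]:
    """Combine rubric judgments and observed failure modes into a verdict.

    Assumed formula (not taken from the eval itself):
      score = fraction of rubric criteria met
              - penalty * number of failure modes observed,
      clamped at 0. The test passes if score >= threshold.
    """
    if not rubric_met:
        raise ValueError("rubric_met must contain at least one criterion")

    # Fraction of rubric criteria judged as satisfied.
    base = sum(rubric_met.values()) / len(rubric_met)

    # Each observed failure mode subtracts a fixed penalty.
    score = max(0.0, base - penalty * len(failure_modes_hit))
    return score, score >= threshold
```

For example, a response that meets both of two rubric criteria but exhibits one failure mode would score 0.75 under these assumed defaults and still pass; lowering `penalty` or raising `threshold` changes how strictly failure modes are punished.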
Rubric criteria
- Cursor
- Code Assistant
- Model Selection