Cursor evals — Tab Completion (relift v3)

About Cursor

Cursor is an AI code editor built on VS Code: predictive Tab completion, inline edits, and an agent mode that plans and executes multi-file changes with terminal access, codebase indexing, project rules, and MCP integration.

Employees

~200

Industry

AI Code Editor

Headquarters

San Francisco, CA

Website

cursor.com

Sample tests · showing 3 of 6

Test 01

Input: TypeScript `src/routes/invoices.ts` cursor is inside `async (req, res) => {` after `try {`. Tab offers a single-line `res.json({ ok: true });` ghost text but the handler must validate `req.params.id`, call `billingService.charge`…

Expected behavior: Tab suggests a multi-line block covering validation, service call, and catch with `res.status(400|500).json({ error })` matching project patterns; user can Tab-accept full block or partial-accept line-by-line.

Check: Pass / Fail · Code Completion · high

Test 02

Input: File `crates/api/src/handlers/user.rs` already imports `crate::db::DbError` and `User`. Cursor on `match fetch_user(id).await {` with no arm yet. Tab should propose `Ok(user) =>` and `Err(e) =>` arms consistent with Axum `IntoRes…

Expected behavior: Completion uses existing imports, does not duplicate `use` lines, and proposes `Err(e)` mapping consistent with neighboring handlers (e.g., `AppError::from(e)`).

Check: Pass / Fail · Code Completion · medium

Test 03

Input: Python `services/parser.py` — Tab suggests full `def parse_config(path: str) -> Config:` plus docstring and body. Operator wants signature + type hints only, then will write validation manually.

Expected behavior: Partial accept (word/line accept per Tab docs) applies only through the signature line; remaining ghost text stays available or dismisses without inserting full body.

Check: Pass / Fail · Code Completion · low
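
The pass condition for test 03 could be sketched as a small check. This is a toy model only, not Cursor's implementation or API: the sample ghost text, the `accept_lines` helper, and the `passes_test_03` check are hypothetical names, and partial accept is modeled simply as inserting the first N lines of the suggestion.

```python
# Toy model of partial accept for test 03. All names are hypothetical;
# nothing here calls or reproduces a Cursor API.

# Assumed suggestion: signature, docstring, and body, as the test describes.
GHOST_TEXT = '''def parse_config(path: str) -> Config:
    """Load and validate a config file."""
    return Config.load(path)
'''


def accept_lines(ghost_text: str, count: int) -> tuple[str, str]:
    """Split a suggestion into (accepted, remaining) after accepting `count` lines."""
    lines = ghost_text.splitlines(keepends=True)
    return "".join(lines[:count]), "".join(lines[count:])


def passes_test_03(inserted: str) -> bool:
    """Pass only if the inserted text is the signature line and nothing else."""
    lines = inserted.splitlines()
    return (
        len(lines) == 1
        and lines[0].startswith("def parse_config(path: str) -> Config")
        and lines[0].rstrip().endswith(":")
    )


accepted, remaining = accept_lines(GHOST_TEXT, 1)
print(passes_test_03(accepted))    # True: signature only
print(passes_test_03(GHOST_TEXT))  # False: docstring and body were inserted too
```

The remaining ghost text is returned separately so a harness could also confirm that nothing beyond the signature reached the buffer.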

How this eval is graded

Grade against expected.ideal_behavior and expected.rubric. Penalize failure_modes.
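
One way to read this instruction is as a rubric score with deductions. The sketch below assumes a record shape the page does not show: `rubric` and `failure_modes` as lists of strings, and a judge that has already decided which criteria were met and which failure modes were observed. The `EvalCase` class, the `grade` function, the 0.5 penalty, and the sample failure mode are illustrative assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class EvalCase:
    # Field names mirror the grading note; their types are assumed.
    ideal_behavior: str
    rubric: list[str]
    failure_modes: list[str] = field(default_factory=list)


def grade(case: EvalCase, met: set[str], observed: set[str], penalty: float = 0.5) -> float:
    """Fraction of rubric criteria met, minus a penalty per observed failure mode, floored at 0."""
    if not case.rubric:
        raise ValueError("rubric must not be empty")
    score = sum(1 for c in case.rubric if c in met) / len(case.rubric)
    deductions = penalty * sum(1 for f in case.failure_modes if f in observed)
    return max(0.0, score - deductions)


# Criteria paraphrased from test 02; the failure mode is a made-up example.
case = EvalCase(
    ideal_behavior="Propose Ok/Err arms that reuse existing imports",
    rubric=["uses existing imports", "no duplicate use lines", "Err(e) mapping matches neighbors"],
    failure_modes=["duplicates a use line"],
)

clean = grade(case, met=set(case.rubric), observed=set())
flawed = grade(case, met=set(case.rubric), observed={"duplicates a use line"})
print(clean, flawed)  # 1.0 0.5
```

Because the sample checks are Pass / Fail, a harness would still need a threshold to turn this score into a verdict; the page does not specify one.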

Rubric criteria

  • Cursor
  • Code Assistant
  • Tab Completion

Recommended for

Cursor · Cursor customers
