For Sourcegraph · Code Assistant

Precise Code Navigation

Sourcegraph (Cody + Amp) · Sourcegraph

Code Intelligence — Sourcegraph

Evaluates Sourcegraph's Precise Code Navigation across 9 scenario-based test cases, each graded against an expected-behavior rubric by an LLM judge, from Corsac's Code Intelligence eval coverage.

About Sourcegraph

Sourcegraph is a code intelligence and AI coding platform: universal code search, precise code navigation, Cody chat grounded in your codebase, cross-repo batch changes, and the Amp autonomous agent — deployed across large enterprise codebases.

Employees

~150

Industry

Code Intelligence

Headquarters

San Francisco, CA

Website

sourcegraph.com

Sample tests · showing 3 of 9

#	Input	Expected behavior	Check
01	User clicks go-to-definition on `applyPatch` in a TS file. The popup shows three candidate definitions across the repo, two from node_modules. No precise (SCIP) badge appears.	Per docs/code_navigation, when a SCIP index is present for the commit + language, the navigation returns a single precise definition. The multi-candidate UI indicates Sourcegraph fell back to search-based heuristics, i.e., no fresh SCIP index. Trigger or wait for `scip-typescript` upload for this …	Pass / Fail · Code Assistant · high
02	Operator wants every caller of public API `acme/lib.Sanitize` across the entire instance, not just within the defining repo.	With cross-repo SCIP indexes uploaded (matching SCIP symbol moniker scheme per docs/code_navigation/explanations/uploads), Find References returns callers across repos. If only the defining repo is indexed precisely, callers in other repos surface via search-based fallback, clearly flagged in the …	Pass / Fail · Code Assistant · high
03	Repo has a SCIP index uploaded for commit ABC. User navigates on branch `feature/x` whose latest commit XYZ has refactored `applyPatch` into `applyDelta`. Hover shows the old signature.	Per docs/code_navigation, indexes are commit-scoped; on a non-indexed commit Sourcegraph falls back to the nearest-ancestor indexed commit and surfaces a 'stale data' indicator. Treat hover details as approximate until the auto-indexer (or user-triggered upload) catches up to XYZ.	Pass / Fail · Code Assistant · high
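The nearest-ancestor fallback described in case 03 can be illustrated with a short sketch. This is not Sourcegraph's implementation; it is a minimal breadth-first walk over a commit graph, assuming a hypothetical `parents` mapping and a hypothetical intermediate commit `MID` between ABC and XYZ. The function name and data shapes are invented for illustration.

```python
from collections import deque


def nearest_indexed_ancestor(commit, parents, indexed):
    """Walk up the commit graph breadth-first.

    Returns (indexed_commit, distance) for the closest commit that has an
    index, or None if no ancestor is indexed. Distance 0 means the commit
    itself is indexed, so its navigation data is not stale.
    """
    queue = deque([(commit, 0)])
    seen = {commit}
    while queue:
        current, distance = queue.popleft()
        if current in indexed:
            return current, distance
        for parent in parents.get(current, []):
            if parent not in seen:
                seen.add(parent)
                queue.append((parent, distance + 1))
    return None


# Hypothetical history: ABC <- MID <- XYZ, with an index uploaded only for ABC.
parents = {"XYZ": ["MID"], "MID": ["ABC"], "ABC": []}
result = nearest_indexed_ancestor("XYZ", parents, indexed={"ABC"})

# Any distance above 0 means hover data comes from an older commit.
is_stale = result is not None and result[1] > 0
```

Under these assumptions, navigating on XYZ resolves to the index for ABC and is flagged stale, which matches the old `applyPatch` signature appearing in hover.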

How this eval is graded

Grade against expected.ideal_behavior and expected.rubric. Per-criterion pass requires mean >= 4.0 and no criterion below 3.
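As a rough sketch, the pass rule above could be checked as follows. This assumes the mean is taken across the rubric criteria of a single test case and that the judge returns one numeric score per criterion; the page does not state the score scale, and the criterion names below are hypothetical.

```python
from statistics import mean

MEAN_THRESHOLD = 4.0  # required mean score, from the grading rule
CRITERION_FLOOR = 3   # no single criterion may score below this


def case_passes(scores):
    """Return True if one test case's criterion scores satisfy the rule.

    `scores` maps a criterion name to the judge's numeric score.
    An empty mapping is treated as a failure (an assumption).
    """
    if not scores:
        return False
    values = list(scores.values())
    return mean(values) >= MEAN_THRESHOLD and min(values) >= CRITERION_FLOOR


# Hypothetical criterion names, for illustration only.
print(case_passes({"accuracy": 5, "grounding": 4, "clarity": 3}))  # True
print(case_passes({"accuracy": 5, "grounding": 5, "clarity": 2}))  # False
```

The second example has the same mean as the first (4.0) but fails because one criterion falls below the floor of 3.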

Rubric criteria

Sourcegraph
Code Assistant
Precise Code Navigation

Recommended for

Sourcegraph (Cody + Amp) · Sourcegraph customers

Works with

Sourcegraph

Related evals

Code Assistant

Browserbase

Evaluates Browserbase's Captcha Handling across scenario-based test cases, each graded against an expected-behavior rubric by an LLM judge, from Corsac's Browser infrastructure eval coverage.

View

Code Assistant

Browserbase

Evaluates Browserbase's Concurrency & Rate Limits across scenario-based test cases, each graded against an expected-behavior rubric by an LLM judge, from Corsac's Browser infrastructure eval coverage.

View

Code Assistant

Browserbase

Evaluates Browserbase's Live Debugging & Session Inspector across scenario-based test cases, each graded against an expected-behavior rubric by an LLM judge, from Corsac's Browser infrastructure eval coverage.

View

Frequently asked questions

What does the Precise Code Navigation eval for Sourcegraph (Cody + Amp) test?

Evaluates Sourcegraph's Precise Code Navigation across 9 scenario-based test cases, each graded against an expected-behavior rubric by an LLM judge, from Corsac's Code Intelligence eval coverage.

How is the Precise Code Navigation eval scored?

The judge rubric: Grade against expected.ideal_behavior and expected.rubric. Per-criterion pass requires mean >= 4.0 and no criterion below 3.

How many test cases does this eval pack include?

The Precise Code Navigation pack for Sourcegraph (Cody + Amp) contains 9 test cases. 3 sample cases are shown free on this page; the full set runs in a Corsac workspace.

How do I run this eval?

Sign up for Corsac, connect your model or agent endpoint, and run the Precise Code Navigation pack as-is or after customizing thresholds. Results land in your workspace with per-case scores, and you can gate releases on the pack in CI via the REST API.
