For Sourcegraph · Code Assistant

Cody Chat And Context

Sourcegraph (Cody + Amp) · Sourcegraph

Code Intelligence — Sourcegraph

Evaluates Sourcegraph's Cody Chat & Context across 9 scenario-based test cases, each graded against an expected-behavior rubric by an LLM judge, from Corsac's Code Intelligence eval coverage.

About Sourcegraph

Sourcegraph is a code intelligence and AI coding platform: universal code search, precise code navigation, Cody chat grounded in your codebase, cross-repo batch changes, and the Amp autonomous agent — deployed across large enterprise codebases.

Employees

~150

Industry

Code Intelligence

Headquarters

San Francisco, CA

Website

sourcegraph.com

Sample tests · showing 3 of 9

#	Input	Expected behavior	Check	Category	Severity
01	User in VS Code Cody chat types '@auth.ts please summarise the session expiry logic'. Cody responds with a confident answer that names a method `refreshTokenLoop` not present in the file.	Per docs/cody, an @-mention pins the named file/symbol/repo into the model's context window; the response should ground claims in the actual content and cite line ranges. A hallucinated method name means either the file did not load (path mismatch, ignored by Cody) or the agent suppressed citations…	Pass / Fail	Code Assistant	critical
02	User asks Cody to 'compare retry logic between @acme/svc-a and @acme/svc-b' but only svc-a is cloned locally; svc-b lives on the Sourcegraph instance.	Per docs/cody, Cody resolves remote repo mentions against the configured Sourcegraph instance (Search API), pulling relevant context server-side. Operator must be signed in with an account that has ACL access to svc-b. Confirm the chip shows the remote repo before treating the comparison as grounde…	Pass / Fail	Code Assistant	high
03	Site admin configured a Cody context filter excluding `repo:^github\.com/acme/secret-research$` from chat/autocomplete. User explicitly @-mentions a file in that repo.	Per docs/cody/context-filters, the filter is enforced server-side and the @-mention must be rejected with a clear 'context-filter blocked' message; the file's contents must not flow to the LLM. Verify in Cody trace that no completion request included secret-research content.	Pass / Fail	Code Assistant	critical
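
Case 03 hinges on whether a repository name matches the exclusion pattern. As a minimal sketch, assuming the pattern behaves like an ordinary anchored regular expression once the `repo:` prefix is stripped, the check can be approximated in Python. The names `EXCLUDE_PATTERNS`, `is_excluded`, and `filter_mentions` are hypothetical, and Python's `re` module stands in for whatever regex engine the server uses; this is not Cody's implementation.

```python
import re

# Pattern copied from sample case 03, with the `repo:` prefix removed.
# Assumption: the server treats it as a plain anchored regular expression.
EXCLUDE_PATTERNS = [r"^github\.com/acme/secret-research$"]


def is_excluded(repo_name, patterns=EXCLUDE_PATTERNS):
    """Return True when repo_name matches any exclude pattern."""
    return any(re.search(p, repo_name) for p in patterns)


def filter_mentions(mentions, patterns=EXCLUDE_PATTERNS):
    """Split (repo, path) @-mentions into allowed and blocked lists.

    Blocked mentions should never have their contents sent to the model.
    """
    allowed, blocked = [], []
    for repo, path in mentions:
        target = blocked if is_excluded(repo, patterns) else allowed
        target.append((repo, path))
    return allowed, blocked
```

With this pattern, `github.com/acme/secret-research` is blocked while `github.com/acme/secret-research-public` is allowed, because `$` anchors the end of the repository name.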

How this eval is graded

Grade against expected.ideal_behavior and expected.rubric. Per-criterion pass requires mean >= 4.0 and no criterion below 3.
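
One reading of this rule is that a case passes when the mean of its criterion scores is at least 4.0 and no individual criterion scores below 3. The sketch below encodes that reading only; the function name `case_passes` and the criterion names in the usage note are hypothetical, and the scoring scale itself is not stated on this page.

```python
from statistics import mean


def case_passes(scores, mean_threshold=4.0, floor=3):
    """Apply the pass rule to one test case.

    scores maps a criterion name to the judge's numeric score.
    Assumption: the mean is taken across the criteria of a single case.
    """
    if not scores:
        # No judged criteria: treat as a failure rather than a vacuous pass.
        return False
    values = list(scores.values())
    return mean(values) >= mean_threshold and min(values) >= floor
```

For example, `case_passes({"grounding": 5, "citations": 5, "safety": 2})` returns `False`: the mean is 4.0, but one criterion falls below the floor of 3.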

Rubric criteria

Sourcegraph
Code Assistant
Cody Chat And Context

Recommended for

Sourcegraph (Cody + Amp) · Sourcegraph customers

Works with

Sourcegraph

Related evals

Code Assistant

Browserbase

Evaluates Browserbase's Captcha Handling across scenario-based test cases, each graded against an expected-behavior rubric by an LLM judge, from Corsac's Browser infrastructure eval coverage.

Code Assistant

Browserbase

Evaluates Browserbase's Concurrency & Rate Limits across scenario-based test cases, each graded against an expected-behavior rubric by an LLM judge, from Corsac's Browser infrastructure eval coverage.

Code Assistant

Browserbase

Evaluates Browserbase's Live Debugging & Session Inspector across scenario-based test cases, each graded against an expected-behavior rubric by an LLM judge, from Corsac's Browser infrastructure eval coverage.


Frequently asked questions

What does the Cody Chat And Context eval for Sourcegraph (Cody + Amp) test?

Evaluates Sourcegraph's Cody Chat & Context across 9 scenario-based test cases, each graded against an expected-behavior rubric by an LLM judge, from Corsac's Code Intelligence eval coverage.

How is the Cody Chat And Context eval scored?

The judge rubric: Grade against expected.ideal_behavior and expected.rubric. Per-criterion pass requires mean >= 4.0 and no criterion below 3.

How many test cases does this eval pack include?

The Cody Chat And Context pack for Sourcegraph (Cody + Amp) contains 9 test cases. Three sample cases are shown free on this page; the full set runs in a Corsac workspace.

How do I run this eval?

Sign up for Corsac, connect your model or agent endpoint, and run the Cody Chat And Context pack as-is or after customizing thresholds. Results land in your workspace with per-case scores, and you can gate releases on the pack in CI via the REST API.
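
A release gate over per-case results might look like the sketch below. It does not call the Corsac REST API, whose endpoints and response format are not described on this page; the `results` structure (dictionaries with `id`, `severity`, and `passed` keys) and the `gate` function are assumptions made for illustration. The severity values `critical` and `high` are the ones shown in the sample cases.

```python
def gate(results, min_pass_rate=1.0):
    """Decide whether a release may proceed.

    results: list of dicts with hypothetical keys "id", "severity", "passed".
    Returns (ok, reasons), where reasons explains any block.
    """
    if not results:
        return False, ["no results to evaluate"]

    reasons = []

    # Any failed critical case blocks the release outright.
    failed_critical = [
        r["id"] for r in results
        if r["severity"] == "critical" and not r["passed"]
    ]
    if failed_critical:
        reasons.append("critical cases failed: " + ", ".join(failed_critical))

    # The overall pass rate must also meet the configured threshold.
    pass_rate = sum(1 for r in results if r["passed"]) / len(results)
    if pass_rate < min_pass_rate:
        reasons.append(
            "pass rate {:.2f} below threshold {:.2f}".format(pass_rate, min_pass_rate)
        )

    return (not reasons), reasons
```

In a CI job, a wrapper script could call `gate` on the fetched results and exit with a non-zero status when `ok` is `False`, which most CI systems treat as a failed step.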

Run this eval in your workspace

Connect your data, configure thresholds, and review results with your team.