For Sourcegraph · Code Assistant

Cody Chat And Context

Sourcegraph (Cody + Amp) · Sourcegraph

Code Intelligence — Sourcegraph

Sourcegraph evals — Cody Chat & Context (relift v3 InfraRed)

About Sourcegraph

Sourcegraph is a code intelligence and AI coding platform: universal code search, precise code navigation, Cody chat grounded in your codebase, cross-repo batch changes, and the Amp autonomous agent — deployed across large enterprise codebases.

Employees

~150

Industry

Code Intelligence

Headquarters

San Francisco, CA

Sample tests · showing 3 of 9

# · Input · Expected behavior · Check
01

User in VS Code Cody chat types '@auth.ts please summarise the session expiry logic'. Cody responds with a confident answer that names a method `refreshTokenLoop` not present in the file.

Per docs/cody, an @-mention pins the named file/symbol/repo into the model's context window; the response should ground claims in the actual content and cite line ranges. A hallucinated method name means either the file did not load (path mismatch, ignored by Cody) or the agent suppressed citations…

Pass / Fail · Code Assistant · critical
02

User asks Cody to 'compare retry logic between @acme/svc-a and @acme/svc-b' but only svc-a is cloned locally; svc-b lives on the Sourcegraph instance.

Per docs/cody, Cody resolves remote repo mentions against the configured Sourcegraph instance (Search API), pulling relevant context server-side. Operator must be signed in with an account that has ACL access to svc-b. Confirm the chip shows the remote repo before treating the comparison as grounde…

Pass / Fail · Code Assistant · high
03

Team standardised on a custom Cody prompt 'Generate Jest tests for {selection}'. New hire's instance doesn't see the prompt in the menu.

Per docs/cody/capabilities/commands (custom prompts), prompts live in user, workspace, or org scope; org-scoped prompts require admin publish and user opt-in / org enrollment. Verify scope, publication state, and the new hire's enrollment; do not paste the prompt definition into a personal shortcut…

Pass / Fail · Code Assistant · medium
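
The grounding failure in test 01 (a method name that is not in the mentioned file) can be partly screened mechanically. The following is a minimal sketch, not the eval's actual checker: it assumes the response marks code identifiers with backticks and that the grader has the file text available. The function name `ungrounded_identifiers` and the sample strings are hypothetical.

```python
import re


def ungrounded_identifiers(response: str, file_text: str) -> list[str]:
    """Return backticked identifiers from the response that never occur in the file.

    Assumption: identifiers in the response are wrapped in backticks.
    """
    candidates = re.findall(r"`([A-Za-z_][A-Za-z0-9_]*)`", response)
    missing: list[str] = []
    for name in candidates:
        # Whole-word match so that `isExpired` does not match `isExpiredSoon`.
        found = re.search(rf"\b{re.escape(name)}\b", file_text) is not None
        if not found and name not in missing:
            missing.append(name)
    return missing


# Hypothetical file content and response, for illustration only.
sample_file = "function isExpired(session) { return session.expiresAt < Date.now(); }"
sample_response = "The `isExpired` check runs inside `refreshTokenLoop`."
flagged = ungrounded_identifiers(sample_response, sample_file)
```

A non-empty result only flags a response for review. It does not say whether the file failed to load or citations were dropped, which still needs the manual checks described in the expected behavior.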

How this eval is graded

Grade against expected.ideal_behavior and expected.rubric. A test passes only if the mean score across rubric criteria is >= 4.0 and no single criterion scores below 3.
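
The pass rule can be sketched as a small function. This is an illustration under assumptions: the page does not state the score scale or how scores are stored, so the dictionary shape and the criterion names below are hypothetical, and the mean is read as being taken across criteria.

```python
from statistics import mean

MEAN_THRESHOLD = 4.0  # from the grading rule: mean >= 4.0
CRITERION_FLOOR = 3   # from the grading rule: no criterion below 3


def passes(scores: dict[str, float]) -> bool:
    """Apply the pass rule to one test's per-criterion scores.

    Assumption: `scores` maps a criterion name to its numeric score.
    """
    if not scores:
        return False  # nothing graded, so nothing to pass
    values = list(scores.values())
    return mean(values) >= MEAN_THRESHOLD and min(values) >= CRITERION_FLOOR


# Hypothetical criterion names, for illustration only.
ok = passes({"grounding": 5, "citations": 4, "accuracy": 3})
low_floor = passes({"grounding": 5, "citations": 5, "accuracy": 2})
low_mean = passes({"grounding": 4, "citations": 4, "accuracy": 3})
```

The floor check matters because a high mean can hide one failing criterion: the second example averages 4.0 but still fails.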

Rubric criteria

  • Sourcegraph
  • Code Assistant
  • Cody Chat And Context

Recommended for

Sourcegraph (Cody + Amp) · Sourcegraph customers
