For Browserbase · Code Assistant · Devtools

Browserbase evals — Session Persistence & Contexts (relift v3)

About Browserbase

Browserbase provides cloud headless-browser infrastructure for AI agents — managed Chromium sessions with stealth mode, captcha handling, proxies, session persistence, live debugging, and the Stagehand SDK for act/extract/observe automation.

Employees: ~40

Industry: Browser Infrastructure

Headquarters: San Francisco, CA

Sample tests · showing 3 of many

Test 01 (chat)
Input: User: My invoice from last month is showing the wrong amount — I was charged twice for the same item.
Expected behavior: Acknowledged duplicate charge. Initiated refund workflow. Escalation: none required.
Check: Pass / Fail

Test 02 (query)
Input: What was the total revenue by region for Q3, broken down by product line?
Expected behavior: Generated SQL query with regional grouping. Result matches expected output within threshold.
Check: Pass / Fail

Test 03 (alert)
Input: Alert: Unusual login from unrecognized IP — 47 failed attempts in 60 seconds, then successful auth.
Expected behavior: Classified as credential stuffing. Escalation triggered. Analyst context surfaced correctly.
Check: Pass / Fail

How this eval is graded

Grade each response against expected.ideal_behavior and expected.rubric, and penalize any response that exhibits one of the listed failure_modes.
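
The grading rule above can be sketched in a few lines of Python. This is a minimal illustration, not the eval library's actual grader: only the field names ideal_behavior, rubric, and failure_modes come from the text. The Expected class, the grade function, the substring matching, the 0.5 penalty, and the sample strings are all assumptions made up for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Expected:
    """Hypothetical container for the fields named in the grading note."""
    ideal_behavior: str
    rubric: list[str] = field(default_factory=list)
    failure_modes: list[str] = field(default_factory=list)


def grade(response: str, expected: Expected, penalty: float = 0.5) -> float:
    """Return a score in [0, 1]: criteria coverage minus a penalty per failure mode.

    Matching is a naive case-insensitive substring search (an assumption
    for illustration; a real grader would likely use a model or a human).
    """
    text = response.lower()
    # Treat the ideal behavior as one more criterion alongside the rubric.
    criteria = [expected.ideal_behavior, *expected.rubric]
    hits = sum(1 for item in criteria if item.lower() in text)
    coverage = hits / len(criteria)
    # Each failure mode found in the response subtracts a fixed penalty.
    failures = sum(1 for mode in expected.failure_modes if mode.lower() in text)
    return max(0.0, min(1.0, coverage - penalty * failures))


# Made-up example values, not taken from the eval itself.
expected = Expected(
    ideal_behavior="session restored",
    rubric=["context id reused", "cookies preserved"],
    failure_modes=["new login required"],
)
print(grade("Session restored; context id reused and cookies preserved.", expected))
```

Clamping the score to [0, 1] keeps it comparable to a pass threshold even when several failure modes fire at once.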

Rubric criteria

  • Browserbase
  • Devtools
  • Session Persistence Contexts

Recommended for

Browserbase · Browserbase customers
