
Session Persistence And Recovery
Browserbase evals — Session Persistence & Recovery (relift v3 InfraRed)
About Browserbase
Browserbase provides cloud headless-browser infrastructure for AI agents — managed Chromium sessions with stealth mode, captcha handling, proxies, session persistence, live debugging, and the Stagehand SDK for act/extract/observe automation.
Sample tests · showing 3 of 9
| # | Input | Expected behavior | Check | Category | Severity |
|---|---|---|---|---|---|
| 01 | Operator wants the next session to start already logged in to a SaaS dashboard. | Create a Context via the documented Contexts API (or SDK). Pass contextId on POST /v1/sessions to attach. The session boots with the prior cookies and localStorage. Persist contextId per tenant in operator storage. | Pass / Fail | AI Platform | high |
| 02 | Two tenants share an operator backend. Operator looks up contextId by user email, but key collisions cause tenant A to resume tenant B's session. | Key context storage by (tenant_id, user_id, target_domain). Never key only by email. Validate the resumed session against the expected tenant identity before exposing the browser back to the user. Treat any mismatch as a security incident. | Pass / Fail | AI Platform | critical |
| 03 | Saved contextId has not been used for the documented retention window [REQUIRES-VERIFICATION on exact TTL]. POST /v1/sessions with that contextId errors. | Detect the retention-expiry error specifically, fall back to a fresh login flow, and create a new Context on success. Do not retry the same expired contextId, because that burns concurrency slots. | Pass / Fail | AI Platform | medium |
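The expected behaviors in tests 02 and 03 can be sketched as operator-side logic. This is a minimal illustration, not Browserbase's API: `ContextStore`, `resume_or_login`, `ContextExpiredError`, `TenantMismatchError`, and the injected callables `start_session`, `fresh_login`, and `session_tenant` are all hypothetical names. They stand in for whatever the operator's own client wrapper does, and how the real API signals retention expiry is not specified here.

```python
from typing import Any, Callable, Dict, Optional, Tuple

# Composite key from test 02: never key by email alone.
ContextKey = Tuple[str, str, str]  # (tenant_id, user_id, target_domain)


class ContextExpiredError(Exception):
    """Hypothetical: raised by the operator's wrapper on a retention-expiry error."""


class TenantMismatchError(Exception):
    """Hypothetical: the resumed session belongs to a different tenant."""


class ContextStore:
    """In-memory stand-in for operator storage of contextId values."""

    def __init__(self) -> None:
        self._items: Dict[ContextKey, str] = {}

    def put(self, tenant_id: str, user_id: str, domain: str, context_id: str) -> None:
        self._items[(tenant_id, user_id, domain)] = context_id

    def get(self, tenant_id: str, user_id: str, domain: str) -> Optional[str]:
        return self._items.get((tenant_id, user_id, domain))

    def delete(self, tenant_id: str, user_id: str, domain: str) -> None:
        self._items.pop((tenant_id, user_id, domain), None)


def resume_or_login(
    store: ContextStore,
    key: ContextKey,
    start_session: Callable[[str], Any],       # assumed: wraps session creation with a contextId
    fresh_login: Callable[[], Tuple[str, Any]],  # assumed: returns (new_context_id, session)
    session_tenant: Callable[[Any], str],      # assumed: reads tenant identity from a session
) -> Any:
    tenant_id, _user_id, _domain = key
    context_id = store.get(*key)
    if context_id is not None:
        try:
            session = start_session(context_id)
        except ContextExpiredError:
            # Test 03: drop the expired id instead of retrying it.
            store.delete(*key)
        else:
            # Test 02: verify tenant identity before handing the browser back.
            if session_tenant(session) != tenant_id:
                raise TenantMismatchError(f"context {context_id} is not owned by {tenant_id}")
            return session
    # No usable context: run the login flow and persist the new contextId.
    new_context_id, session = fresh_login()
    store.put(*key, new_context_id)
    return session
```

Because the key includes `tenant_id`, two tenants whose users share an email or user id map to separate entries. The expired id is attempted at most once per call, which matches the "do not retry" expectation in test 03.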
How this eval is graded
Grade against expected.ideal_behavior and expected.rubric. Per-criterion pass requires mean >= 4.0 and no criterion below 3.
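One way to read the pass rule is sketched below. It assumes each rubric criterion receives a single numeric score and that the mean is taken across criteria. The scoring scale is not stated above, and `eval_passes` and its parameter names are illustrative only.

```python
from statistics import mean
from typing import Mapping


def eval_passes(
    scores: Mapping[str, float],
    mean_threshold: float = 4.0,  # "mean >= 4.0"
    floor: float = 3.0,           # "no criterion below 3"
) -> bool:
    """Return True when the mean meets the threshold and no score is under the floor."""
    if not scores:
        return False  # assumption: an ungraded run does not pass
    values = list(scores.values())
    return mean(values) >= mean_threshold and min(values) >= floor
```

Under this reading, scores of 5, 5, and 2 fail despite a mean of 4.0, because one criterion falls below the floor.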
Rubric criteria
- Browserbase
- AI Platform
- Session Persistence And Recovery