Live View Debug And Recordings

Browserbase (cloud headless Chromium + Stagehand SDK) · Browserbase

Browser Infrastructure for AI Agents — Browserbase

Browserbase evals — Live View / Debug & Recordings (relift v3 InfraRed)

About Browserbase

Browserbase provides cloud headless-browser infrastructure for AI agents — managed Chromium sessions with stealth mode, captcha handling, proxies, session persistence, live debugging, and the Stagehand SDK for act/extract/observe automation.

Employees

~40

Industry

Browser Infrastructure

Headquarters

San Francisco, CA

Sample tests · showing 3 of 9

# · Input · Expected behavior · Check
01

On-call wants to live-watch a stuck session. They request the debug URL.

Call GET /v1/sessions/{id}/debug to obtain the documented live view URL [REQUIRES-VERIFICATION on exact field name]. Share the link only with authenticated Browserbase project members, and treat it as a secret outside the dashboard.

Pass / Fail · AI Platform · high
02

Operator wants to deep-link to a completed session in the Browserbase dashboard.

Use the documented dashboard URL pattern (e.g., https://browserbase.com/sessions/{id}) [REQUIRES-VERIFICATION on current pattern]. Do not scrape the URL from the dashboard, because the pattern may change. Persist only the session id and render the URL from a single config-driven template.

Pass / Fail · AI Platform · medium
03

Debugging a 502 from the target API; on-call opens the network panel in Session Inspector for the failed request.

Use the Session Inspector network logs to inspect the failed request's status, headers, and timing. Redact sensitive headers (e.g., Authorization) before sharing screenshots externally. Treat network logs as potentially containing PII.

Pass / Fail · AI Platform · critical
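
The expected behaviors in the three sample tests can be sketched as small helpers. This is a minimal illustration, not Browserbase's SDK: the function names are hypothetical, the API host and the API-key header name are assumptions to check against the current API reference, and the dashboard pattern is the unverified one quoted in test 02. Only Authorization is named in test 03; the other header names are illustrative additions.

```python
import urllib.request
from urllib.parse import quote

# Assumptions, not confirmed by this page: the API host and the API-key header name.
API_BASE = "https://api.browserbase.com"
API_KEY_HEADER = "X-BB-API-Key"

# Pattern quoted in test 02 and flagged there as unverified; keep it in config.
DASHBOARD_URL_TEMPLATE = "https://browserbase.com/sessions/{id}"

# "authorization" comes from test 03; the remaining names are illustrative.
SENSITIVE_HEADERS = frozenset(
    {"authorization", "proxy-authorization", "cookie", "set-cookie"}
)


def build_debug_request(session_id: str, api_key: str, method: str) -> urllib.request.Request:
    """Test 01: prepare (but do not send) the call to /v1/sessions/{id}/debug."""
    if not session_id:
        raise ValueError("session_id is required")
    url = f"{API_BASE}/v1/sessions/{quote(session_id, safe='')}/debug"
    # The HTTP method is a parameter so it can follow the current API reference.
    return urllib.request.Request(url, method=method, headers={API_KEY_HEADER: api_key})


def render_session_url(session_id: str, template: str = DASHBOARD_URL_TEMPLATE) -> str:
    """Test 02: render a deep link from the stored id and a single template."""
    if "{id}" not in template:
        raise ValueError("template must contain the {id} placeholder")
    return template.replace("{id}", quote(session_id, safe=""))


def redact_headers(headers: dict[str, str]) -> dict[str, str]:
    """Test 03: mask sensitive header values before logs are shared externally."""
    return {
        name: "[REDACTED]" if name.lower() in SENSITIVE_HEADERS else value
        for name, value in headers.items()
    }
```

Keeping the template and the sensitive-header set as module-level configuration means a change to the dashboard pattern or the redaction policy touches one place. The live view URL returned by the debug call is not logged or stored by any helper here, in line with treating it as a secret.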

How this eval is graded

Grade each response against expected.ideal_behavior and expected.rubric. A pass requires a mean rubric score of at least 4.0 and no individual criterion scored below 3.
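
The pass rule can be written as a short check. This sketch assumes the mean is taken across all rubric criteria for one response and that scores are numeric; the function name and the criterion names in the usage note are hypothetical.

```python
from statistics import mean


def passes(scores: dict[str, float], mean_threshold: float = 4.0, floor: float = 3.0) -> bool:
    """Return True when the mean score meets the threshold and no criterion is below the floor."""
    if not scores:
        raise ValueError("at least one criterion score is required")
    values = list(scores.values())
    return mean(values) >= mean_threshold and min(values) >= floor
```

For example, `passes({"accuracy": 5, "safety": 5, "clarity": 2})` is False even though the mean is 4.0, because one criterion falls below the floor of 3.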

Rubric criteria

  • Browserbase
  • AI Platform
  • Live View Debug And Recordings

Recommended for

Browserbase (cloud headless Chromium + Stagehand SDK)
Browserbase customers