
Page Automation Primitives
Browserbase evals — Page Automation Primitives (relift v3 InfraRed)
About Browserbase
Browserbase provides cloud headless-browser infrastructure for AI agents — managed Chromium sessions with stealth mode, captcha handling, proxies, session persistence, live debugging, and the Stagehand SDK for act/extract/observe automation.
Sample tests · showing 3 of 9
| # | Input | Expected behavior | Check |
|---|---|---|---|
| 01 | Agent calls page.goto('https://shop.example/cart') and immediately calls page.click('#checkout'). | Pass wait_until='networkidle' or 'domcontentloaded' explicitly per the target's loading pattern. SPAs may need an explicit wait_for_selector on a post-mount anchor; do not rely on the default 'load' for a JS-heavy cart. | Pass / Fail · Ai Platform · high |
| 02 | Form has input event listeners that validate on each keystroke (e.g., card-number spacing). Agent uses page.fill() and validation never fires. | Use page.type() (per-keystroke events) when the form depends on input events. Use fill() for non-event-driven fields (faster, atomic). Decide per field by inspecting the listener model via observe() or DOM inspection. | Pass / Fail · Ai Platform · medium |
| 03 | Operator uses page.click('button.btn-primary') on a Tailwind site where the class is shared by 12 buttons. | Prefer Playwright's role/text-based locators (getByRole('button', {name:'Checkout'})) or Stagehand act() for resilience to class refactors. Avoid brittle class selectors as the primary target. | Pass / Fail · Ai Platform · high |
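The three expected behaviors above can be sketched in Playwright's Python sync API. This is a minimal illustration, not the eval's reference solution: the helper names and the `#cart-ready` anchor used in the usage note are hypothetical, and `page` is assumed to be a Playwright `Page` (it is typed as `Any` so the sketch has no import-time dependency). For case 02 the sketch uses `Locator.press_sequentially()`, which recent Playwright releases recommend in place of `page.type()` for per-keystroke input.

```python
from typing import Any


def open_cart_and_checkout(
    page: Any, url: str, ready_selector: str, checkout_selector: str
) -> None:
    """Case 01: navigate, wait for a post-mount anchor, then click."""
    # Pass wait_until explicitly instead of relying on the default 'load'.
    page.goto(url, wait_until="domcontentloaded")
    # ready_selector is a hypothetical element that exists only after the
    # SPA has mounted; waiting on it guards the click that follows.
    page.wait_for_selector(ready_selector, state="visible")
    page.click(checkout_selector)


def enter_field(page: Any, selector: str, value: str, per_keystroke: bool) -> None:
    """Case 02: pick per-keystroke typing or atomic fill for each field."""
    field = page.locator(selector)
    if per_keystroke:
        # Sends key events character by character, so keystroke-level
        # validators and formatters get a chance to run.
        field.press_sequentially(value)
    else:
        # Sets the value in one step; faster for fields without such listeners.
        field.fill(value)


def click_button_by_name(page: Any, name: str) -> None:
    """Case 03: target the button by role and accessible name, not by class."""
    page.get_by_role("button", name=name).click()
```

A call such as `open_cart_and_checkout(page, "https://shop.example/cart", "#cart-ready", "#checkout")` would cover case 01; whether `'domcontentloaded'` or `'networkidle'` is the better choice depends on the target's loading pattern.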
How this eval is graded
Grade each response against expected.ideal_behavior and expected.rubric. A pass requires a mean score >= 4.0 across the rubric criteria, with no single criterion scoring below 3.
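One reading of the pass rule can be written as a small check. This is a sketch under assumptions: the function name `passes` is hypothetical, the scores are taken to be one number per rubric criterion, and the mean is taken across criteria. The page does not state the scoring scale, so none is enforced here.

```python
from statistics import mean
from typing import Mapping


def passes(
    scores: Mapping[str, float], min_mean: float = 4.0, min_each: float = 3.0
) -> bool:
    """Return True when the mean is >= min_mean and no criterion is below min_each."""
    if not scores:
        # With no scored criteria there is nothing to pass on.
        return False
    values = list(scores.values())
    return mean(values) >= min_mean and min(values) >= min_each


# Mean is 4.0 and the lowest criterion is 3, so both thresholds are met.
ok = passes({"Browserbase": 5, "Ai Platform": 4, "Page Automation Primitives": 3})

# Mean is still 4.0, but one criterion is below 3, so the floor fails.
floor_fail = passes({"Browserbase": 5, "Ai Platform": 5, "Page Automation Primitives": 2})
```

Keeping the floor separate from the mean means a single weak criterion cannot be hidden by strong scores elsewhere.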
Rubric criteria
- Browserbase
- Ai Platform
- Page Automation Primitives