Auth And Concurrency

Browserbase evals — Auth & Concurrency (relift v3 InfraRed)

About Browserbase

Browserbase provides cloud headless-browser infrastructure for AI agents — managed Chromium sessions with stealth mode, captcha handling, proxies, session persistence, live debugging, and the Stagehand SDK for act/extract/observe automation.

Employees: ~40

Industry: Browser Infrastructure

Headquarters: San Francisco, CA

Sample tests · showing 3 of 9

# · Input · Expected behavior · Check
01

Agent's HTTP wrapper sends Authorization: Bearer <BB_KEY> instead of X-BB-API-KEY: <BB_KEY>.

Send the API key in the X-BB-API-KEY header as documented. Authorization: Bearer is the wrong header, and the API rejects the request with 401. Detect and surface the misconfiguration rather than retrying, since a retry with the same header cannot succeed.

Pass / Fail · AI Platform · critical
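
The expected behavior for test 01 could be sketched roughly as follows. This is a minimal illustration, not Browserbase's client code: the header name X-BB-API-KEY and the 401 outcome come from the test case, while `AuthMisconfiguredError`, `build_headers`, `validate_headers`, and `should_retry` are hypothetical names, and the choice to retry on 429 and 5xx is an assumption.

```python
API_KEY_HEADER = "X-BB-API-KEY"  # header name taken from test 01


class AuthMisconfiguredError(Exception):
    """Hypothetical error type for auth problems that a retry cannot fix."""


def build_headers(api_key: str) -> dict:
    """Put the key in X-BB-API-KEY, never in Authorization: Bearer."""
    if not api_key:
        raise AuthMisconfiguredError("API key is empty; check configuration")
    return {API_KEY_HEADER: api_key}


def validate_headers(headers: dict) -> None:
    """Fail fast before sending if the key header is missing."""
    lowered = {name.lower(): value for name, value in headers.items()}
    if API_KEY_HEADER.lower() in lowered:
        return
    message = f"missing {API_KEY_HEADER} header"
    if lowered.get("authorization", "").lower().startswith("bearer "):
        message += "; the key was sent as 'Authorization: Bearer' instead"
    raise AuthMisconfiguredError(message)


def should_retry(status: int) -> bool:
    """Surface a 401 as a misconfiguration instead of retrying it."""
    if status == 401:
        raise AuthMisconfiguredError(
            f"401 from API; verify the key is sent in {API_KEY_HEADER}"
        )
    # Assumption: only throttling and server errors are worth retrying.
    return status == 429 or status >= 500
```

Calling `validate_headers` in the HTTP wrapper before each request catches the wrong header locally, and `should_retry` keeps a 401 from being hidden inside a retry loop.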
02

Plan allows N concurrent sessions [REQUIRES-VERIFICATION on integer]. Operator bursts to N+10 sessions and the extra creates are rejected with 429 or queued.

Enforce a client-side semaphore at the documented concurrency cap (read from config). Honor the Retry-After header on 429 responses when it is present. Surface backpressure to the upstream queue rather than burst-creating sessions.

Pass / Fail · AI Platform · high
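
One way to picture the behavior test 02 asks for is the sketch below. It assumes the cap is passed in from config as `max_concurrent` and that the create call is injected as a callable returning `(status, headers)`. `SessionLimiter`, `retry_after_seconds`, and `create_session` are hypothetical names, and only the delta-seconds form of Retry-After is parsed, with a fallback default otherwise.

```python
import threading
from typing import Callable, Mapping, Optional, Tuple


class SessionLimiter:
    """Caps in-flight sessions at a configured limit (hypothetical helper)."""

    def __init__(self, max_concurrent: int) -> None:
        if max_concurrent < 1:
            raise ValueError("max_concurrent must be >= 1")
        self._slots = threading.BoundedSemaphore(max_concurrent)

    def try_acquire(self) -> bool:
        """Non-blocking; False means the cap is reached."""
        return self._slots.acquire(blocking=False)

    def release(self) -> None:
        self._slots.release()


def retry_after_seconds(headers: Mapping[str, str], default: float = 1.0) -> float:
    """Read a delta-seconds Retry-After value, else return the default."""
    for name, value in headers.items():
        if name.lower() == "retry-after" and value.strip().isdigit():
            return float(int(value.strip()))
    return default


def create_session(
    limiter: SessionLimiter,
    send_create: Callable[[], Tuple[int, Mapping[str, str]]],
) -> Tuple[str, Optional[float]]:
    """Return ('created', None), ('busy', None), or ('throttled', delay)."""
    if not limiter.try_acquire():
        # Backpressure: leave the job in the upstream queue, do not burst.
        return ("busy", None)
    status, headers = send_create()
    if status == 429:
        limiter.release()  # no session was created, so free the slot
        return ("throttled", retry_after_seconds(headers))
    if status >= 400:
        limiter.release()
        raise RuntimeError(f"session create failed with status {status}")
    # The caller releases the slot when the session ends.
    return ("created", None)
```

Returning `busy` instead of blocking lets the upstream queue decide what to do with the job, which is the backpressure the test describes.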
03

When concurrency is saturated, queue order is per-project FIFO [REQUIRES-VERIFICATION]. Operator assumes priority by API call timestamp.

Treat documented queueing as best-effort fairness; for true priority, implement an operator-side priority queue that holds back low-priority creates when high-priority jobs are waiting. Do not rely on server-side priority.

Pass / Fail · AI Platform · medium
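
The operator-side priority queue described in test 03 might look like the sketch below. It is an illustration only: `CreateQueue`, `submit`, and `next_job` are hypothetical names, lower numbers are assumed to mean higher priority, and ties are broken by submission order so that equal-priority jobs stay FIFO.

```python
import heapq
import itertools
from typing import List, Optional, Tuple


class CreateQueue:
    """Holds pending session creates and releases the most urgent first."""

    def __init__(self) -> None:
        self._heap: List[Tuple[int, int, str]] = []
        self._seq = itertools.count()  # tie-breaker: submission order

    def submit(self, job_id: str, priority: int) -> None:
        """Queue a create; a lower priority number is served earlier."""
        heapq.heappush(self._heap, (priority, next(self._seq), job_id))

    def next_job(self, free_slots: int) -> Optional[str]:
        """Pop the best job only when a concurrency slot is free."""
        if free_slots < 1 or not self._heap:
            return None
        return heapq.heappop(self._heap)[2]

    def __len__(self) -> int:
        return len(self._heap)


queue = CreateQueue()
queue.submit("nightly-crawl", priority=5)
queue.submit("user-checkout", priority=0)
queue.submit("report-export", priority=5)
first = queue.next_job(free_slots=1)  # "user-checkout" goes ahead of older jobs
```

Because ordering is decided before any create call is sent, the result does not depend on how the server orders its own queue.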

How this eval is graded

Grade against expected.ideal_behavior and expected.rubric. A pass requires a mean score across rubric criteria of at least 4.0, with no individual criterion scoring below 3.
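
Read as a check over per-criterion scores, the pass rule could be expressed like this. The sketch assumes one numeric score per criterion and takes the thresholds 4.0 and 3 from the text; the function name `passes` and the criterion names in the example are hypothetical, and the scoring scale itself is not stated in the source.

```python
from statistics import mean
from typing import Mapping


def passes(scores: Mapping[str, float], min_mean: float = 4.0, floor: float = 3.0) -> bool:
    """True when the mean is at least min_mean and no score is below floor."""
    if not scores:
        raise ValueError("at least one criterion score is required")
    values = list(scores.values())
    return mean(values) >= min_mean and min(values) >= floor


# Hypothetical criterion names, for illustration only.
strong = passes({"correct_header": 5, "no_retry_on_401": 4, "clear_error": 3})
weak_spot = passes({"correct_header": 5, "no_retry_on_401": 5, "clear_error": 2})
```

The second example has the same mean as the first but fails, because one criterion falls below the floor.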

Rubric criteria

  • Browserbase
  • AI Platform
  • Auth And Concurrency

Recommended for

Browserbase (cloud headless Chromium + Stagehand SDK) · Browserbase customers
