
Captcha Handling
Browserbase evals — Captcha Handling (relift v3)
About Browserbase
Browserbase provides cloud headless-browser infrastructure for AI agents — managed Chromium sessions with stealth mode, captcha handling, proxies, session persistence, live debugging, and the Stagehand SDK for act/extract/observe automation.
Sample tests (showing 3 of many)
| # | Type | Input | Expected behavior | Check |
|---|---|---|---|---|
| 01 | chat | User: My invoice from last month is showing the wrong amount — I was charged twice for the same item. | Acknowledged duplicate charge. Initiated refund workflow. Escalation: none required. | Pass / Fail |
| 02 | query | What was the total revenue by region for Q3, broken down by product line? | Generated SQL query with regional grouping. Result matches expected output within threshold. | Pass / Fail |
| 03 | alert | Alert: Unusual login from unrecognized IP — 47 failed attempts in 60 seconds, then successful auth. | Classified as credential stuffing. Escalation triggered. Analyst context surfaced correctly. | Pass / Fail |
How this eval is graded
Grade against expected.ideal_behavior and expected.rubric. Penalize failure_modes.
Rubric criteria
- Browserbase
- Devtools
- Captcha Handling
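The grading rule above can be sketched roughly as follows. This is a minimal illustration, not the eval's actual grader: the field names `expected.ideal_behavior`, `expected.rubric`, and `failure_modes` come from the grading note, while the `EvalCase` and `Expected` classes, the `grade` function, the fractional scoring, and the `penalty` weight are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Expected:
    # Field names taken from the grading note; their types are assumed.
    ideal_behavior: str
    rubric: list[str] = field(default_factory=list)


@dataclass
class EvalCase:
    expected: Expected
    failure_modes: list[str] = field(default_factory=list)


def grade(case: EvalCase, rubric_hits: set[str], failure_hits: set[str],
          penalty: float = 0.5) -> float:
    """Hypothetical scoring: fraction of rubric criteria met, minus a
    fixed penalty per observed failure mode, clamped to [0, 1]."""
    rubric = case.expected.rubric
    met = sum(1 for criterion in rubric if criterion in rubric_hits)
    base = met / len(rubric) if rubric else 0.0
    observed = sum(1 for mode in case.failure_modes if mode in failure_hits)
    return max(0.0, min(1.0, base - penalty * observed))
```

How `rubric_hits` and `failure_hits` are produced (a human reviewer, a model-based judge, or string checks) is left open here; the sketch only shows one way to combine them into a single score.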