Crawl Whole Site

Evaluates Firecrawl's Crawl (whole site) across 9 scenario-based test cases, each graded against an expected-behavior rubric by an LLM judge, from Corsac's Web Data for AI eval coverage.

About Firecrawl

Firecrawl is a web-data API for AI — it turns websites into clean, LLM-ready markdown or structured data via scrape, crawl, map, search, and LLM-powered extract endpoints, with JS rendering, browser actions, and proxies. Developers use Firecrawl to feed agents, RAG pipelines, and structured-extraction workflows with reliable web content.

Employees: ~30

Industry: Web Data / Scraping

Headquarters: San Francisco, CA

Website: firecrawl.dev

Sample tests · showing 3 of 9

#	Input	Expected behavior	Check
01	Agent calls POST /v1/crawl and immediately tries to read pages from the response body, but the response only contains a job id and a status URL.	Treat crawl as asynchronous: capture the returned job id, then poll GET /v1/crawl/{id} until status=completed (or failed/cancelled) before consuming data[]. Use exponential backoff on polling rather than a tight loop.	Pass / Fail · AI Platform · high
02	Agent kicks off a crawl on a large e-commerce site with no limit or maxDepth, intending to grab only the top-level category pages.	Bound the crawl with limit (max pages) and maxDepth (link hops from the seed) matched to the intent. An unbounded crawl on a large site burns credits and may never terminate within budget. Start small, inspect, then widen.	Pass / Fail · AI Platform · high
03	Agent only wants /blog/* pages and must skip /admin and /cart. It relies on post-filtering crawl results instead of scoping the crawl.	Scope the crawl with includePaths (e.g. ['^/blog/.*']) and excludePaths (e.g. ['^/admin','^/cart']) so out-of-scope pages are never fetched, saving credits and avoiding sensitive areas. Confirm the regexes are anchored to path, not full URL, per docs.	Pass / Fail · AI Platform · high
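
The three sample cases above can be sketched together in Python: one helper builds a bounded, scoped request body, and another polls for the result with exponential backoff. This is a minimal sketch, not Firecrawl's SDK. The function names, the `url` body key, the backoff constants, and the injected `fetch_status` callable (standing in for an authenticated GET /v1/crawl/{id}) are assumptions made for illustration; only the endpoint paths and the limit, maxDepth, includePaths, excludePaths, status, and data names come from the cases themselves.

```python
import time

def build_crawl_payload(url, limit, max_depth, include_paths=(), exclude_paths=()):
    """Build a bounded, scoped body for POST /v1/crawl (the "url" key is assumed)."""
    if limit < 1 or max_depth < 0:
        raise ValueError("limit must be >= 1 and max_depth must be >= 0")
    payload = {"url": url, "limit": limit, "maxDepth": max_depth}
    if include_paths:
        payload["includePaths"] = list(include_paths)
    if exclude_paths:
        payload["excludePaths"] = list(exclude_paths)
    return payload

def backoff_delays(base=1.0, factor=2.0, cap=30.0):
    """Yield exponentially growing delays in seconds, capped at `cap`."""
    delay = base
    while True:
        yield min(delay, cap)
        delay *= factor

def poll_crawl(fetch_status, job_id, max_polls=20, sleep=time.sleep):
    """Poll until the job reaches a terminal status, then return its pages.

    `fetch_status(job_id)` is a hypothetical callable that performs
    GET /v1/crawl/{id} and returns the decoded JSON body as a dict.
    """
    delays = backoff_delays()
    for _ in range(max_polls):
        job = fetch_status(job_id)
        status = job.get("status")
        if status == "completed":
            return job.get("data", [])
        if status in ("failed", "cancelled"):
            raise RuntimeError(f"crawl {job_id} ended with status={status}")
        sleep(next(delays))  # back off instead of polling in a tight loop
    raise TimeoutError(f"crawl {job_id} not finished after {max_polls} polls")
```

Passing `fetch_status` and `sleep` in as arguments keeps the polling logic independent of any HTTP client, so it can be exercised with canned responses before it is pointed at a real job.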

How this eval is graded

Grade against expected.ideal_behavior and expected.rubric. Per-criterion pass requires mean >= 4.0 and no criterion below 3.
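
One plausible reading of this pass rule, sketched in Python: a case passes when the mean of its per-criterion scores is at least 4.0 and no single criterion scores below 3. The function name, the criterion names, and the idea that scores arrive as a name-to-number mapping are assumptions for illustration; the page does not specify the score scale or the judge's output format.

```python
from statistics import mean

def case_passes(scores, mean_threshold=4.0, floor=3.0):
    """Return True if the mean score meets the threshold and no criterion is below the floor.

    `scores` maps a (hypothetical) criterion name to the judge's numeric score.
    """
    if not scores:
        raise ValueError("at least one criterion score is required")
    values = list(scores.values())
    return mean(values) >= mean_threshold and min(values) >= floor
```

Under this reading, scores of 5, 5, and 2 fail despite a mean of 4.0, because one criterion falls below the floor of 3.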

Rubric criteria

Firecrawl
AI Platform
Crawl Whole Site

Recommended for

Firecrawl customers

Works with

Firecrawl

Related evals

AI Platform

Claude API

Evaluates Anthropic's Batch API across 9 scenario-based test cases, each graded against an expected-behavior rubric by an LLM judge, from Corsac's Foundation Model & API eval coverage.

AI Platform

Claude API

Evaluates Anthropic's Extended Thinking across 9 scenario-based test cases, each graded against an expected-behavior rubric by an LLM judge, from Corsac's Foundation Model & API eval coverage.

AI Platform

Claude API

Evaluates Anthropic's Files API & Citations across 9 scenario-based test cases, each graded against an expected-behavior rubric by an LLM judge, from Corsac's Foundation Model & API eval coverage.

Frequently asked questions

What does the Crawl Whole Site eval for Firecrawl test?

Evaluates Firecrawl's Crawl (whole site) across 9 scenario-based test cases, each graded against an expected-behavior rubric by an LLM judge, from Corsac's Web Data for AI eval coverage.

How is the Crawl Whole Site eval scored?

The judge rubric: Grade against expected.ideal_behavior and expected.rubric. Per-criterion pass requires mean >= 4.0 and no criterion below 3.

How many test cases does this eval pack include?

The Crawl Whole Site pack for Firecrawl contains 9 test cases. 3 sample cases are shown free on this page; the full set runs in a Corsac workspace.

How do I run this eval?

Sign up for Corsac, connect your model or agent endpoint, and run the Crawl Whole Site pack as-is or after customizing thresholds. Results land in your workspace with per-case scores, and you can gate releases on the pack in CI via the REST API.
