For Firecrawl · AI Platform

Scrape Single Url

Firecrawl

Web Data for AI — Firecrawl

Evaluates Firecrawl's Scrape (single URL) across 9 scenario-based test cases, each graded against an expected-behavior rubric by an LLM judge, from Corsac's Web Data for AI eval coverage.

About Firecrawl

Firecrawl is a web-data API for AI — it turns websites into clean, LLM-ready markdown or structured data via scrape, crawl, map, search, and LLM-powered extract endpoints, with JS rendering, browser actions, and proxies. Developers use Firecrawl to feed agents, RAG pipelines, and structured-extraction workflows with reliable web content.

Employees: ~30

Industry: Web Data / Scraping

Headquarters: San Francisco, CA

Website: firecrawl.dev

Sample tests · showing 3 of 9

#	Input	Expected behavior	Check
01	Agent calls POST /v1/scrape for a docs page and needs only clean markdown for an LLM, but passes formats=['markdown','html','rawHtml','screenshot'] 'to be safe.'	Request only the formats actually consumed: formats=['markdown']. Each extra format (especially screenshot) adds latency and consumes additional credits. Read response.markdown from the per-format payload; do not request rawHtml/screenshot unless a downstream step uses them.	Pass / Fail · AI Platform · medium
02	A blog page returns markdown that includes the nav bar, cookie banner, and footer. The agent feeds the whole thing to an LLM and wastes tokens on page chrome.	Set onlyMainContent=true so Firecrawl strips navigation, headers, footers, and boilerplate, returning the article body. Verify the returned markdown excludes nav/footer before treating token counts as content. Re-enable page chrome only when the task specifically needs it.	Pass / Fail · AI Platform · medium
03	Agent wants only the <article> body and must drop <aside class='ads'> elements from a content-heavy page.	Use the includeTags (e.g. ['article']) and excludeTags (e.g. ['aside','.ads']) CSS selectors to scope extraction at scrape time. Confirm the selectors match the target site's DOM; pair with onlyMainContent where appropriate rather than relying on post-hoc string deletion.	Pass / Fail · AI Platform · medium
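
The three sample cases above can be combined into one request body. The following is a minimal sketch, not Firecrawl's official client: the parameter names (formats, onlyMainContent, includeTags, excludeTags) and the POST /v1/scrape path come from the table, while the base URL, the Bearer-token header, the response shape ({"data": {"markdown": ...}}), and the helper names build_scrape_payload and scrape_markdown are assumptions made for illustration.

```python
import json
import urllib.request

# Assumed base URL; only the /v1/scrape path appears in the table above.
SCRAPE_URL = "https://api.firecrawl.dev/v1/scrape"


def build_scrape_payload(url, include_tags=None, exclude_tags=None):
    """Build a JSON body that requests only what an LLM consumer needs."""
    payload = {
        "url": url,
        "formats": ["markdown"],   # case 01: request only the format consumed
        "onlyMainContent": True,   # case 02: drop nav, footer, and boilerplate
    }
    if include_tags:
        payload["includeTags"] = list(include_tags)  # case 03: keep these
    if exclude_tags:
        payload["excludeTags"] = list(exclude_tags)  # case 03: drop these
    return payload


def scrape_markdown(url, api_key, **selectors):
    """Send the request and return the markdown string (hypothetical helper)."""
    body = json.dumps(build_scrape_payload(url, **selectors)).encode("utf-8")
    request = urllib.request.Request(
        SCRAPE_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        result = json.load(response)
    # Assumed response shape: {"data": {"markdown": "..."}}
    return result.get("data", {}).get("markdown", "")
```

Building the payload separately from sending it makes it easy to check, before any credits are spent, that no unused format such as screenshot has been requested. For example, build_scrape_payload("https://example.com/post", include_tags=["article"], exclude_tags=["aside", ".ads"]) yields a body that covers all three cases.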

How this eval is graded

Grade against expected.ideal_behavior and expected.rubric. Per-criterion pass requires mean >= 4.0 and no criterion below 3.
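
One way to read the pass rule is sketched below. It assumes the judge returns one integer score per rubric criterion on a 1 to 5 scale, and that a case passes when the mean across criteria is at least 4.0 and no single criterion scores below 3. The scale, the function name case_passes, and the criterion names are hypothetical; the page does not state them.

```python
from statistics import mean


def case_passes(scores, mean_threshold=4.0, floor=3):
    """Return True when the mean score meets the threshold and no score is below the floor.

    `scores` maps criterion name -> judge score (assumed 1 to 5 scale).
    """
    if not scores:
        return False  # nothing was graded, so treat the case as not passing
    values = list(scores.values())
    return mean(values) >= mean_threshold and min(values) >= floor


# Hypothetical criterion names, for illustration only.
print(case_passes({"minimal_formats": 5, "reads_markdown": 4, "cost_awareness": 3}))
print(case_passes({"minimal_formats": 5, "reads_markdown": 5, "cost_awareness": 2}))
```

Under this reading the first call passes (mean 4.0, lowest score 3), while the second fails despite the same mean because one criterion falls below the floor.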

Rubric criteria


Recommended for

Firecrawl customers

Works with

Firecrawl

Related evals

AI Platform

Claude API

Evaluates Anthropic's Batch API across 9 scenario-based test cases, each graded against an expected-behavior rubric by an LLM judge, from Corsac's Foundation Model & API eval coverage.

AI Platform

Claude API

Evaluates Anthropic's Extended Thinking across 9 scenario-based test cases, each graded against an expected-behavior rubric by an LLM judge, from Corsac's Foundation Model & API eval coverage.

AI Platform

Claude API

Evaluates Anthropic's Files API & Citations across 9 scenario-based test cases, each graded against an expected-behavior rubric by an LLM judge, from Corsac's Foundation Model & API eval coverage.

Frequently asked questions

What does the Scrape Single Url eval for Firecrawl test?

Evaluates Firecrawl's Scrape (single URL) across 9 scenario-based test cases, each graded against an expected-behavior rubric by an LLM judge, from Corsac's Web Data for AI eval coverage.

How is the Scrape Single Url eval scored?

The judge rubric: Grade against expected.ideal_behavior and expected.rubric. Per-criterion pass requires mean >= 4.0 and no criterion below 3.

How many test cases does this eval pack include?

The Scrape Single Url pack for Firecrawl contains 9 test cases. Three sample cases are shown free on this page; the full set runs in a Corsac workspace.

How do I run this eval?

Sign up for Corsac, connect your model or agent endpoint, and run the Scrape Single Url pack as-is or after customizing thresholds. Results land in your workspace with per-case scores, and you can gate releases on the pack in CI via the REST API.
