Find an eval for what you’re building.
Start with the kind of work you’re evaluating, not a technical taxonomy.
Browse by connector

Browserbase
15 evals

LangSmith
15 evals

Deepgram
14 evals

Zendesk
11 evals

Harvey
10 evals

OpenEvidence
10 evals

Cursor
9 evals

Gemini
9 evals

Google Meet
9 evals

Google Workspace
9 evals

Replit
9 evals

Abridge
8 evals

Anthropic
8 evals
AssemblyAI
8 evals
Microsoft AutoGen
8 evals

Baseten
8 evals
Bolt
8 evals

Cartesia
8 evals
Cogent Security
8 evals

Cognition
8 evals

Cohere
8 evals
Composio
8 evals
CrewAI
8 evals
DeepSeek
8 evals
Docker
8 evals

fal
8 evals
Firecrawl
8 evals
GitHub Copilot
8 evals
Groq
8 evals
Hightouch
8 evals

Hume AI
8 evals
LangChain
8 evals

Linear
8 evals
LiveKit
8 evals
LlamaIndex
8 evals
Lovable
8 evals
Mem0
8 evals
Mercor
8 evals

Mistral AI
8 evals
n8n
8 evals

OpenAI
8 evals
Perplexity
8 evals
Pinecone
8 evals
Portkey
8 evals
Qdrant
8 evals
Replicate
8 evals

Retell AI
8 evals

Retool
8 evals

Sourcegraph
8 evals

Temporal
8 evals

Together AI
8 evals
Unstructured
8 evals

Vapi
8 evals

Vercel AI SDK
8 evals

Vercel
8 evals
Windsurf
8 evals
xAI
8 evals

Clay
7 evals

ElevenLabs
7 evals

Exa Labs
7 evals

Gong
7 evals

Granola
7 evals

Modal
7 evals

OpenRouter
7 evals

Sierra AI
7 evals

turbopuffer
7 evals

Aidoc
6 evals

Ambience Healthcare
6 evals

Atropos Health
6 evals

Bayesian Health
6 evals

Chainguard
6 evals

Cohere Health
6 evals

Commure / Augmedix
6 evals

Fireworks AI
6 evals

Glass Health
6 evals

Glean
6 evals

Hippocratic AI
6 evals

Innovaccer
6 evals

K Health
6 evals

Layer Health
6 evals

Notable Health
6 evals

Puzzle
6 evals

Sendoso
6 evals

WorkOS
6 evals

Dropzone AI
5 evals

Manifest OS
5 evals

Abnormal AI
4 evals

Arctic Wolf
4 evals

CrowdStrike
4 evals

Cyera
4 evals

Decagon
4 evals

Everlaw
4 evals

HiddenLayer
4 evals

Legora
4 evals

Mend.io
4 evals

Netskope
4 evals

Noma Security
4 evals

Orca Security
4 evals

Rubrik
4 evals

SentinelOne
4 evals

Snyk
4 evals

Straiker
4 evals

Suki AI
4 evals

Torq
4 evals

Zscaler
4 evals

CoCounsel (Thomson Reuters)
3 evals

DraftWise
3 evals

Ironclad
3 evals

LexisNexis
3 evals

Luminance
3 evals

Paxton AI
3 evals

Relativity
3 evals

Spellbook
3 evals

vLex (Vincent AI)
3 evals

Hebbia
2 evals

11x
1 eval

Databricks
1 eval

Jasper
1 eval

Notion
1 eval

StaffScheduleCare
1 eval

ThoughtSpot
1 eval

Featured evals

Eval Factory Imported Suite
Document extraction checks

Abnormal AI Email Security Adversarial Security Validation
Security safety checks

Abnormal AI Email Security Expert Safety Gate Eval
Security safety checks

Abnormal AI Email Security Power User Ops Eval
Security safety checks

Abnormal AI Email Security Workflow Painpoint Eval
Clinical safety checks

Ambient clinical documentation
Clinical safety checks