Eval Library

Find an eval for what you’re building.

Start with the kind of work you’re evaluating, not a technical taxonomy.

Browse by connector

Browserbase

15 evals

LangSmith

15 evals

Deepgram

14 evals

Zendesk

11 evals

Harvey

10 evals

OpenEvidence

10 evals

Cursor

9 evals

Gemini

9 evals

Google Meet

9 evals

Google Workspace

9 evals

Replit

9 evals

Abridge

8 evals

Anthropic

8 evals

AssemblyAI

8 evals

Microsoft AutoGen

8 evals

Baseten

8 evals

Bolt

8 evals

Cartesia

8 evals

Cogent Security

8 evals

Cognition

8 evals

Cohere

8 evals

Composio

8 evals

CrewAI

8 evals

DeepSeek

8 evals

Docker

8 evals

fal

8 evals

Firecrawl

8 evals

GitHub Copilot

8 evals

Groq

8 evals

Hightouch

8 evals

Hume AI

8 evals

LangChain

8 evals

Linear

8 evals

LiveKit

8 evals

LlamaIndex

8 evals

Lovable

8 evals

Mem0

8 evals

Mercor

8 evals

Mistral AI

8 evals

n8n

8 evals

OpenAI

8 evals

Perplexity

8 evals

Pinecone

8 evals

Portkey

8 evals

Qdrant

8 evals

Replicate

8 evals

Retell AI

8 evals

Retool

8 evals

Sourcegraph

8 evals

Temporal

8 evals

Together AI

8 evals

Unstructured

8 evals

Vapi

8 evals

Vercel AI SDK

8 evals

Vercel

8 evals

Windsurf

8 evals

xAI

8 evals

Clay

7 evals

ElevenLabs

7 evals

Exa Labs

7 evals

Gong

7 evals

Granola

7 evals

Modal

7 evals

OpenRouter

7 evals

Sierra AI

7 evals

turbopuffer

7 evals

Aidoc

6 evals

Ambience Healthcare

6 evals

Atropos Health

6 evals

Bayesian Health

6 evals

Chainguard

6 evals

Cohere Health

6 evals

Commure / Augmedix

6 evals

Fireworks AI

6 evals

Glass Health

6 evals

Glean

6 evals

Hippocratic AI

6 evals

Innovaccer

6 evals

K Health

6 evals

Layer Health

6 evals

Notable Health

6 evals

Puzzle

6 evals

Sendoso

6 evals

WorkOS

6 evals

Dropzone AI

5 evals

Manifest OS

5 evals

Abnormal AI

4 evals

Arctic Wolf

4 evals

CrowdStrike

4 evals

Cyera

4 evals

Decagon

4 evals

Everlaw

4 evals

HiddenLayer

4 evals

Legora

4 evals

Mend.io

4 evals

Netskope

4 evals

Noma Security

4 evals

Orca Security

4 evals

Rubrik

4 evals

SentinelOne

4 evals

Snyk

4 evals

Straiker

4 evals

Suki AI

4 evals

Torq

4 evals

Zscaler

4 evals

CoCounsel (Thomson Reuters)

3 evals

DraftWise

3 evals

Ironclad

3 evals

LexisNexis

3 evals

Luminance

3 evals

Paxton AI

3 evals

Relativity

3 evals

Spellbook

3 evals

vLex (Vincent AI)

3 evals

Hebbia

2 evals

11x

1 eval

Databricks

1 eval

Jasper

1 eval

Notion

1 eval

StaffScheduleCare

1 eval

ThoughtSpot

1 eval