For Cohere · AI Platform

Embed

Cohere API · Cohere

Foundation Model & API — Cohere

Evaluates Cohere's Embed across 9 scenario-based test cases, each graded against an expected-behavior rubric by an LLM judge, from Corsac's Foundation Model & API eval coverage.

About Cohere

Cohere builds enterprise foundation models and the tools around them — the Command model family, best-in-class Rerank and Embed endpoints, and grounded retrieval-augmented generation with inline citations — deployable across major clouds and private VPCs.

Employees

~400

Industry

Foundation Model

Headquarters

Toronto, Canada

Website

cohere.com

Sample tests · showing 3 of 9

#	Input	Expected behavior	Check	Category	Severity
01	Agent embeds corpus passages and user queries with the same input_type, then wonders why retrieval recall is poor.	Set input_type='search_document' when embedding corpus passages for indexing and input_type='search_query' when embedding the query at search time. The asymmetric input_type is required for the retrieval embedding space to align.	Pass / Fail	AI Platform	critical
02	A downstream classifier is trained on embeddings produced with input_type='search_document', then serves with input_type='classification'.	Use input_type='classification' for classification features and input_type='clustering' for clustering, and keep the same input_type at train and inference time. Mixing input_types between training and serving shifts the embedding distribution and degrades the model.	Pass / Fail	AI Platform	high
03	Operator changes the requested embedding dimension (output_dimension) for new documents but the existing index was built at the old dimension.	All vectors in one index must share the same dimensionality. Changing the output dimension requires a full re-embed and re-index; never mix dimensions in a single index. Validate vector length on write.	Pass / Fail	AI Platform	critical
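
The expected behaviors in the three sample cases can be sketched in a few lines of Python. This is a minimal illustration, not Cohere's SDK: the input_type strings come from the table above, while `embed_for`, `EmbedFn`, and `FixedDimIndex` are hypothetical names, and `embed_fn` stands in for whatever client call actually produces the embeddings.

```python
from typing import Callable, Dict, List, Sequence

# input_type values taken from the sample test cases above.
INPUT_TYPE_BY_PURPOSE: Dict[str, str] = {
    "index": "search_document",   # corpus passages at indexing time
    "query": "search_query",      # user query at search time
    "classification": "classification",
    "clustering": "clustering",
}

# Hypothetical signature: (texts, input_type) -> one vector per text.
EmbedFn = Callable[[Sequence[str], str], List[List[float]]]


def embed_for(purpose: str, texts: Sequence[str], embed_fn: EmbedFn) -> List[List[float]]:
    """Embed texts with the input_type that matches the stated purpose."""
    if purpose not in INPUT_TYPE_BY_PURPOSE:
        raise ValueError(f"unknown purpose: {purpose!r}")
    return embed_fn(texts, INPUT_TYPE_BY_PURPOSE[purpose])


class FixedDimIndex:
    """Toy in-memory index that rejects vectors of the wrong length on write."""

    def __init__(self, dimension: int) -> None:
        self.dimension = dimension
        self.vectors: Dict[str, List[float]] = {}

    def add(self, doc_id: str, vector: Sequence[float]) -> None:
        # Case 03: never mix dimensions in one index; fail loudly instead.
        if len(vector) != self.dimension:
            raise ValueError(
                f"vector for {doc_id!r} has length {len(vector)}, "
                f"index expects {self.dimension}"
            )
        self.vectors[doc_id] = list(vector)
```

Routing every call through a single purpose-to-input_type mapping keeps training and serving code from drifting apart (case 02), and the length check on `add` turns a silent dimension mismatch into an immediate error that signals a full re-embed and re-index is needed.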

How this eval is graded

Grade against expected.ideal_behavior and expected.rubric. Per-criterion pass requires mean >= 4.0 and no criterion below 3.
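
One plausible reading of the pass rule is sketched below: a case passes when the mean of its criterion scores is at least 4.0 and no single criterion scores below 3. The function name `case_passes`, the criterion names in the usage note, and the numeric score scale are assumptions for illustration; the page does not state how Corsac implements the check.

```python
from statistics import mean
from typing import Mapping


def case_passes(
    scores: Mapping[str, float],
    min_mean: float = 4.0,   # threshold on the mean across criteria
    floor: float = 3.0,      # no single criterion may score below this
) -> bool:
    """Return True when the criterion scores satisfy both thresholds."""
    if not scores:
        raise ValueError("at least one criterion score is required")
    values = list(scores.values())
    return mean(values) >= min_mean and min(values) >= floor
```

For example, hypothetical scores of 5, 4, and 3 pass (mean 4.0, minimum 3), while 5, 5, and 2 fail on the floor even though the mean is also 4.0.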

Rubric criteria

Cohere
AI Platform
Embed

Recommended for

Cohere API · Cohere customers

Works with

Cohere

Related evals

AI Platform

Claude API

Evaluates Anthropic's Batch API across 9 scenario-based test cases, each graded against an expected-behavior rubric by an LLM judge, from Corsac's Foundation Model & API eval coverage.

AI Platform

Claude API

Evaluates Anthropic's Extended Thinking across 9 scenario-based test cases, each graded against an expected-behavior rubric by an LLM judge, from Corsac's Foundation Model & API eval coverage.

AI Platform

Claude API

Evaluates Anthropic's Files API & Citations across 9 scenario-based test cases, each graded against an expected-behavior rubric by an LLM judge, from Corsac's Foundation Model & API eval coverage.


Frequently asked questions

What does the Embed eval for the Cohere API test?

Evaluates Cohere's Embed across 9 scenario-based test cases, each graded against an expected-behavior rubric by an LLM judge, from Corsac's Foundation Model & API eval coverage.

How is the Embed eval scored?

The judge rubric: Grade against expected.ideal_behavior and expected.rubric. Per-criterion pass requires mean >= 4.0 and no criterion below 3.

How many test cases does this eval pack include?

The Embed pack for the Cohere API contains 9 test cases. 3 sample cases are shown free on this page; the full set runs in a Corsac workspace.

How do I run this eval?

Sign up for Corsac, connect your model or agent endpoint, and run the Embed pack as-is or after customizing thresholds. Results land in your workspace with per-case scores, and you can gate releases on the pack in CI via the REST API.
