Embeddings, Image & Speech

Vercel AI SDK (`ai` npm package)

This eval tests Vercel AI SDK's embeddings, image, and speech features across 9 scenario-based test cases. An LLM judge grades each case against an expected-behavior rubric. It is part of Corsac's AI SDK eval coverage.

About Vercel AI SDK

Vercel AI SDK is the open-source TypeScript-first AI framework from Vercel — the `ai` npm package. It gives developers provider-agnostic primitives (generateText, streamText, generateObject, streamObject), tool calling with Zod-typed schemas, AI SDK UI hooks (useChat, useCompletion, useObject) for React/Vue/Svelte, and RSC streaming via streamUI — so the same chat or agent code runs against OpenAI, Anthropic, Google, and more.

Employees: ~500

Industry: AI Framework / SDK

Headquarters: San Francisco, CA

Website: ai-sdk.dev

Sample tests · showing 3 of 9

#	Input	Expected behavior	Check
01	Operator embeds 5000 chunks by calling `embed({ model, value })` in a Promise.all loop and hits per-request rate limits.	Use `embedMany({ model, values: chunks, maxParallelCalls })`: it batches into provider-allowed group sizes and parallelizes within the per-provider limit. The SDK manages batch boundaries; the integrator does not need a custom chunker. Returns embeddings[] in input order plus aggregate usage.	Pass / Fail · AI Platform · high
02	Operator hand-rolls cosine similarity over embedding[] with a subtle bug (forgot to normalize).	Use the SDK's `cosineSimilarity(a, b)` helper from 'ai'. It handles normalization correctly. Hand-rolled implementations are a frequent source of off-by-norm bugs; prefer the SDK helper for any production retrieval.	Pass / Fail · AI Platform · medium
03	Operator stores embeddings from text-embedding-3-small (1536-dim) and later switches to text-embedding-3-large (3072-dim) without re-indexing.	Embedding dimension is model-specific. Store the model id alongside vectors and refuse cross-model similarity. On model swap, re-embed the corpus before switching reads. Use the SDK's per-model dimension info (where available) to validate at write time.	Pass / Fail · AI Platform · critical
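The expected behaviors in cases 02 and 03 can be illustrated with a minimal, dependency-free sketch. It is not the SDK's implementation: `StoredVector`, `VectorIndex`, and `cosine` are hypothetical names introduced here, and the index is an in-memory array rather than a real vector store. The sketch tags every vector with the id of the model that produced it, validates dimension at write time, refuses cross-model queries, and divides the dot product by both norms so unnormalized vectors still score correctly.

```typescript
// Hypothetical record shape: each vector carries the id of the model that produced it.
interface StoredVector {
  id: string;
  modelId: string;
  values: number[];
}

// Reference cosine similarity: dot product divided by the product of both norms.
// In an app that already depends on `ai`, the SDK's `cosineSimilarity` helper
// would normally be used instead of a hand-written version like this one.
function cosine(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error(`Dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero vector has no direction; treat its similarity as 0.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Hypothetical in-memory index bound to exactly one embedding model.
class VectorIndex {
  private readonly modelId: string;
  private readonly dimension: number;
  private readonly rows: StoredVector[] = [];

  constructor(modelId: string, dimension: number) {
    this.modelId = modelId;
    this.dimension = dimension;
  }

  // Write-time validation: reject vectors from another model or of the wrong size.
  add(row: StoredVector): void {
    if (row.modelId !== this.modelId) {
      throw new Error(`Index is bound to ${this.modelId}, got ${row.modelId}`);
    }
    if (row.values.length !== this.dimension) {
      throw new Error(`Expected ${this.dimension} dimensions, got ${row.values.length}`);
    }
    this.rows.push(row);
  }

  // Read-time guard: refuse to compare a query embedded with a different model.
  search(
    query: { modelId: string; values: number[] },
    topK: number,
  ): { id: string; score: number }[] {
    if (query.modelId !== this.modelId) {
      throw new Error(`Cross-model similarity refused: ${query.modelId} vs ${this.modelId}`);
    }
    return this.rows
      .map((row) => ({ id: row.id, score: cosine(query.values, row.values) }))
      .sort((x, y) => y.score - x.score)
      .slice(0, topK);
  }
}
```

Under these assumptions, a model swap means building a second `VectorIndex` for the new model id, re-embedding the corpus into it, and only then pointing reads at it. The old index keeps rejecting queries embedded with the new model in the meantime.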

How this eval is graded

Grade against expected.ideal_behavior and expected.rubric. Per-criterion pass requires mean >= 4.0 and no criterion below 3.
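One way to read the pass rule is sketched below. This is an interpretation, not Corsac's grader: it assumes the judge returns one numeric score per rubric criterion, that the mean is taken across those criteria, and that "no criterion below 3" is a hard floor. The names `CriterionScore` and `casePasses` are hypothetical.

```typescript
// Hypothetical shape: one judge score per rubric criterion.
interface CriterionScore {
  criterion: string;
  score: number;
}

// A case passes when the mean score is at least `meanThreshold`
// and every individual criterion is at least `floor`.
function casePasses(
  scores: CriterionScore[],
  meanThreshold: number = 4.0,
  floor: number = 3,
): boolean {
  // With no scores there is nothing to grade; treat that as a fail.
  if (scores.length === 0) return false;
  const total = scores.reduce((sum, s) => sum + s.score, 0);
  const mean = total / scores.length;
  const lowest = Math.min(...scores.map((s) => s.score));
  return mean >= meanThreshold && lowest >= floor;
}
```

Under this reading, scores of 5, 5, and 2 fail despite a mean of 4.0, because one criterion falls below the floor.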

Rubric criteria

Vercel AI SDK
AI Platform
Embeddings Image Speech

Recommended for

Vercel AI SDK (`ai` npm package) customers

Works with

Vercel AI SDK

Related evals

Code Assistant

Browserbase

Evaluates Browserbase's Captcha Handling across scenario-based test cases, each graded against an expected-behavior rubric by an LLM judge, from Corsac's Browser infrastructure eval coverage.

Code Assistant

Browserbase

Evaluates Browserbase's Concurrency & Rate Limits across scenario-based test cases, each graded against an expected-behavior rubric by an LLM judge, from Corsac's Browser infrastructure eval coverage.

Code Assistant

Browserbase

Evaluates Browserbase's Live Debugging & Session Inspector across scenario-based test cases, each graded against an expected-behavior rubric by an LLM judge, from Corsac's Browser infrastructure eval coverage.

Frequently asked questions

What does the Embeddings Image Speech eval for Vercel AI SDK (`ai` npm package) test?

It tests Vercel AI SDK's embeddings, image, and speech features across 9 scenario-based test cases. An LLM judge grades each case against an expected-behavior rubric. The pack is part of Corsac's AI SDK eval coverage.

How is the Embeddings Image Speech eval scored?

The judge rubric: Grade against expected.ideal_behavior and expected.rubric. Per-criterion pass requires mean >= 4.0 and no criterion below 3.

How many test cases does this eval pack include?

The Embeddings Image Speech pack for Vercel AI SDK (`ai` npm package) contains 9 test cases. Three sample cases are shown free on this page; the full set runs in a Corsac workspace.

How do I run this eval?

Sign up for Corsac, connect your model or agent endpoint, and run the Embeddings Image Speech pack as-is or after customizing thresholds. Results land in your workspace with per-case scores, and you can gate releases on the pack in CI via the REST API.
