For Vercel AI SDK · Code Assistant · AI Platform

Core Generate And Stream Text

Vercel AI SDK (`ai` npm package) · Vercel AI SDK

AI SDK — Vercel AI SDK

Evaluates the Vercel AI SDK's core text APIs, generateText and streamText, across 9 scenario-based test cases. An LLM judge grades each case against an expected-behavior rubric. Part of Corsac's AI SDK eval coverage.

About Vercel AI SDK

Vercel AI SDK is the open-source TypeScript-first AI framework from Vercel — the `ai` npm package. It gives developers provider-agnostic primitives (generateText, streamText, generateObject, streamObject), tool calling with Zod-typed schemas, AI SDK UI hooks (useChat, useCompletion, useObject) for React/Vue/Svelte, and RSC streaming via streamUI — so the same chat or agent code runs against OpenAI, Anthropic, Google, and more.

Employees: ~500

Industry: AI Framework / SDK

Headquarters: San Francisco, CA

Website: ai-sdk.dev

Sample tests · showing 3 of 9

#	Input	Expected behavior	Check
01	Operator passes both `prompt: 'summarize this'` and `messages: [{role:'user', content:'hi'}]` to generateText, expecting messages to win.	Pass either `prompt` (shorthand for a single user turn) or `messages` (multi-turn), not both. Per the docs they are mutually exclusive, and the SDK throws InvalidPromptError when both are set. Pick one and migrate the other content into it.	Pass / Fail · AI Platform · high
02	User closes the browser tab mid-stream; the Next.js route handler's request.signal aborts. streamText was called with `abortSignal: request.signal`.	Propagating request.signal to streamText closes the upstream provider HTTP connection so generation stops and billing for remaining tokens halts (per provider). Verify by reading onFinish.usage and confirming finishReason indicates cancellation rather than a full completion.	Pass / Fail · AI Platform · high
03	Operator places system instructions as `messages[0] = { role:'system', content:'You are a helpful assistant.' }` and also passes `system: 'Be concise.'` to streamText.	Pass system instructions via the top-level `system` parameter; the SDK merges it as the leading system message. Do not duplicate by also placing a system role in messages[]: providers either reject duplicates (Anthropic) or concatenate them in undefined order. Pick one location.	Pass / Fail · AI Platform · high
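
Sample cases 01 and 03 both come down to giving the SDK a single source for each input. The following is a minimal sketch of a pre-flight check, assuming a hypothetical `checkCallOptions` helper and a simplified `CallOptions` type. Neither is part of the `ai` package; the sketch only inspects the option names quoted in the table (`prompt`, `messages`, `system`) and makes no SDK calls.

```typescript
// Hypothetical pre-flight check; NOT part of the `ai` package.
type Role = 'system' | 'user' | 'assistant';

interface Message {
  role: Role;
  content: string;
}

// Simplified subset of the call options named in the sample cases.
interface CallOptions {
  prompt?: string;
  messages?: Message[];
  system?: string;
}

// Returns a list of human-readable problems; an empty list means no issue found.
function checkCallOptions(options: CallOptions): string[] {
  const issues: string[] = [];

  // Case 01: `prompt` and `messages` are mutually exclusive.
  if (options.prompt !== undefined && options.messages !== undefined) {
    issues.push('pass either `prompt` or `messages`, not both');
  }

  // Case 03: system instructions should live in one place only.
  const hasSystemMessage = (options.messages ?? []).some(
    (message) => message.role === 'system',
  );
  if (options.system !== undefined && hasSystemMessage) {
    issues.push('system instructions appear in both `system` and `messages`');
  }

  return issues;
}

// Inputs taken from sample cases 01 and 03.
const case01 = checkCallOptions({
  prompt: 'summarize this',
  messages: [{ role: 'user', content: 'hi' }],
});
const case03 = checkCallOptions({
  system: 'Be concise.',
  messages: [{ role: 'system', content: 'You are a helpful assistant.' }],
});
console.log(case01, case03);
```

A check like this could run before the options reach generateText or streamText, so that a conflicting call fails early with a readable message instead of depending on SDK or provider behavior.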

How this eval is graded

Grade against `expected.ideal_behavior` and `expected.rubric`. A case passes when the mean score across rubric criteria is at least 4.0 and no individual criterion scores below 3.
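
The pass rule can be sketched as a small function. This is an illustration only: the `casePasses` name is hypothetical, the input is assumed to be one numeric judge score per rubric criterion, and the scoring scale is not stated on this page.

```typescript
// Hypothetical helper illustrating the stated thresholds; not Corsac's implementation.
// `scores` holds one judge score per rubric criterion for a single test case.
function casePasses(
  scores: number[],
  minMean = 4.0,
  minCriterion = 3,
): boolean {
  // With no graded criteria there is nothing to pass (an assumption of this sketch).
  if (scores.length === 0) return false;

  const mean = scores.reduce((sum, score) => sum + score, 0) / scores.length;

  // Both conditions must hold: a high enough mean and no criterion under the floor.
  return mean >= minMean && scores.every((score) => score >= minCriterion);
}

console.log(casePasses([5, 4, 3])); // mean 4.0, lowest 3
console.log(casePasses([5, 5, 2])); // mean 4.0, but one criterion below 3
console.log(casePasses([4, 4, 3])); // mean below 4.0
```

The floor on individual criteria means a single weak criterion fails the case even when the mean is high.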

Rubric criteria

Vercel AI SDK
AI Platform
Core Generate And Stream Text

Recommended for

Vercel AI SDK (`ai` npm package) customers

Works with

Vercel AI SDK

Related evals

Code Assistant

Browserbase

Evaluates Browserbase's Captcha Handling across scenario-based test cases, each graded against an expected-behavior rubric by an LLM judge, from Corsac's Browser infrastructure eval coverage.

Code Assistant

Browserbase

Evaluates Browserbase's Concurrency & Rate Limits across scenario-based test cases, each graded against an expected-behavior rubric by an LLM judge, from Corsac's Browser infrastructure eval coverage.

Code Assistant

Browserbase

Evaluates Browserbase's Live Debugging & Session Inspector across scenario-based test cases, each graded against an expected-behavior rubric by an LLM judge, from Corsac's Browser infrastructure eval coverage.

Frequently asked questions

What does the Core Generate And Stream Text eval for Vercel AI SDK (`ai` npm package) test?

It evaluates the Vercel AI SDK's core text APIs, generateText and streamText, across 9 scenario-based test cases. An LLM judge grades each case against an expected-behavior rubric. The pack is part of Corsac's AI SDK eval coverage.

How is the Core Generate And Stream Text eval scored?

The judge grades against `expected.ideal_behavior` and `expected.rubric`. A case passes when the mean score across rubric criteria is at least 4.0 and no individual criterion scores below 3.

How many test cases does this eval pack include?

The Core Generate And Stream Text pack for Vercel AI SDK (`ai` npm package) contains 9 test cases. 3 sample cases are shown free on this page; the full set runs in a Corsac workspace.

How do I run this eval?

Sign up for Corsac, connect your model or agent endpoint, and run the Core Generate And Stream Text pack as-is or after customizing thresholds. Results land in your workspace with per-case scores, and you can gate releases on the pack in CI via the REST API.
