For Vercel AI SDK · Code Assistant · AI Platform

Core Generate And Stream Text

Vercel AI SDK evals — Core: generateText / streamText (relift v3 InfraRed)

About Vercel AI SDK

Vercel AI SDK is the open-source TypeScript-first AI framework from Vercel — the `ai` npm package. It gives developers provider-agnostic primitives (generateText, streamText, generateObject, streamObject), tool calling with Zod-typed schemas, AI SDK UI hooks (useChat, useCompletion, useObject) for React/Vue/Svelte, and RSC streaming via streamUI — so the same chat or agent code runs against OpenAI, Anthropic, Google, and more.

Employees: ~500

Industry: AI Framework / SDK

Headquarters: San Francisco, CA

Website: ai-sdk.dev

Sample tests · showing 3 of 9

Each test lists its input, the expected behavior, and the check.
01

Input: The operator passes both `prompt: 'summarize this'` and `messages: [{role:'user', content:'hi'}]` to generateText, expecting `messages` to win.

Expected behavior: Pass either `prompt` (shorthand for a single user turn) or `messages` (multi-turn), but not both. Per the docs they are mutually exclusive, and the SDK throws InvalidPromptError when both are set. Pick one and migrate the other's content into it.

Check: Pass / Fail · AI Platform · high
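
The rule in test 01 can be enforced before the call is made. The following is a minimal sketch, not part of the SDK: `toCallInput`, `ChatMessage`, and `CallInput` are hypothetical names declared locally, and appending the stray `prompt` as a final user turn is only one possible way to "migrate the other content".

```typescript
// Local stand-in for a chat message; declared here, not imported from `ai`.
type Role = "system" | "user" | "assistant";
interface ChatMessage {
  role: Role;
  content: string;
}

// The result carries exactly one of the two options.
type CallInput = { prompt: string } | { messages: ChatMessage[] };

// Hypothetical helper: if both options are supplied, fold `prompt` into
// `messages` as a trailing user turn so only `messages` is forwarded.
function toCallInput(opts: {
  prompt?: string;
  messages?: ChatMessage[];
}): CallInput {
  const { prompt, messages } = opts;
  if (prompt !== undefined && messages !== undefined) {
    return { messages: [...messages, { role: "user", content: prompt }] };
  }
  if (messages !== undefined) return { messages };
  if (prompt !== undefined) return { prompt };
  throw new Error("Provide either `prompt` or `messages`.");
}

const input = toCallInput({
  prompt: "summarize this",
  messages: [{ role: "user", content: "hi" }],
});
```

Here `input` holds only `messages`, so it can be spread into the generateText options without tripping the mutual-exclusion check.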
02

Input: generateText returns finishReason='length' on a long synthesis task.

Expected behavior: Inspect finishReason, which is one of 'stop', 'length', 'tool-calls', 'content-filter', 'error', 'other', or 'unknown'. On 'length', treat the result as truncated: either surface it to the caller as partial, or issue a continuation call with the prior text appended. Never present a length-truncated answer as complete.

Check: Pass / Fail · AI Platform · critical
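
One way to keep a truncated answer from being presented as complete is to classify the result before returning it. This is a rough sketch under stated assumptions: `TextResult` is a locally declared subset of what generateText returns, and `Answer` and `toAnswer` are hypothetical names, not SDK exports. The continuation call itself is left out.

```typescript
// Finish reasons as listed in the expected behavior for test 02.
type FinishReason =
  | "stop"
  | "length"
  | "tool-calls"
  | "content-filter"
  | "error"
  | "other"
  | "unknown";

// Structural subset of a generateText result; declared locally.
interface TextResult {
  text: string;
  finishReason: FinishReason;
}

// Hypothetical wrapper that makes truncation explicit to the caller.
type Answer =
  | { status: "complete"; text: string }
  | { status: "truncated"; partialText: string }
  | { status: "incomplete"; reason: FinishReason; text: string };

function toAnswer(result: TextResult): Answer {
  switch (result.finishReason) {
    case "stop":
      return { status: "complete", text: result.text };
    case "length":
      // Token limit hit: the text is partial and must be labeled as such.
      return { status: "truncated", partialText: result.text };
    default:
      // Tool calls, filtering, errors, and unknown reasons are not complete answers.
      return { status: "incomplete", reason: result.finishReason, text: result.text };
  }
}
```

A caller that receives `status: "truncated"` can either return the partial text with that label or issue a follow-up call with `partialText` appended to the conversation.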
03

Input: The operator reads `result.usage` from streamText before the stream has finished and logs 0 prompt tokens to telemetry.

Expected behavior: On streamText, `usage` is a Promise that resolves after the stream completes. Await `result.usage`, or read `usage` in the onFinish callback, before logging. Do not read it synchronously before iterating the stream.

Check: Pass / Fail · AI Platform · medium
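
The ordering that test 03 asks for can be sketched as below. `StreamResultLike` is a locally declared shape that assumes only the `textStream` and `usage` properties of a streamText result, and `consumeThenLogUsage` is a hypothetical helper. The usage type is left generic because token field names differ between SDK major versions.

```typescript
// Structural subset of a streamText result; declared locally, not imported.
interface StreamResultLike<U> {
  textStream: AsyncIterable<string>;
  usage: PromiseLike<U>;
}

// Hypothetical helper: drain the stream first, then await usage and log it.
async function consumeThenLogUsage<U>(
  result: StreamResultLike<U>,
  log: (usage: U) => void,
): Promise<string> {
  let text = "";
  for await (const chunk of result.textStream) {
    text += chunk;
  }
  // Awaited only after iteration, so the logged counts are final.
  const usage = await result.usage;
  log(usage);
  return text;
}
```

Logging inside the onFinish callback is an alternative that avoids holding on to the result object; the helper above is the variant for code that already iterates the stream itself.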

How this eval is graded

Grade against expected.ideal_behavior and expected.rubric. Per-criterion pass requires mean >= 4.0 and no criterion below 3.
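
The threshold rule can be read as a small predicate. The sketch below assumes one numeric score per rubric criterion and interprets the rule as "the mean across criteria is at least 4.0 and no single criterion scores below 3". The function name `passesRubric` is hypothetical, and the scoring scale is not stated in the source.

```typescript
// Hypothetical check: one score per rubric criterion.
function passesRubric(scores: number[]): boolean {
  if (scores.length === 0) return false; // nothing graded, nothing passes
  const mean = scores.reduce((sum, s) => sum + s, 0) / scores.length;
  const noneBelowFloor = scores.every((s) => s >= 3);
  return mean >= 4.0 && noneBelowFloor;
}
```

Under this reading, scores of 5, 4, and 3 pass (mean 4.0, floor met), while 5, 5, and 2 fail on the floor even though the mean is 4.0.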

Rubric criteria

  • Vercel AI SDK
  • AI Platform
  • Core Generate And Stream Text

Recommended for

Vercel AI SDK (`ai` npm package) · Vercel AI SDK customers
