For Vercel AI SDK · Code Assistant · AI Platform

Middleware Telemetry Safety

Vercel AI SDK evals — Middleware, Telemetry & Safety (relift v3 InfraRed)

About Vercel AI SDK

Vercel AI SDK is the open-source, TypeScript-first AI framework from Vercel, published as the `ai` npm package. It gives developers provider-agnostic primitives (`generateText`, `streamText`, `generateObject`, `streamObject`), tool calling with Zod-typed schemas, AI SDK UI hooks (`useChat`, `useCompletion`, `useObject`) for React/Vue/Svelte, and RSC streaming via `streamUI`, so the same chat or agent code runs against OpenAI, Anthropic, Google, and other providers.

Employees: ~500

Industry: AI Framework / SDK

Headquarters: San Francisco, CA

Website: ai-sdk.dev

Sample tests · showing 3 of 10

# · Input · Expected behavior · Check
01

Operator wants every call to enforce maxTokens<=2000 regardless of caller. They mutate the params arg in-place inside transformParams.

Middleware transformParams({params}) must return a new params object (immutable transform). Returning undefined or mutating in place leads to inconsistent provider state across concurrent calls. Use `return { ...params, maxTokens: Math.min(params.maxTokens ?? 2000, 2000) }`.

Pass / Fail · AI Platform · high
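The immutable transform expected in test 01 can be sketched as below. This is a minimal illustration, not the SDK's own code: `CallParams` is a local stand-in for the SDK's call-options type, and `capMaxTokens`, `capTokensMiddleware`, and `MAX_TOKENS_CAP` are hypothetical names. Depending on the SDK version, the field may be called `maxOutputTokens` rather than `maxTokens`.

```typescript
// Local stand-in for the SDK's call-options type (assumption: the real type
// has many more fields; only maxTokens matters for this sketch).
type CallParams = { maxTokens?: number; [key: string]: unknown };

// Hypothetical operator-wide ceiling taken from the test description.
const MAX_TOKENS_CAP = 2000;

// Pure helper: builds a new object and never writes to the caller's params.
function capMaxTokens(params: Readonly<CallParams>): CallParams {
  return {
    ...params,
    maxTokens: Math.min(params.maxTokens ?? MAX_TOKENS_CAP, MAX_TOKENS_CAP),
  };
}

// Middleware-shaped object: transformParams receives { params } and
// resolves to the transformed copy.
const capTokensMiddleware = {
  transformParams: async ({ params }: { params: CallParams }): Promise<CallParams> =>
    capMaxTokens(params),
};
```

In an application, an object like `capTokensMiddleware` would be passed as the `middleware` option of the SDK's `wrapLanguageModel`, so every caller of the wrapped model gets the cap without changing call sites.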
02

Operator expects per-call traces in Vercel Observability but the dashboard is empty — no `experimental_telemetry` option was set.

`experimental_telemetry` is opt-in. Pass `experimental_telemetry: { isEnabled: true, functionId: 'chat-route', metadata: { userId } }` on each generate*/stream* call, and configure an OpenTelemetry SDK at boot; without a registered tracer provider the spans are not exported anywhere. On Vercel this is typically done by calling `registerOTel` from `@vercel/otel` in `instrumentation.ts`.

Pass / Fail · AI Platform · medium
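Because telemetry must be switched on per call, one way to avoid forgetting it is a small helper that builds the settings object. The sketch below assumes a local `TelemetrySettings` type covering only the fields named in test 02; `telemetryFor` is a hypothetical helper, not part of the SDK.

```typescript
// Local subset of the telemetry settings shape described in test 02
// (assumption: the SDK's own type has additional optional fields).
type TelemetrySettings = {
  isEnabled: boolean;
  functionId?: string;
  metadata?: Record<string, string | number | boolean>;
};

// Hypothetical helper: returns opt-in telemetry settings for one route/user.
function telemetryFor(functionId: string, userId: string): TelemetrySettings {
  return { isEnabled: true, functionId, metadata: { userId } };
}

// Intended use inside a route handler (not executed here, since it needs the
// `ai` package and a configured model):
//
//   const result = await generateText({
//     model,
//     prompt,
//     experimental_telemetry: telemetryFor('chat-route', userId),
//   });
```

Centralizing the settings in one function also gives a single place to change `functionId` naming or metadata keys later.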
03

Operator enables experimental_telemetry with default `recordInputs:true, recordOutputs:true` — chat transcripts including PHI ship to a third-party observability vendor.

Set `recordInputs: false, recordOutputs: false` by default for any tenant whose transcripts may contain PII/PHI. Span attributes still capture model id, token usage, latency, and errors without prompt text. If you must record, route to a compliance-vetted exporter and confirm that the data falls within the scope of the vendor's BAA/DPA.

Pass / Fail · AI Platform · critical
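The fail-closed default from test 03 might look like the sketch below. `TenantPolicy` and its `allowTranscriptRecording` flag are hypothetical and stand in for whatever compliance signal an application actually has. The `recordInputs` and `recordOutputs` fields are the ones named in the test.

```typescript
// Local subset of the telemetry settings shape (assumption: mirrors the
// fields named in the test cases, not the SDK's full type).
type TelemetrySettings = {
  isEnabled: boolean;
  functionId?: string;
  recordInputs?: boolean;
  recordOutputs?: boolean;
};

// Hypothetical per-tenant policy; recording is allowed only on explicit opt-in.
type TenantPolicy = { allowTranscriptRecording?: boolean };

// Fail closed: prompts and completions are recorded only when the tenant
// policy explicitly allows it. Spans are still emitted either way.
function telemetryForTenant(functionId: string, tenant: TenantPolicy): TelemetrySettings {
  const record = tenant.allowTranscriptRecording === true;
  return { isEnabled: true, functionId, recordInputs: record, recordOutputs: record };
}
```

Treating a missing flag as "do not record" means a newly onboarded tenant never ships transcripts to an exporter by accident.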

How this eval is graded

Grade against `expected.ideal_behavior` and `expected.rubric`. A pass requires a mean score of at least 4.0 across the rubric criteria, with no individual criterion scoring below 3.
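The pass rule can be written as a small check. This sketch assumes the mean is taken across per-criterion scores for one test, which is one reading of the rule; the scoring scale is not stated in the source, and `passesRubric` is a hypothetical name.

```typescript
// Returns true when the mean score meets the threshold and no single
// criterion falls below the floor. An empty score list never passes.
function passesRubric(scores: number[], minMean = 4.0, floor = 3): boolean {
  if (scores.length === 0) return false;
  const mean = scores.reduce((sum, s) => sum + s, 0) / scores.length;
  return mean >= minMean && scores.every((s) => s >= floor);
}
```

Note that the two conditions are independent: scores of 5, 5, and 2 have a mean of 4.0 but still fail because one criterion is below the floor.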

Rubric criteria

  • Vercel AI SDK
  • AI Platform
  • Middleware, Telemetry & Safety

Recommended for

Vercel AI SDK (`ai` npm package) · Vercel AI SDK customers
