Eval Library
For Vapi · AI Platform

Squad And Workflow Routing

Voice AI Orchestration — Vapi

Vapi evals — Squad & Workflow Routing (relift v3 InfraRed)

About Vapi

Vapi is a voice-AI orchestration platform that wires speech-to-text, an LLM, and text-to-speech into low-latency phone and web voice agents, with interruption handling, mid-call function calling, transfers, recordings, and telephony routing.

Employees

~50

Industry

Voice AI Orchestration

Headquarters

San Francisco, CA

Website

vapi.ai

Sample tests · showing 3 of 9

# · Input · Expected behavior · Check
01

Operator creates a squad with three assistants: triage, billing, tech-support. They omit assistantDestinations on triage. Triage never hands off.

Each squad member declares assistantDestinations[] with destination assistant id, description (why transfer here), message (what triage says before handoff), and transferMode. Without assistantDestinations the member is a leaf — no outbound transfer. Round-trip GET /squad/{id} to verify.

Pass / Fail · AI Platform · high
02

Triage has destinations 'billing' (desc: 'for billing questions') and 'collections' (desc: 'for billing questions'). Customers asking about refunds land randomly on either.

Descriptions must be distinct enough for the routing model to decide. Use specific phrasing: billing → 'invoice questions, charge disputes'; collections → 'past-due balance and payment plans.' Test routing with a held-out scenario set and inspect transferReason in artifact.messages[].

Pass / Fail · AI Platform · medium
03

Triage transfers to billing. Billing assistant asks the customer to repeat their order number, even though they just gave it to triage.

Vapi propagates the in-call transcript / messages[] state to the destination assistant; configure the destination's system prompt to summarize prior context rather than restart. If full transcript carry-over is unclear [REQUIRES-VERIFICATION], pass key facts via messageResponse or dynamic variables…

Pass / Fail · AI Platform · high
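
Tests 01 and 02 describe two configuration mistakes that can be caught before any call is placed: a member with no assistantDestinations, and destinations whose descriptions are indistinguishable. The following is a minimal Python sketch of such a lint, not Vapi's own tooling. It works on a plain dict and makes no API calls. The `members`, `name`, and `assistantName` keys are assumptions made for illustration and should be checked against the payload actually returned by GET /squad/{id}.

```python
from collections import defaultdict


def lint_squad(squad: dict) -> list[str]:
    """Return human-readable findings for a squad config dict.

    Assumed (hypothetical) shape: squad["members"] is a list of dicts, each
    with a "name" and an optional "assistantDestinations" list whose entries
    carry "assistantName" and "description".
    """
    findings = []
    for member in squad.get("members", []):
        name = member.get("name", "<unnamed>")
        destinations = member.get("assistantDestinations") or []
        if not destinations:
            # Test 01: a member without destinations is a leaf.
            findings.append(
                f"{name}: no assistantDestinations (leaf, cannot transfer out)"
            )
            continue

        # Test 02: group destinations by normalized description text.
        by_description = defaultdict(list)
        for dest in destinations:
            key = " ".join(dest.get("description", "").lower().split())
            by_description[key].append(dest.get("assistantName", "<unknown>"))
        for description, targets in by_description.items():
            if len(targets) > 1:
                findings.append(
                    f"{name}: destinations {targets} share description {description!r}"
                )
    return findings


# Illustrative config mirroring the scenario in test 02.
EXAMPLE_SQUAD = {
    "members": [
        {
            "name": "triage",
            "assistantDestinations": [
                {"assistantName": "billing", "description": "for billing questions"},
                {"assistantName": "collections", "description": "for billing questions"},
            ],
        },
        {"name": "billing"},
        {"name": "collections"},
    ]
}

for finding in lint_squad(EXAMPLE_SQUAD):
    print(finding)
```

Leaf members are reported rather than rejected, since billing and tech-support are legitimately leaves; the operator only needs to confirm that the entry assistant is not among them. An exact-match check on descriptions catches the obvious case only, so the held-out routing scenarios mentioned in test 02 are still needed for descriptions that differ in wording but overlap in meaning.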

How this eval is graded

Grade each response against expected.ideal_behavior and expected.rubric. A test passes when the mean score across rubric criteria is >= 4.0 and no single criterion scores below 3.
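
The pass rule can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the grader's actual implementation: it reads the rule as a mean taken across the rubric criteria of one test, and the function name, parameter names, and criterion keys are hypothetical. The scoring scale is not stated in this page, so the sketch only applies the two thresholds.

```python
from statistics import mean


def passes(scores: dict[str, float],
           mean_threshold: float = 4.0,
           floor: float = 3.0) -> bool:
    """Apply the two-part rule: mean >= threshold and no score below floor.

    `scores` maps a criterion name to its score for a single test.
    An empty score set is treated as a failure (an assumption).
    """
    if not scores:
        return False
    values = list(scores.values())
    return mean(values) >= mean_threshold and min(values) >= floor


# Hypothetical criterion names, for illustration only.
print(passes({"correct_config": 5, "verification": 4, "clarity": 3}))
print(passes({"correct_config": 5, "verification": 5, "clarity": 2}))
```

The floor matters because a high mean can hide a single failing criterion: the second call has the same mean as the first but fails on the criterion that scored 2.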

Rubric criteria

  • Vapi
  • AI Platform
  • Squad And Workflow Routing

Recommended for

Vapi customers
