For Vapi · AI Platform

Tools And Mid Call Function Calling

Voice AI Orchestration — Vapi

Vapi evals — Tools / Function Calling Mid-call (relift v3 InfraRed)

About Vapi

Vapi is a voice-AI orchestration platform that wires speech-to-text, an LLM, and text-to-speech into low-latency phone and web voice agents, with interruption handling, mid-call function calling, transfers, recordings, and telephony routing.

Employees: ~50
Industry: Voice AI Orchestration
Headquarters: San Francisco, CA
Website: vapi.ai

Sample tests · showing 3 of 9

# | Input | Expected behavior | Check
01

The operator declares a tool with name='check_balance' but no parameters JSON schema. The model then emits tool calls with inconsistent argument shapes.

Define every tool with the documented OpenAI-style schema {name, description, parameters: {type:'object', properties, required}}. The model uses the JSON Schema to format arguments; without it, the argument JSON is ad hoc and breaks serverUrl handlers.

Pass / Fail · AI Platform · critical
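As an illustration of test 01, the sketch below shows one way a tool definition in the {name, description, parameters} shape could look, plus a small pre-deploy check. The `account_id` property and the `schema_problems` helper are hypothetical names introduced here and are not part of Vapi's API.

```python
# Hypothetical tool definition in the {name, description, parameters} shape.
# The account_id property is an illustrative assumption.
check_balance_tool = {
    "name": "check_balance",
    "description": "Look up the current balance for a customer account.",
    "parameters": {
        "type": "object",
        "properties": {
            "account_id": {
                "type": "string",
                "description": "Customer account identifier.",
            },
        },
        "required": ["account_id"],
    },
}


def schema_problems(tool):
    """Return a list of problems with a tool definition (empty if none found)."""
    problems = []
    for key in ("name", "description", "parameters"):
        if key not in tool:
            problems.append(f"missing '{key}'")
    params = tool.get("parameters")
    if isinstance(params, dict):
        if params.get("type") != "object":
            problems.append("parameters.type must be 'object'")
        props = params.get("properties")
        if not isinstance(props, dict):
            problems.append("parameters.properties must be an object")
            props = {}
        # Every required argument must also be declared under properties.
        for name in params.get("required", []):
            if name not in props:
                problems.append(f"required field '{name}' not in properties")
    elif params is not None:
        problems.append("parameters must be an object")
    return problems
```

Running a check like this before deploying an assistant would flag the failing case in this test, where the tool carries only a name.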
02

Tool 'send_followup_sms' is long-running and the operator does not want to block the assistant. They mark it async:true.

async:true makes Vapi continue the conversation without waiting for a result. The operator's serverUrl handler should still return 2xx promptly and run the work in the background. Do not use async:true for tools whose result must be spoken, because the assistant won't see the response.

Pass / Fail · AI Platform · high
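The acknowledge-then-work pattern from test 02 can be sketched without any web framework. Everything below is an assumption made for illustration: the argument names `phone` and `text`, the `accept_async_tool_call` helper, and the `{"status": "accepted"}` body are not taken from Vapi's documentation. Only the `async` flag on the tool comes from the test itself.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Tool marked async, as in the test above. Parameter names are hypothetical.
send_followup_sms_tool = {
    "name": "send_followup_sms",
    "description": "Send a follow-up SMS after the call.",
    "async": True,
    "parameters": {
        "type": "object",
        "properties": {"phone": {"type": "string"}, "text": {"type": "string"}},
        "required": ["phone", "text"],
    },
}

# Shared pool for background jobs; the size is an arbitrary choice for this sketch.
_pool = ThreadPoolExecutor(max_workers=4)


def send_followup_sms(phone, text):
    """Stand-in for a slow SMS provider call (hypothetical)."""
    time.sleep(0.5)
    return f"sent to {phone}"


def accept_async_tool_call(arguments):
    """Queue the slow work and return (status, body, future) without waiting."""
    future = _pool.submit(send_followup_sms, arguments["phone"], arguments["text"])
    # Acknowledge immediately; the body carries no result for the assistant to speak.
    return 200, {"status": "accepted"}, future
```

In a real serverUrl handler the returned status and body would be written to the HTTP response, and the future would be kept only for logging or retries.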
03

Customer asks 'what's my balance?' Tool round-trips in 1.5s and customer hears dead air while the assistant waits.

Set blocking:false (the default) and configure messages with type 'request-start' so the assistant speaks a filler phrase ('one sec, checking that') during tool execution. Use 'request-complete' and 'request-failed' messages to bracket the result.

Pass / Fail · AI Platform · medium
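For test 03, a tool's messages list might look roughly like the sketch below. The three `type` values and the `blocking` flag come from the expected behavior above. The `content` field name, the placement of `blocking` on the request-start message, and the `missing_message_types` helper are assumptions to verify against the Vapi API reference.

```python
# Hypothetical messages for the check_balance tool.
# "content" and the placement of "blocking" are assumptions, not confirmed API.
balance_tool_messages = [
    {"type": "request-start", "content": "One sec, checking that.", "blocking": False},
    {"type": "request-complete", "content": "Okay, I have your balance."},
    {"type": "request-failed", "content": "Sorry, I couldn't look that up right now."},
]

# Message types that bracket a tool call, per the expected behavior above.
REQUIRED_TYPES = ("request-start", "request-complete", "request-failed")


def missing_message_types(messages):
    """Return the bracket message types that are absent, in REQUIRED_TYPES order."""
    present = {m.get("type") for m in messages}
    return [t for t in REQUIRED_TYPES if t not in present]
```

A configuration that defines only the filler phrase would be reported as missing both the completion and the failure message.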

How this eval is graded

Grade against expected.ideal_behavior and expected.rubric. A pass requires a mean score across criteria of at least 4.0, with no single criterion below 3.
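One reading of this threshold rule, written as a short function. It assumes the mean is taken across the rubric criteria of a single test case and that scores are numeric. The scoring scale is not stated on this page, and `passes` is a hypothetical name.

```python
def passes(criterion_scores):
    """Apply the rule: mean of scores >= 4.0 and no single score below 3."""
    if not criterion_scores:
        raise ValueError("at least one criterion score is required")
    mean = sum(criterion_scores) / len(criterion_scores)
    return mean >= 4.0 and min(criterion_scores) >= 3
```

Under this reading, scores of 5, 5, 2 fail despite a mean of 4.0 because one criterion falls below 3, and scores of 4, 4, 3 fail because the mean is below 4.0.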

Rubric criteria

  • Vapi
  • AI Platform
  • Tools And Mid Call Function Calling

Recommended for

Vapi customers