Webhooks And Events
Voice AI Orchestration — Vapi
Vapi evals — Webhooks & Events (relift v3 InfraRed)
About Vapi
Vapi is a voice-AI orchestration platform that wires speech-to-text, an LLM, and text-to-speech into low-latency phone and web voice agents, with interruption handling, mid-call function calling, transfers, recordings, and telephony routing.
Sample tests · showing 3 of 9
| # | Input | Expected behavior | Check |
|---|---|---|---|
| 01 | serverUrl handler for assistant-request does a cold CRM lookup that takes 9 seconds. Calls drop before the assistant even speaks. | assistant-request must complete within the telephony provider's connect window (~7.5s end-to-end per docs). Pre-warm CRM lookups (async cache, in-memory) or return a generic assistant immediately and customize on the first tool-call. Profile assistant-request P95 and alert when it nears 5s. | Pass / Fail · AI Platform · critical |
| 02 | Operator tracks call state in their own DB. They update only on end-of-call-report. Live dashboards lag by minutes. | Persist status-update events (queued, ringing, in-progress, forwarding, ended) as they arrive so live dashboards reflect call state in real time. end-of-call-report is the terminal artifact, not the lifecycle source of truth. Reconcile by call id. | Pass / Fail · AI Platform · medium |
| 03 | Operator's analytics job fetches artifact.recordingUrl from end-of-call-report two weeks later. URL returns 404. | Recording URL availability is tenant-configurable; download and persist to the operator's own object store on first end-of-call-report delivery. Recording retention defaults are tenant/plan-specific [REQUIRES-VERIFICATION]; do not rely on the URL being live indefinitely. | Pass / Fail · AI Platform · high |
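
The three expected behaviors in the table can be sketched together as one event dispatcher. This is a minimal illustration, not Vapi's reference implementation. The payload shape (`message.type`, `message.call.id`, `message.status`, `message.artifact.recordingUrl`), the `assistantId` fallback response, the 5-second lookup budget, and the `crm_lookup` and `archive` callables are all assumptions made for the example and should be checked against the current Vapi documentation.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict

Json = Dict[str, Any]

# Hypothetical fallback response; the id is a placeholder, not a real assistant.
GENERIC_ASSISTANT: Json = {"assistantId": "generic-assistant-id"}
# Assumed lookup budget, kept under the ~7.5s connect window cited in test 01.
LOOKUP_BUDGET_S = 5.0

call_status: Dict[str, str] = {}          # stand-in for the operator's call-state DB
archived_recordings: Dict[str, str] = {}  # stand-in for the operator's object store


async def handle_event(
    body: Json,
    crm_lookup: Callable[[Json], Awaitable[Json]],
    archive: Callable[[str, str], Awaitable[str]],
    budget_s: float = LOOKUP_BUDGET_S,
) -> Json:
    """Route one webhook body (assumed shape) to the matching behavior."""
    message = body.get("message", {})
    kind = message.get("type")
    call_id = message.get("call", {}).get("id", "unknown")

    if kind == "assistant-request":
        # Test 01: never let a slow CRM lookup exceed the budget;
        # fall back to a generic assistant instead of dropping the call.
        try:
            return await asyncio.wait_for(crm_lookup(message), timeout=budget_s)
        except asyncio.TimeoutError:
            return GENERIC_ASSISTANT

    if kind == "status-update":
        # Test 02: persist every lifecycle status as it arrives, keyed by call id.
        call_status[call_id] = message.get("status", "unknown")
        return {}

    if kind == "end-of-call-report":
        # Reconcile the terminal state by call id.
        call_status[call_id] = "ended"
        # Test 03: copy the recording on first delivery; skip repeat deliveries.
        url = message.get("artifact", {}).get("recordingUrl")
        if url and call_id not in archived_recordings:
            archived_recordings[call_id] = await archive(call_id, url)
        return {}

    return {}
```

In a real handler, `archive` would download the file and write it to the operator's own storage, and the two dictionaries would be replaced by durable stores. Keeping the archive step idempotent per call id means a redelivered end-of-call-report does not trigger a second download.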
How this eval is graded
Grade against expected.ideal_behavior and expected.rubric. Per-criterion pass requires mean >= 4.0 and no criterion below 3.
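
One reading of the pass rule above, sketched as a small check: the mean across rubric criteria must be at least 4.0 and no single criterion may score below 3. The function name, the criterion labels, and the dictionary-of-scores input are hypothetical, and the scoring scale is not stated on this page.

```python
from statistics import mean
from typing import Dict


def passes(scores: Dict[str, float],
           mean_threshold: float = 4.0,
           floor: float = 3.0) -> bool:
    """Return True when the mean meets the threshold and no score is below the floor."""
    if not scores:
        # With nothing graded, treat the eval as not passed.
        return False
    values = list(scores.values())
    return mean(values) >= mean_threshold and min(values) >= floor


# A strong mean does not rescue a single criterion below the floor.
print(passes({"timeout_handling": 5, "state_tracking": 4, "recording_archive": 3}))
print(passes({"timeout_handling": 5, "state_tracking": 5, "recording_archive": 2}))
```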
Rubric criteria
- Vapi
- AI Platform
- Webhooks And Events