Recording And Egress

LiveKit (Cloud + Agents) · LiveKit

Real-time Voice & Video Infra — LiveKit

LiveKit evals — Recording & Egress (relift v3 InfraRed)

About LiveKit

LiveKit is open-source real-time voice/video infrastructure used to build voice agents and live experiences. It comprises a WebRTC SFU, telephony (SIP), recording/egress, and the LiveKit Agents framework for STT→LLM→TTS pipelines, and is available both as LiveKit Cloud and self-hosted.

Employees

~50

Industry

Voice AI Infrastructure

Headquarters

New York, NY

Website

livekit.io

Sample tests (showing 3 of 9)

# | Input | Expected behavior | Check
01

Compliance needs the full call as a single MP4 with both speakers. Operator starts a TrackEgress per participant.

RoomCompositeEgress produces one MP4 or HLS output with the composed layout: a single file per call, ready for review. TrackEgress produces raw per-track files, which are useful for ML but not for compliance review. Choose RoomCompositeEgress for compliance and TrackEgress for downstream processing.

Check: Pass / Fail · Category: AI Platform · Severity: high
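The decision in test 01 can be expressed as a small check. This is an illustrative sketch only: `EgressPlan`, its `kind` and `file_type` labels, and `meets_single_file_requirement` are hypothetical names, not part of any LiveKit SDK. It reads the compliance requirement as "exactly one room-composite MP4 for the call".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EgressPlan:
    """One planned egress job (hypothetical model, not a LiveKit type)."""
    kind: str       # assumed labels: "room_composite" or "track"
    file_type: str  # assumed labels: "mp4", "hls", "raw"


def meets_single_file_requirement(plans: list[EgressPlan]) -> bool:
    """True when the plan yields exactly one composed MP4 for the whole call."""
    composites = [
        p for p in plans
        if p.kind == "room_composite" and p.file_type == "mp4"
    ]
    return len(composites) == 1


# The operator's plan from the input: one track egress per participant.
per_track_plan = [EgressPlan("track", "raw"), EgressPlan("track", "raw")]

# The expected plan: a single room-composite MP4.
composite_plan = [EgressPlan("room_composite", "mp4")]
```

Under these assumptions, `per_track_plan` fails the check and `composite_plan` passes it, which matches the expected behavior described for this test.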
02

Operator uses RoomCompositeEgress with HLS output and 6-second segments. After egress_ended, the playlist references 980 segments.

All segments are uploaded to the sink, with the .m3u8 playlist as the index. Serve the playlist and segments through a CDN; do not link directly to the sink, which is private. Verify that the playlist's EXT-X-ENDLIST tag is present, meaning the playlist is final rather than live. Retain segments per compliance policy.

Check: Pass / Fail · Category: AI Platform · Severity: medium
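The playlist checks in test 02 can be sketched with plain string parsing. The sketch below assumes a simple media playlist in which every non-comment line is a segment URI. The segment names (`seg_00000.ts` and so on) and the helper names are hypothetical, and the sample playlist is generated in memory rather than read from a real sink.

```python
def summarize_playlist(text: str) -> dict:
    """Count segments, sum EXTINF durations, and detect a finalized playlist."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    segments = [ln for ln in lines if not ln.startswith("#")]
    durations = [
        float(ln[len("#EXTINF:"):].split(",")[0])
        for ln in lines
        if ln.startswith("#EXTINF:")
    ]
    return {
        "segments": segments,
        "segment_count": len(segments),
        "total_seconds": sum(durations),
        "is_final": "#EXT-X-ENDLIST" in lines,  # absent on a live playlist
    }


def missing_from_sink(text: str, sink_keys: set[str]) -> list[str]:
    """Segments the playlist references that are not present in the sink."""
    return [s for s in summarize_playlist(text)["segments"] if s not in sink_keys]


# Hypothetical playlist matching the test input: 980 segments of 6 seconds.
header = ["#EXTM3U", "#EXT-X-VERSION:3", "#EXT-X-TARGETDURATION:6"]
body = []
for i in range(980):
    body += ["#EXTINF:6.000,", f"seg_{i:05d}.ts"]
playlist = "\n".join(header + body + ["#EXT-X-ENDLIST"])

summary = summarize_playlist(playlist)
```

At a nominal 6 seconds per segment, 980 segments correspond to 5,880 seconds, or 98 minutes of recording. Real segment durations may vary slightly, so summing the EXTINF values is more reliable than multiplying.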
03

Compliance requires per-tenant recording deletion within 90 days of contract end. Tenant has 12,000 RoomComposite MP4s across multiple S3 buckets.

Record each tenant's recording inventory (sink key, egress ID) in the operator's own store when egress_started and egress_ended fire. On contract end, iterate over the inventory, delete each object from its sink, verify each deletion with a HEAD request, and log an audit trail. LiveKit does not manage sink-side retention; the operator owns the lifecycle.

Check: Pass / Fail · Category: AI Platform · Severity: critical
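The inventory-driven deletion in test 03 might look roughly like the sketch below. Everything here is an assumption made for illustration: the `Recording` record, the `Sink` interface, and `purge_tenant` are hypothetical names, and `InMemorySink` stands in for a real object store so the example runs without credentials.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Recording:
    """One inventory row, written when the egress starts or ends (hypothetical)."""
    tenant_id: str
    egress_id: str
    bucket: str
    key: str


class Sink(Protocol):
    def delete(self, bucket: str, key: str) -> None: ...
    def exists(self, bucket: str, key: str) -> bool: ...  # HEAD-style check


class InMemorySink:
    """Stand-in for an object store; holds (bucket, key) pairs."""
    def __init__(self, objects):
        self.objects = set(objects)

    def delete(self, bucket, key):
        self.objects.discard((bucket, key))

    def exists(self, bucket, key):
        return (bucket, key) in self.objects


def purge_tenant(inventory, tenant_id, sink):
    """Delete every recording for one tenant and return an audit trail."""
    audit = []
    for rec in inventory:
        if rec.tenant_id != tenant_id:
            continue
        sink.delete(rec.bucket, rec.key)
        # Verify the deletion instead of trusting the delete call.
        deleted = not sink.exists(rec.bucket, rec.key)
        audit.append({"egress_id": rec.egress_id, "bucket": rec.bucket,
                      "key": rec.key, "deleted": deleted})
    return audit


inventory = [
    Recording("tenant-a", "EG_1", "bucket-east", "a/call1.mp4"),
    Recording("tenant-a", "EG_2", "bucket-west", "a/call2.mp4"),
    Recording("tenant-b", "EG_3", "bucket-east", "b/call1.mp4"),
]
sink = InMemorySink((r.bucket, r.key) for r in inventory)
audit = purge_tenant(inventory, "tenant-a", sink)
```

With S3 as the sink, the `delete` and `exists` methods would typically wrap the store's delete-object and head-object operations. The audit rows should be persisted somewhere durable, since they are the evidence that the 90-day requirement was met.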

How this eval is graded

Grade against expected.ideal_behavior and expected.rubric. A test passes when the mean score across rubric criteria is at least 4.0 and no single criterion scores below 3.
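The pass rule above can be sketched as a short function. The function name `passes_rubric`, the criterion keys, and the example scores are hypothetical; the page does not state the scoring scale, so the values below are only sample inputs.

```python
def passes_rubric(scores: dict[str, float],
                  mean_threshold: float = 4.0,
                  floor: float = 3.0) -> bool:
    """Pass when the mean is at least the threshold and no score is below the floor."""
    if not scores:
        return False  # nothing graded, so nothing can pass
    values = list(scores.values())
    mean = sum(values) / len(values)
    return mean >= mean_threshold and min(values) >= floor


# Mean is 4.0 and the lowest score is 3, so both conditions hold.
borderline = {"livekit": 5, "ai_platform": 4, "recording_and_egress": 3}

# Mean is also 4.0, but one criterion falls below the floor.
one_low = {"livekit": 5, "ai_platform": 5, "recording_and_egress": 2}

# No score is below the floor, but the mean is under 4.0.
low_mean = {"livekit": 4, "ai_platform": 4, "recording_and_egress": 3}
```

The second and third examples show why both conditions are needed: a high mean can hide one failing criterion, and a set of acceptable individual scores can still average below the threshold.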

Rubric criteria

  • LiveKit
  • AI Platform
  • Recording and Egress

Recommended for

LiveKit (Cloud + Agents) · LiveKit customers
