Cartesia evals — Auth, Keys, Rate Limits & Concurrency (relift v3 InfraRed)
About Cartesia
Cartesia builds real-time generative voice — its Sonic model delivers ultra-low-latency, high-fidelity text-to-speech with streaming, voice cloning, and prosody control for production voice agents and interactive audio experiences.
Sample tests · showing 3 of 9
| # | Input | Expected behavior | Check | Category | Severity |
|---|---|---|---|---|---|
| 01 | Agent embeds the Cartesia API key in client-side browser JS to call /tts directly from the user's browser. | Keep the API key server-side; authenticate /tts requests with the documented key header from a backend (or use a documented short-lived/scoped token mechanism for client-side streaming if available [REQUIRES-VERIFICATION]). Never ship the long-lived key to the browser. | Pass / Fail | AI Platform | critical |
| 02 | Agent omits the Cartesia-Version header (or pins a very old date) and breaks when response shapes evolve. | Send the documented Cartesia-Version header on every request, pinned to a known-tested version, and treat version bumps as a deliberate, tested upgrade. Neither omit it (relying on an implicit default) nor pin a stale value indefinitely [REQUIRES-VERIFICATION on the current version value]. | Pass / Fail | AI Platform | high |
| 03 | The integration treats 401, 403, and 429 identically as 'auth error' and retries all three the same way. | Distinguish: 401 (bad or missing key: fix credentials, do not retry blindly), 403 (authenticated but not permitted: surface, do not retry), 429 (rate limited: back off and retry). Route each to the correct handler instead of one generic retry. | Pass / Fail | AI Platform | medium |
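The expected behaviors in the three sample tests can be sketched as a small backend-side helper. This is a minimal illustration, not Cartesia's SDK: `build_headers`, `classify`, `backoff_delay`, and `Action` are hypothetical names, the key header name `X-API-Key` is an assumption to check against the current API reference, and the version value is supplied by the caller rather than hard-coded here.

```python
import random
from enum import Enum


class Action(Enum):
    OK = "ok"
    FIX_CREDENTIALS = "fix_credentials"  # 401: bad or missing key
    SURFACE = "surface"                  # 403: key valid, action not permitted
    BACKOFF_RETRY = "backoff_retry"      # 429: rate limited
    OTHER = "other"                      # anything else: handle separately


def build_headers(api_key, version, key_header="X-API-Key"):
    """Build request headers on the backend.

    key_header is an ASSUMED header name; verify it in the API docs.
    version must be a pinned, tested Cartesia-Version value.
    """
    if not api_key:
        raise ValueError("API key is missing; load it server-side")
    if not version:
        raise ValueError("Cartesia-Version must be pinned explicitly")
    return {key_header: api_key, "Cartesia-Version": version}


def classify(status):
    """Route an HTTP status code to a distinct handler category."""
    if 200 <= status < 300:
        return Action.OK
    if status == 401:
        return Action.FIX_CREDENTIALS
    if status == 403:
        return Action.SURFACE
    if status == 429:
        return Action.BACKOFF_RETRY
    return Action.OTHER


def backoff_delay(attempt, base=0.5, cap=30.0, retry_after=None):
    """Seconds to wait before retrying a 429.

    Honors a Retry-After value (in seconds) when the server sent one,
    otherwise uses capped exponential backoff with full jitter.
    """
    if retry_after is not None:
        return min(cap, float(retry_after))
    return random.uniform(0, min(cap, base * 2 ** attempt))
```

In a backend, the key would typically come from a secret store or environment variable (for example a hypothetical `CARTESIA_API_KEY`) and never be serialized into a response sent to the browser. Only the `BACKOFF_RETRY` branch should loop; the 401 and 403 branches return control to the caller.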
How this eval is graded
Grade against expected.ideal_behavior and expected.rubric. Per-criterion pass requires mean >= 4.0 and no criterion below 3.
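The pass rule above can be read as a two-part check. The sketch below assumes one numeric score per rubric criterion and that the mean is taken across those criteria; the function name `passes` and that interpretation are assumptions, since the page does not define the scoring scale.

```python
from statistics import mean


def passes(scores, mean_threshold=4.0, floor=3):
    """Return True when the mean score meets the threshold
    and no individual criterion falls below the floor."""
    if not scores:
        raise ValueError("at least one criterion score is required")
    return mean(scores) >= mean_threshold and min(scores) >= floor
```

Under this reading, scores of 5, 4, and 3 pass (mean 4.0, lowest 3), while 5, 5, and 2 fail on the floor even though the mean is 4.0.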
Rubric criteria
- Cartesia
- AI Platform
- Auth, Keys, Rate Limits & Concurrency