Rate Limiting 429 Recovery
Together AI evals — Rate Limiting & 429 Recovery (relift v3)
About Together AI
Together AI is an enterprise AI inference cloud providing fast, scalable access to leading open-source models via an OpenAI-compatible API. Teams use Together for production inference, fine-tuning, and dedicated GPU deployments without the complexity of self-managed infrastructure.
Sample tests · showing 3 of 9
| # | Input | Expected behavior | Check | Category | Severity |
|---|---|---|---|---|---|
| 01 | Request rate is above the dynamic rate limit described in the docs; the client must back off, not spin in a tight loop. | {"criteria": ["Reads x-ratelimit-reset", "Exponential backoff", "Does not confuse with 503"], "pass_threshold": 3} | Pass / Fail | AI Platform | critical |
| 02 | Reduce tokens per request or spread traffic; steady traffic raises limits over time. | ["Splits prompt across chunks where policy allows", "Lowers max_tokens", "Spreads requests evenly across minute"] | Pass / Fail | AI Platform | high |
| 03 | Per the docs, spread requests at ~1 RPS across the minute rather than sending 60 in one second. | ["Shapes timer to ~1 req/s", "Monitors success rate improving dynamic limit", "Uses batch for offline bulk"] | Pass / Fail | AI Platform | medium |
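
The behaviors checked in tests 01 and 03 can be sketched roughly as follows. This is a minimal illustration, not Together AI's client code: `retry_delay` and `paced_offsets` are hypothetical helper names, `headers` is assumed to be a plain dict with lower-cased keys, and `x-ratelimit-reset` is assumed to carry a number of seconds until the limit resets (check the provider docs for the actual format).

```python
import random


def retry_delay(status, headers, attempt, base=1.0, cap=60.0):
    """Return seconds to wait before retrying a 429, or None for other statuses.

    Assumption: `x-ratelimit-reset` holds seconds until the limit resets.
    """
    if status != 429:
        # A 503 (or anything else) is not a rate-limit signal; handle it elsewhere.
        return None
    reset = headers.get("x-ratelimit-reset")
    if reset is not None:
        try:
            # Prefer the server's hint, clamped to the range [0, cap].
            return min(cap, max(0.0, float(reset)))
        except ValueError:
            pass  # Unparseable header: fall back to backoff below.
    # Exponential backoff with full jitter: uniform in [0, min(cap, base * 2^attempt)].
    return random.uniform(0.0, min(cap, base * 2 ** attempt))


def paced_offsets(n, per_minute=60):
    """Return send offsets (seconds) that spread n requests evenly, e.g. ~1 req/s."""
    interval = 60.0 / per_minute
    return [i * interval for i in range(n)]
```

A caller would sleep for the returned delay before retrying, and would schedule bulk sends at the offsets from `paced_offsets` instead of firing them all in the same second. The jitter keeps many clients from retrying in lockstep after a shared limit resets.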
Rubric criteria
- Together AI
- AI Platform
- Rate Limiting 429 Recovery