Inference API Reliability
Together AI
Together AI evals — Inference API Reliability (relift v3)
About Together AI
Together AI is an enterprise AI inference cloud providing fast, scalable access to leading open-source models via an OpenAI-compatible API. Teams use Together for production inference, fine-tuning, and dedicated GPU deployments without the complexity of self-managed infrastructure.
Sample tests · showing 3 of 10
| # | Input | Expected behavior | Check | Category | Severity |
|---|---|---|---|---|---|
| 01 | Client parses SSE chunks; the stream must end with data: [DONE] per the OpenAPI ChatCompletionStream schema. | Iterate over chat.completion.chunk deltas until the data: [DONE] sentinel; concatenate delta.content; handle finish_reason on the final chunk. | Pass / Fail | AI Platform | medium |
| 02 | Model must halt when it emits the closing tag; stop is an array of strings per the schema. | Include the stop list in the request; verify that finish_reason is stop, not length, when the tag is emitted. | Pass / Fail | AI Platform | high |
| 03 | A 503 indicates a platform capacity issue, not a client over-limit; the request is safe to retry with backoff, unlike some mutations. | Retry the POST with exponential backoff and jitter; read x-ratelimit-reset; do not disable safety_model on retry. | Pass / Fail | AI Platform | critical |
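The behaviors in tests 01 and 03 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Together AI's client code. The helper names (collect_stream, backoff_delays, post_with_retry) and the send callable are hypothetical. Treating the x-ratelimit-reset header as a number of seconds is also an assumption; check the API reference for its actual units.

```python
import json
import random
import time


def collect_stream(lines):
    """Concatenate delta.content from SSE lines until the data: [DONE] sentinel.

    Returns (text, finish_reason). Raises ValueError if the stream ends
    without the sentinel, which test 01 treats as a failure.
    """
    parts = []
    finish_reason = None
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and SSE comments
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            return "".join(parts), finish_reason
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            content = (choice.get("delta") or {}).get("content")
            if content:
                parts.append(content)
            if choice.get("finish_reason"):
                finish_reason = choice["finish_reason"]
    raise ValueError("stream ended without data: [DONE]")


def backoff_delays(retries, base=0.5, cap=30.0, rng=random.random):
    """Yield exponential delays (seconds) with full jitter, capped at `cap`."""
    for attempt in range(retries):
        yield rng() * min(cap, base * 2 ** attempt)


def post_with_retry(send, max_attempts=5, sleep=time.sleep, rng=random.random):
    """Call send() until it returns a non-503 status or attempts run out.

    `send` is a hypothetical callable returning (status, headers, body).
    It resends the same request unchanged, so options such as safety_model
    stay exactly as they were on the first attempt.
    """
    delays = backoff_delays(max_attempts - 1, rng=rng)
    while True:
        status, headers, body = send()
        if status != 503:
            return status, body
        delay = next(delays, None)
        if delay is None:
            return status, body  # attempts exhausted; surface the 503
        try:
            # Assumption: the header holds seconds until the limit resets.
            delay = max(delay, float(headers.get("x-ratelimit-reset", 0)))
        except ValueError:
            pass  # unparseable header; fall back to the jittered delay
        sleep(delay)
```

For test 02, the finish_reason returned by collect_stream can be compared against stop after sending a request that includes the stop list. A value of length would indicate the model ran out of tokens before emitting the closing tag.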
Rubric criteria
- Together AI
- AI Platform
- Inference API Reliability