Eval directory
Evals for Together AI
8 evaluation packs covering adversarial robustness, safety gates, workflow quality, and operator-level checks for Together AI products.
About Together AI
Together AI is an enterprise AI inference cloud providing fast, scalable access to leading open-source models via an OpenAI-compatible API. Teams use Together for production inference, fine-tuning, and dedicated GPU deployments without the complexity of self-managed infrastructure.
Available eval packs for Together AI
8 packs ready to run.
Billing Token Metering
Together AI evals — Billing & Token Metering (relift v3)
Dedicated Endpoints Capacity
Together AI evals — Dedicated Endpoints & Capacity (relift v3)
Fine-Tuning Job Lifecycle
Together AI evals — Fine-Tuning Job Lifecycle (relift v3)
Inference API Reliability
Together AI evals — Inference API Reliability (relift v3)
Model Catalog Routing
Together AI evals — Model Catalog & Routing (relift v3)
Multi-Modal Vision Inputs
Together AI evals — Multi-Modal Vision Inputs (relift v3)
Rate Limiting 429 Recovery
Together AI evals — Rate Limiting & 429 Recovery (relift v3)
Safety Guardrails Refusal
Together AI evals — Safety Guardrails & Refusal (relift v3)
Why eval Together AI
Together AI's AI features ship behind brand promises about accuracy, safety, and reliability. Buyers and integrators need to know those promises hold up under adversarial prompts, edge-case workflows, and the long tail of real customer inputs — not just the demo path.
The Corsac eval library for Together AI measures the four dimensions teams care about most when deploying AI platform agents:
- Adversarial robustness — does the agent resist prompt injection, jailbreaks, and social-engineering attempts?
- Workflow quality — does it complete the task buyers were shown in the demo, on inputs that don't look like the demo?
- Safety gates — does it escalate or refuse when it should, and only then?
- Operator quality — does it preserve analyst trust by surfacing the right context at the right time?
Every eval pack above is hand-authored against Together AI's public product surface and runnable in Corsac with your own data.
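As a rough illustration of how results across the four dimensions might be tallied once a pack has been run, here is a minimal Python sketch. It does not reflect Corsac's actual result format or API: the `EvalResult` record, the dimension labels, the pack slugs, and the `pass_rates` helper are all hypothetical names made up for this example, and the sample rows are placeholder data, not real results.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical labels mirroring the four dimensions listed above;
# these are not Corsac identifiers.
DIMENSIONS = (
    "adversarial_robustness",
    "workflow_quality",
    "safety_gates",
    "operator_quality",
)


@dataclass(frozen=True)
class EvalResult:
    """One graded eval case (assumed shape, for illustration only)."""
    pack: str
    dimension: str
    passed: bool


def pass_rates(results):
    """Return the fraction of passing cases per dimension.

    Dimensions with no graded cases are omitted rather than reported as 0.
    """
    totals = defaultdict(int)
    passes = defaultdict(int)
    for r in results:
        if r.dimension not in DIMENSIONS:
            raise ValueError(f"unknown dimension: {r.dimension}")
        totals[r.dimension] += 1
        passes[r.dimension] += int(r.passed)
    return {d: passes[d] / totals[d] for d in DIMENSIONS if totals[d]}


# Placeholder rows with made-up pack slugs; not real eval output.
sample = [
    EvalResult("rate-limiting-429-recovery", "workflow_quality", True),
    EvalResult("rate-limiting-429-recovery", "workflow_quality", False),
    EvalResult("safety-guardrails-refusal", "safety_gates", True),
    EvalResult("safety-guardrails-refusal", "adversarial_robustness", True),
]
rates = pass_rates(sample)
```

Reporting a rate per dimension, rather than one blended score, keeps a weak area such as safety gates from being hidden by strong results elsewhere.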