Auth, Org & Resource Limits
Daytona evals — Auth, Org & Resource Limits (relift v3 InfraRed)
About Daytona
Daytona provides secure, elastic infrastructure for running AI-generated code: isolated sandboxes that start quickly and are driven programmatically through the Daytona SDK (Python and TypeScript) to execute code and shell commands, manipulate the filesystem, and run git operations. It adds snapshots/images for warm starts and carries a declarative dev-environment lineage, and it is positioned as the disposable, isolated runtime layer beneath AI coding agents. [REQUIRES-VERIFICATION]: employee count, exact HQ, and compliance posture.
Employees
[REQUIRES-VERIFICATION] (~30-50, unverified)
Industry
AI Sandbox Infrastructure
Headquarters
[REQUIRES-VERIFICATION]
Website
www.daytona.io
Sample tests · showing 3 of 10
| # | Input | Expected behavior | Check | Category | Severity |
|---|---|---|---|---|---|
| 01 | CI needs to create and run sandboxes but not manage org members or billing. An engineer gives CI an org-owner key. | Issue a least-privilege, scoped API key that can create/run sandboxes but not administer the org; never hand CI an owner key. Rotate on exposure and revoke on decommission. Scope keys to the narrowest capability the workload needs. [REQUIRES-VERIFICATION] for the exact key-scope model. | Pass / Fail | AI Platform | critical |
| 02 | A runaway agent loop creates sandboxes without bound; the org has no cap and the bill climbs through the night. | Set org-level caps on concurrent sandboxes / compute so a runaway loop hits a ceiling instead of unbounded spend; pair with an alert. Treat caps as a cost-and-blast-radius control, not just a quota. [REQUIRES-VERIFICATION] for native org cap/limit configuration. | Pass / Fail | AI Platform | high |
| 03 | Finance asks which product feature drove last month's sandbox compute spend; sandboxes carry no attribution metadata. | Tag sandboxes with owning task/feature/team metadata and aggregate sandbox-time by that tag for chargeback; do not split cost evenly across teams. Make per-task attribution a property of how sandboxes are created. [REQUIRES-VERIFICATION] for sandbox metadata/labeling and the usage feed. | Pass / Fail | AI Platform | medium |
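The behaviors expected in tests 02 and 03 can be sketched as a small client-side ledger. This is a minimal illustration under stated assumptions, not Daytona's API: `SandboxLedger`, `SandboxCapExceeded`, the `feature` label key, and the in-memory bookkeeping are all hypothetical, and the native org cap and labeling model remain [REQUIRES-VERIFICATION]. The sketch refuses to register a sandbox once a concurrency ceiling is reached, requires attribution labels at creation time, and sums runtime by label for chargeback.

```python
from collections import defaultdict
from dataclasses import dataclass, field


class SandboxCapExceeded(RuntimeError):
    """Raised when registering a sandbox would exceed the configured ceiling."""


@dataclass
class SandboxRecord:
    sandbox_id: str
    labels: dict[str, str]
    runtime_seconds: float = 0.0


@dataclass
class SandboxLedger:
    """Hypothetical in-memory ledger; it does not call the Daytona SDK."""

    max_concurrent: int
    active: dict[str, SandboxRecord] = field(default_factory=dict)
    finished: list[SandboxRecord] = field(default_factory=list)

    def register(self, sandbox_id: str, labels: dict[str, str]) -> SandboxRecord:
        # Attribution is a property of creation: reject unlabeled sandboxes.
        if "feature" not in labels:
            raise ValueError("sandboxes must carry a 'feature' label")
        # A runaway loop hits this ceiling instead of growing without bound.
        if len(self.active) >= self.max_concurrent:
            raise SandboxCapExceeded(
                f"cap of {self.max_concurrent} concurrent sandboxes reached"
            )
        record = SandboxRecord(sandbox_id, dict(labels))
        self.active[sandbox_id] = record
        return record

    def release(self, sandbox_id: str, runtime_seconds: float) -> None:
        # Move the sandbox out of the active set and record its runtime.
        record = self.active.pop(sandbox_id)
        record.runtime_seconds = runtime_seconds
        self.finished.append(record)

    def seconds_by(self, label_key: str) -> dict[str, float]:
        # Aggregate finished sandbox-time by the given label for chargeback.
        totals: dict[str, float] = defaultdict(float)
        for record in self.finished:
            key = record.labels.get(label_key, "unlabeled")
            totals[key] += record.runtime_seconds
        return dict(totals)
```

In practice a caller would wrap whatever sandbox-creation call it uses with `register` and `release`, and raise an alert when `SandboxCapExceeded` fires. A client-side ledger only protects the processes that share it, so it complements an org-level cap and does not replace one.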
How this eval is graded
Grade against expected.ideal_behavior and expected.rubric. Per-criterion pass requires mean >= 4.0 and no criterion below 3.
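The pass rule above can be read as a two-part check. The sketch below is one interpretation, assuming each rubric criterion receives a single numeric score and that the mean is taken across criteria; the function name `passes` and the criterion names in the usage example are hypothetical.

```python
from statistics import mean


def passes(scores: dict[str, float], min_mean: float = 4.0, floor: float = 3.0) -> bool:
    """Return True when the mean score is >= min_mean and no criterion is below floor."""
    if not scores:
        raise ValueError("at least one criterion score is required")
    values = list(scores.values())
    return mean(values) >= min_mean and min(values) >= floor


# Hypothetical criterion names, for illustration only.
example = {"least_privilege": 5, "rotation": 4, "scoping": 3}
result = passes(example)
```

Under this reading, a single criterion scored below 3 fails the eval even when the remaining scores pull the mean above 4.0.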
Rubric criteria
- Daytona
- AI Platform
- Auth, Org & Resource Limits