Daytona SDK and Client
AI Sandbox Infrastructure — Daytona
Daytona evals — Daytona SDK & Client (relift v3 InfraRed)
About Daytona
Daytona provides secure, elastic infrastructure for running AI-generated code. Its isolated sandboxes start quickly and are driven programmatically through the Daytona SDK (Python and TypeScript) to execute code and shell commands, manipulate the filesystem, and run git operations. Snapshots and images provide warm starts, and the platform carries a declarative dev-environment lineage; it is positioned as the disposable, isolated runtime layer beneath AI coding agents. [REQUIRES-VERIFICATION] on employee count, exact HQ, and compliance posture.
Employees
[REQUIRES-VERIFICATION] (~30-50, unverified)
Industry
AI Sandbox Infrastructure
Headquarters
[REQUIRES-VERIFICATION]
Website
www.daytona.io
Sample tests · showing 3 of 9
| # | Input | Expected behavior | Check |
|---|---|---|---|
| 01 | Engineer initializes the Daytona client and hardcodes the API key string in the agent source so 'it just works' in CI. | Load the Daytona API key from an environment variable / secret manager at runtime; never commit it to source. Scope the key to the minimum org permissions the agent needs and rotate on exposure. [REQUIRES-VERIFICATION] for the exact env var name the SDK reads. | Pass / Fail · AI Platform · critical |
| 02 | Agent constructs a brand-new Daytona client object on every single SDK call inside a tight loop. | Construct one configured client and reuse it across calls; repeated construction wastes connection setup and may re-auth each time. Treat the client as a long-lived, thread-safe handle if the SDK documents it as such; otherwise pool per worker. [REQUIRES-VERIFICATION] for the client's documented t… | Pass / Fail · AI Platform · medium |
| 03 | A code-exec call returns a non-zero exit code from the user's program, while a separate call raises an SDK-level exception (e.g. sandbox not found). | Distinguish the two failure planes: a non-zero exit code is the user program's result (inspect stdout/stderr, do not retry the platform), whereas an SDK/transport exception is a platform-layer failure (sandbox gone, auth, network) that may warrant restart or backoff. Branch on which plane failed be… | Pass / Fail · AI Platform · high |
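
The three expected behaviors above can be sketched in plain Python without importing the SDK. Everything named here is an assumption for illustration: `DAYTONA_API_KEY` is a guess at the variable name (the table flags it as unverified), `get_client` returns a placeholder dict where the real client constructor would go, and `ExecResult` and `PlatformError` are hypothetical stand-ins for whatever the SDK returns and raises.

```python
import os
from dataclasses import dataclass
from functools import lru_cache
from typing import Callable


class PlatformError(Exception):
    """Hypothetical stand-in for an SDK/transport-level exception."""


@dataclass
class ExecResult:
    """Hypothetical stand-in for the value a code-exec call returns."""
    exit_code: int
    output: str


def load_api_key(env_var: str = "DAYTONA_API_KEY") -> str:
    # Test 01: read the key at runtime. The variable name is an assumption;
    # confirm which name the SDK actually reads before relying on it.
    key = os.environ.get(env_var)
    if not key:
        raise RuntimeError(f"{env_var} is not set; refusing to use a hardcoded key")
    return key


@lru_cache(maxsize=1)
def get_client() -> dict:
    # Test 02: build the client once and hand back the same object afterwards.
    # The dict is a placeholder for the real, configured SDK client.
    return {"api_key": load_api_key()}


def classify(call: Callable[[], ExecResult]) -> str:
    # Test 03: separate the platform plane from the user-program plane.
    try:
        result = call()
    except PlatformError:
        return "platform_failure"  # sandbox gone, auth, network: restart or back off
    if result.exit_code != 0:
        return "program_failure"   # inspect output; retrying the platform will not help
    return "ok"
```

Caching the client per process is only one way to reuse it; if the SDK does not document its client as thread-safe, a per-worker instance would be the more cautious choice.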
How this eval is graded
Grade against expected.ideal_behavior and expected.rubric. Per-criterion pass requires mean >= 4.0 and no criterion below 3.
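
As a rough illustration of the pass rule, the sketch below takes one score per rubric criterion and applies both thresholds. This is one possible reading of the rule, not the grader's actual implementation; the function name, the dictionary shape, and the sample scores are all hypothetical.

```python
from statistics import mean


def eval_passes(scores: dict[str, float],
                min_mean: float = 4.0,
                floor: float = 3.0) -> bool:
    # Assumed reading: the mean over all criteria must reach min_mean,
    # and no single criterion may fall below floor.
    if not scores:
        raise ValueError("at least one criterion score is required")
    values = list(scores.values())
    return mean(values) >= min_mean and min(values) >= floor
```

Under this reading, a single very low criterion fails the eval even when strong scores elsewhere keep the mean at 4.0.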
Rubric criteria
- Daytona
- AI Platform
- Daytona SDK and Client