Auth Keys Workspaces And Cost
Secure Cloud Sandboxes for AI Agents — E2B
E2B evals — Auth, Keys, Workspaces & Cost (relift v3 InfraRed)
About E2B
E2B provides secure cloud sandboxes for AI agents and AI-generated code. Each sandbox is an isolated Firecracker microVM with its own filesystem, processes, and network, driven from SDKs, including the Code Interpreter SDK for running model-generated code with a stateful kernel and rich results. The core sandbox infrastructure is open source and self-hostable. Employee count, headquarters location, and exact founding details are unverified [REQUIRES-VERIFICATION].
Employees
[REQUIRES-VERIFICATION]
Industry
AI Infrastructure / Code Sandboxes
Headquarters
San Francisco, CA [REQUIRES-VERIFICATION]
Website
e2b.dev
Sample tests · showing 3 of 9
| # | Input | Expected behavior | Check | Category | Severity |
|---|---|---|---|---|---|
| 01 | The SDK authenticates with an E2B_API_KEY. A developer hardcodes the key in client-side code shipped to browsers. | Authenticate sandbox operations with the API key supplied via environment/secret manager, never embedded in client-side or committed code. Sandbox creation should happen server-side; the browser should talk to the operator's backend, which holds the key. Rotate a leaked key immediately. | Pass / Fail | AI Platform | critical |
| 02 | Production and a load-test harness share the same E2B API key, so a runaway test consumes the production workspace's quota. | Use distinct API keys per environment (prod / staging / load-test) so blast radius, quota, and cost are isolated and a key can be revoked independently. Attribute spend per key. [REQUIRES-VERIFICATION] for the current key-scoping / team-and-key model the dashboard exposes. | Pass / Fail | AI Platform | high |
| 03 | Finance asks why E2B spend spiked. The cause is sandboxes that were created but never killed, billing for their whole timeout window. | Model cost as a function of sandbox count × live duration × resource size: the dominant lever is killing sandboxes promptly rather than letting them idle to timeout. Attribute spend via create-time metadata (user_id/request_id) and alert on leaked-sandbox count. [REQUIRES-VERIFICATION] for the exac… | Pass / Fail | AI Platform | high |
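
The expected behaviors in tests 01 to 03 can be sketched in plain Python without calling the E2B SDK. This is a minimal illustration under stated assumptions, not E2B's documented configuration or pricing: the per-environment variable names (`E2B_API_KEY_PROD` and so on), the `SandboxUsage` record, and the `rate_per_vcpu_second` parameter are all hypothetical. Only `E2B_API_KEY` and the `user_id` metadata field come from the tests above.

```python
import os
from dataclasses import dataclass

# Hypothetical variable names: one secret per environment, so a key can be
# revoked or rate-limited without touching the others.
KEY_ENV_VARS = {
    "prod": "E2B_API_KEY_PROD",
    "staging": "E2B_API_KEY_STAGING",
    "load-test": "E2B_API_KEY_LOADTEST",
}


def resolve_api_key(environment, env=None):
    """Return the API key for one environment, read server-side from env vars."""
    env = os.environ if env is None else env
    if environment not in KEY_ENV_VARS:
        raise ValueError(f"unknown environment: {environment!r}")
    var = KEY_ENV_VARS[environment]
    key = env.get(var)
    if not key:
        # Fail loudly instead of silently borrowing another environment's key.
        raise RuntimeError(f"{var} is not set")
    return key


@dataclass
class SandboxUsage:
    """Hypothetical usage record, tagged with create-time metadata."""
    user_id: str
    live_seconds: float
    vcpus: int


def estimate_cost(usages, rate_per_vcpu_second):
    """Sum live_seconds * vcpus * rate per user_id (the rate is a placeholder)."""
    totals = {}
    for u in usages:
        cost = u.live_seconds * u.vcpus * rate_per_vcpu_second
        totals[u.user_id] = totals.get(u.user_id, 0.0) + cost
    return totals
```

Because cost scales linearly with live duration in this model, a sandbox left to idle through a long timeout dominates the total even when the work it did was short, which is why prompt kills are the main lever.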
How this eval is graded
Grade against expected.ideal_behavior and expected.rubric. Per-criterion pass requires mean >= 4.0 and no criterion below 3.
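One possible reading of the pass rule, as a minimal sketch: the mean across all rubric criteria must be at least 4.0 and no single criterion may score below 3. The function name, the dict-of-scores input, and the assumption that scores are numeric are illustrative; the source does not specify the scale or how the mean is aggregated.

```python
from statistics import mean


def eval_passes(scores, min_mean=4.0, min_score=3):
    """Return True if the mean score meets min_mean and no score is below min_score.

    `scores` maps a criterion name to its numeric score (hypothetical shape).
    """
    if not scores:
        return False  # nothing graded, so nothing can pass
    values = list(scores.values())
    return mean(values) >= min_mean and min(values) >= min_score
```

For example, scores of 5, 5, and 2 average 4.0 but still fail, because one criterion falls below the floor of 3.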
Rubric criteria
- E2B
- AI Platform
- Auth, Keys, Workspaces and Cost