
SUT Setup Containers
Autonomous Deterministic Testing — Antithesis
Antithesis evals — System-Under-Test Setup (Containers) (relift v3 InfraRed)
About Antithesis
Antithesis is an autonomous, deterministic software testing platform. It runs a system-under-test (SUT), delivered as containers, inside a deterministic hypervisor that controls every source of nondeterminism (clock, network, thread scheduling, and randomness), then autonomously explores the state space to find bugs, including hard concurrency and distributed-systems failures. Every bug it finds is perfectly reproducible from a recorded seed and can be examined with time-travel debugging across a branching 'multiverse' of explored futures. The platform also provides fault injection (network partitions, crashes, disk faults, clock skew) and SDK assertions (Always / Sometimes / Reachable). [REQUIRES-VERIFICATION] Employee count, funding, and exact headquarters location could not be verified over the network at authoring time.
Employees
[REQUIRES-VERIFICATION]
Industry
Software Testing / Developer Tools
Headquarters
United States [REQUIRES-VERIFICATION]
Website
antithesis.com
Sample tests · showing 3 of 9
| # | Input | Expected behavior | Check |
|---|---|---|---|
| 01 | Operator onboards a distributed key-value store with three server nodes and a client. They describe the topology to Antithesis. | Express the full multi-container topology (all server nodes, dependencies, and the test/workload container) as a Docker Compose configuration so Antithesis can run the whole system inside the simulation. Every component the system needs at runtime must be a container in that compose set — nothing a… | Pass / Fail · AI Platform · critical |
| 02 | The compose file references images by tag ':latest'. The operator wants stable, reproducible test runs. | Pin every SUT image by immutable digest (sha256), not a mutable tag, so each run tests a known build and a reproduction months later rebuilds the identical environment. Tag mutation silently changes what is under test and breaks the reproducibility guarantee that is the whole point of the platform. | Pass / Fail · AI Platform · high |
| 03 | The SUT assumes 32 GB RAM available on the host. Under simulation it OOMs and the operator blames Antithesis. | Declare the SUT's resource expectations in the container configuration and size the simulated environment accordingly; the simulation provides resources to the containers, it does not inherit the host's full capacity by assumption. Right-size memory/CPU per container so OOMs reflect real SUT behavi… | Pass / Fail · AI Platform · medium |
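
The expectations in tests 02 and 03 can be checked mechanically before a run. The following is a minimal sketch, not an Antithesis tool: it assumes the Docker Compose file has already been parsed into a Python dict, and it flags services whose image is not pinned by sha256 digest or that declare no memory limit (via the Compose keys `mem_limit` or `deploy.resources.limits.memory`). The service names, image names, and the placeholder digest are hypothetical.

```python
import re
from typing import Dict, List

# An image reference pinned by digest ends in "@sha256:<64 hex characters>".
DIGEST_RE = re.compile(r"@sha256:[0-9a-f]{64}$")


def has_memory_limit(service: Dict) -> bool:
    """Return True if the service declares a memory limit in either Compose style."""
    if "mem_limit" in service:
        return True
    limits = service.get("deploy", {}).get("resources", {}).get("limits", {})
    return "memory" in limits


def audit_compose(compose: Dict) -> List[str]:
    """List findings for unpinned images and missing memory limits, sorted by service name."""
    findings = []
    for name, service in sorted(compose.get("services", {}).items()):
        image = service.get("image", "")
        if not DIGEST_RE.search(image):
            findings.append(f"{name}: image {image!r} is not pinned by sha256 digest")
        if not has_memory_limit(service):
            findings.append(f"{name}: no memory limit declared")
    return findings


# Placeholder digest for illustration only; a real digest comes from the image registry.
FAKE_DIGEST = "sha256:" + "a" * 64

# Hypothetical topology mirroring test 01: three server nodes plus a workload container.
example = {
    "services": {
        "kv-node-1": {"image": f"example/kv-server@{FAKE_DIGEST}", "mem_limit": "2g"},
        "kv-node-2": {"image": f"example/kv-server@{FAKE_DIGEST}", "mem_limit": "2g"},
        "kv-node-3": {"image": "example/kv-server:latest", "mem_limit": "2g"},
        "workload": {
            "image": f"example/kv-client@{FAKE_DIGEST}",
            "depends_on": ["kv-node-1", "kv-node-2", "kv-node-3"],
        },
    }
}

if __name__ == "__main__":
    for finding in audit_compose(example):
        print(finding)
```

In this example the audit reports `kv-node-3` for using a mutable `:latest` tag and `workload` for omitting a memory limit. A check like this only confirms that limits are declared; whether the values are right-sized for the SUT still needs judgment.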
How this eval is graded
Grade against expected.ideal_behavior and expected.rubric. Per-criterion pass requires mean >= 4.0 and no criterion below 3.
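
The pass rule above can be sketched in a few lines. This is one possible reading, not the platform's grader: it assumes each rubric criterion receives a single numeric score and that the mean is taken across criteria. The function name `passes`, its parameters, and the sample scores are hypothetical.

```python
from statistics import mean
from typing import Dict


def passes(scores: Dict[str, float], mean_threshold: float = 4.0, floor: float = 3.0) -> bool:
    """Pass when the mean score is >= mean_threshold and no criterion is below floor."""
    if not scores:
        return False  # nothing graded, so nothing can pass
    values = list(scores.values())
    return mean(values) >= mean_threshold and min(values) >= floor


if __name__ == "__main__":
    # Hypothetical scores keyed by the rubric criteria listed on this page.
    print(passes({"Antithesis": 4, "AI Platform": 5, "SUT Setup Containers": 4}))
    print(passes({"Antithesis": 5, "AI Platform": 5, "SUT Setup Containers": 2}))
```

The second call shows why both conditions matter: its mean is exactly 4.0, but one criterion falls below 3, so the result is a fail.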
Rubric criteria
- Antithesis
- AI Platform
- SUT Setup Containers