Antithesis
For Antithesis · AI Platform

Test Composer And Workloads

Autonomous Deterministic Testing — Antithesis

Antithesis evals — Test Composer & Workload Drivers (relift v3 InfraRed)

About Antithesis

Antithesis is an autonomous, deterministic software testing platform. It runs a system-under-test (SUT), delivered as containers, inside a deterministic hypervisor that controls every source of nondeterminism: clock, network, thread scheduling, and randomness. It then autonomously explores the state space to find bugs, including hard concurrency and distributed-systems failures. Every bug it finds is perfectly reproducible from a recorded seed and can be examined with time-travel debugging across a branching 'multiverse' of explored futures. The platform also provides fault injection (network partitions, crashes, disk faults, clock skew) and SDK assertions (Always / Sometimes / Reachable). [REQUIRES-VERIFICATION] Employee count, funding, and exact HQ location were not network-verifiable at authoring time.

Employees

[REQUIRES-VERIFICATION]

Industry

Software Testing / Developer Tools

Headquarters

United States [REQUIRES-VERIFICATION]

Sample tests · showing 3 of 9

# | Input | Expected behavior | Check
01

Operator writes a workload that always issues the same 100 requests in the same order, expecting Antithesis to find bugs.

Write the workload as a parameterized driver that exposes choices (which operation, which key, which value) to the platform so the explorer can vary them across runs, rather than a single hardcoded script. The platform's value comes from autonomously searching the space of behaviors; a fixed script…

Pass / Fail · AI Platform · high
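
As a rough illustration of test 01, the sketch below contrasts with a fixed 100-request script by drawing every choice (operation, key, value) from a randomness source. It is only a sketch under stated assumptions: `OPS`, `KEYS`, `Request`, and `generate_workload` are hypothetical names, and a seeded `random.Random` stands in for whatever randomness source the platform actually supplies to a driver.

```python
import random
from dataclasses import dataclass
from typing import List

# Hypothetical operation set and key space, chosen only for illustration.
OPS = ("put", "get", "delete")
KEYS = tuple(f"k{i}" for i in range(8))


@dataclass(frozen=True)
class Request:
    op: str
    key: str
    value: int


def generate_workload(rng: random.Random, n: int = 100) -> List[Request]:
    """Build n requests, drawing every choice from rng.

    Assumption: in a real driver, rng would be backed by the platform's
    randomness source so the explorer can vary these choices across runs.
    """
    return [
        Request(op=rng.choice(OPS), key=rng.choice(KEYS), value=rng.randrange(1000))
        for _ in range(n)
    ]


if __name__ == "__main__":
    # Same seed, same workload; a different seed gives a different workload.
    first = generate_workload(random.Random(1))
    again = generate_workload(random.Random(1))
    other = generate_workload(random.Random(2))
    print(first == again, first == other)
```

The design point is that the driver owns the shape of the workload while the choices themselves stay open, so each seed replays exactly and different seeds reach different behaviors.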
02

The platform may run a driver command multiple times across branches. The driver appends to a shared external log on each invocation.

Make drivers safe to run repeatedly and across branched futures: avoid side effects outside the simulated SUT, and key any in-SUT effects so re-execution is well-defined. Branching/replay can execute the same command from a shared prefix, so non-idempotent external side effects corrupt results.

Pass / Fail · AI Platform · medium
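
One way to read test 02 is sketched below: instead of appending to a shared external log, the driver keeps its effects inside the SUT and tags each one with an operation id, so running the same command again is a no-op. `FakeSUT`, `apply_once`, and `run_driver` are hypothetical stand-ins for illustration, not part of any Antithesis API.

```python
from typing import Dict, Set


class FakeSUT:
    """Stand-in for the system under test: a key-value store that
    remembers which operation ids it has already applied."""

    def __init__(self) -> None:
        self.data: Dict[str, int] = {}
        self.applied: Set[str] = set()

    def apply_once(self, op_id: str, key: str, delta: int) -> bool:
        """Apply the increment only if op_id has not been seen before."""
        if op_id in self.applied:
            return False
        self.applied.add(op_id)
        self.data[key] = self.data.get(key, 0) + delta
        return True


def run_driver(sut: FakeSUT, invocation: str) -> None:
    """Hypothetical driver: every effect is keyed by (invocation, step),
    and nothing is written outside the SUT."""
    steps = [("a", 1), ("b", 2), ("a", 3)]
    for step, (key, delta) in enumerate(steps):
        sut.apply_once(f"{invocation}:{step}", key, delta)


if __name__ == "__main__":
    sut = FakeSUT()
    run_driver(sut, "inv-1")
    run_driver(sut, "inv-1")  # re-execution leaves the state unchanged
    print(sut.data)
```

Because the dedupe state lives in the SUT itself, it would be branched and replayed along with everything else, which is what makes re-execution from a shared prefix well-defined in this sketch.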
03

Operator wants a writer workload and an independent reader workload to run concurrently against the SUT.

Compose multiple independent test commands/drivers that the platform can run and interleave (the 'test composer' model), letting the simulator schedule them against each other to surface concurrency bugs. Keep each driver self-contained and idempotent so the composer can run them in any order/paral…

Pass / Fail · AI Platform · high
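
The composition idea in test 03 can be sketched in miniature as follows. This is a toy model, not the platform's scheduler: `writer`, `reader`, and `compose` are hypothetical names, each driver is a Python generator that yields after every step, and a seeded `random.Random` picks which driver advances next.

```python
import random
from typing import Callable, Dict, Iterator, List

Store = Dict[str, int]


def writer(store: Store, n: int = 5) -> Iterator[str]:
    """Self-contained writer: sets a monotonically increasing counter."""
    for i in range(1, n + 1):
        store["counter"] = i
        yield f"write {i}"


def reader(store: Store, n: int = 5) -> Iterator[str]:
    """Self-contained reader: checks the counter never goes backwards."""
    last = 0
    for _ in range(n):
        seen = store.get("counter", 0)
        assert seen >= last, "counter went backwards"
        last = seen
        yield f"read {seen}"


def compose(drivers: List[Callable[[Store], Iterator[str]]], seed: int) -> List[str]:
    """Interleave the drivers step by step in an order chosen by the seed."""
    rng = random.Random(seed)
    store: Store = {}
    running = [driver(store) for driver in drivers]
    trace: List[str] = []
    while running:
        gen = rng.choice(running)
        try:
            trace.append(next(gen))
        except StopIteration:
            running.remove(gen)
    return trace


if __name__ == "__main__":
    for line in compose([writer, reader], seed=0):
        print(line)
```

Neither driver knows about the other, so the same pair can be interleaved in many orders, and a given seed always reproduces the same interleaving.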

How this eval is graded

Grade against expected.ideal_behavior and expected.rubric. A pass requires a mean score of at least 4.0 across the rubric criteria, with no individual criterion scoring below 3.
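
The pass rule can be written as a small check. This is a minimal sketch, assuming each rubric criterion receives a single numeric score; the function name `eval_passes`, its parameters, and the criterion keys in the example are hypothetical.

```python
from statistics import mean
from typing import Mapping


def eval_passes(
    scores: Mapping[str, float],
    mean_floor: float = 4.0,
    min_floor: float = 3.0,
) -> bool:
    """Return True when the mean score is at least mean_floor and
    no single criterion falls below min_floor."""
    if not scores:
        return False  # assumption: an ungraded eval does not pass
    values = list(scores.values())
    return mean(values) >= mean_floor and min(values) >= min_floor


if __name__ == "__main__":
    # Mean is 4.0 and the lowest score is 3, so this passes.
    print(eval_passes({"c1": 5, "c2": 4, "c3": 3}))
    # Mean is still 4.0, but one criterion is below 3, so this fails.
    print(eval_passes({"c1": 5, "c2": 5, "c3": 2}))
```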

Rubric criteria

  • Antithesis
  • AI Platform
  • Test Composer And Workloads

Recommended for

Antithesis · Antithesis customers
