Replit evals — Deployments (relift v3 InfraRed)

About Replit

Replit is a browser-based collaborative coding platform. Replit Agent, its autonomous coding agent, turns a prompt into an app plan, then builds, iterates on, and deploys the full application inside a Repl. Along the way it wires up Replit Auth, Replit DB, Object Storage, and a deployment type (Autoscale, Reserved VM, Static, or Scheduled), all under a checkpoint-based cost meter.

Employees: ~150

Industry: Online IDE & Autonomous Coding Agent

Headquarters: San Francisco, CA

Website: replit.com

Sample tests · showing 3 of 9

# · Input · Expected behavior · Check
01

User: 'Deploy this WebSocket chat server with persistent connections.' Agent must pick the right deploy type.

WebSocket / long-lived connections need Reserved VM, not Autoscale. Autoscale scales to zero and assumes request/response patterns — websockets get torn down. Recommend Reserved VM in the plan and explain the cost tradeoff.

Pass / Fail · Code Assistant · critical
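The deploy-type decision in test 01 can be sketched as a small heuristic. This is an illustration only, not Replit Agent's actual logic: `AppProfile`, its fields, and `pick_deployment_type` are hypothetical names, and the Static and Scheduled branches are assumptions inferred from the deployment types listed in the company description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AppProfile:
    """Hypothetical summary of what the app needs at runtime."""
    long_lived_connections: bool = False  # e.g. a WebSocket chat server
    runs_on_schedule: bool = False        # assumption: cron-style jobs
    static_only: bool = False             # assumption: no server process


def pick_deployment_type(app: AppProfile) -> str:
    """Return the deployment type to recommend in the plan."""
    # Persistent connections come first: the expected behavior for test 01
    # is Reserved VM, never Autoscale, for this case.
    if app.long_lived_connections:
        return "Reserved VM"
    if app.runs_on_schedule:
        return "Scheduled"
    if app.static_only:
        return "Static"
    # Plain request/response traffic is the case Autoscale is described for.
    return "Autoscale"


chat_server = AppProfile(long_lived_connections=True)
print(pick_deployment_type(chat_server))
```

A grader could compare the agent's recommended type against this kind of rule, while separately checking that the plan explains the cost tradeoff.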
02

User wants `app.mydomain.com` mapped to an Autoscale Deployment. Replit issues a TXT verification record and a CNAME target.

Walk the user through adding the Replit-issued TXT (for ownership) and CNAME (for traffic) at their DNS host; verify resolution with `dig` before claiming the domain is live. Managed TLS provisions after CNAME propagates. Do not declare 'domain configured' before DNS verification.

Pass / Fail · Code Assistant · high
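The "verify with `dig` before claiming the domain is live" step in test 02 might look like the sketch below. It assumes `dig` is installed, and the record values used in any call are placeholders: the real TXT value, the CNAME target, and the host name that carries the TXT record all come from Replit's domain setup screen. The helper names are hypothetical.

```python
import subprocess


def _clean_txt(value: str) -> str:
    """dig prints TXT values wrapped in double quotes; strip them."""
    return value.strip().strip('"')


def _clean_host(value: str) -> str:
    """Host names are case-insensitive and dig appends a trailing dot."""
    return value.strip().rstrip(".").lower()


def dns_ready(txt_records, cname_records, expected_txt, expected_cname) -> bool:
    """True only if both the ownership TXT and the traffic CNAME are visible."""
    txt_ok = _clean_txt(expected_txt) in {_clean_txt(r) for r in txt_records}
    cname_ok = _clean_host(expected_cname) in {_clean_host(r) for r in cname_records}
    return txt_ok and cname_ok


def dig_short(name: str, record_type: str) -> list[str]:
    """Run `dig +short <name> <type>` and return one entry per answer line."""
    result = subprocess.run(
        ["dig", "+short", name, record_type],
        capture_output=True, text=True, check=True, timeout=10,
    )
    return [line for line in result.stdout.splitlines() if line.strip()]
```

In use, the agent would call `dig_short` for the TXT and CNAME lookups, pass both result lists to `dns_ready` along with the Replit-issued values, and report "domain configured" only when it returns `True`.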
03

Repl has OPENAI_API_KEY in workspace Secrets. User clicks Deploy Autoscale. Agent must verify the secret reaches the deployment.

Per docs, deployment configurations have their own environment-variable section; Repl secrets do not auto-propagate to Deployments without an explicit promotion [REQUIRES-VERIFICATION on the exact UI]. Agent should surface 'declare OPENAI_API_KEY in the Deployment env' in the deploy plan, not assum…

Pass / Fail · Code Assistant · critical
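The secret check in test 03 can be sketched as a pre-deploy comparison between the keys the app needs and the keys declared for the deployment. How the deployment's environment is actually read is left out, since the exact UI is flagged above as needing verification. `missing_deployment_secrets` and the sample `deployment_env` contents are hypothetical.

```python
def missing_deployment_secrets(required_keys, deployment_env):
    """Return required keys that are absent or empty in the deployment env."""
    return sorted(k for k in required_keys if not deployment_env.get(k))


required = {"OPENAI_API_KEY"}      # keys the app reads at runtime
deployment_env = {"PORT": "8080"}  # hypothetical deployment env contents

missing = missing_deployment_secrets(required, deployment_env)
if missing:
    # Surface the gap in the deploy plan instead of assuming propagation.
    print("Declare in the Deployment env:", ", ".join(missing))
```

An empty result means every required key is declared with a non-empty value. It does not confirm that the value itself is valid.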

How this eval is graded

Grade each response against expected.ideal_behavior and expected.rubric. A test passes when the mean score across rubric criteria is >= 4.0 and no single criterion scores below 3.
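One reading of the pass rule above, as a minimal sketch. It assumes each rubric criterion receives a numeric score (a 1 to 5 scale is an assumption) and that the 4.0 mean is taken across criteria. The function name and the criterion names in the usage lines are hypothetical.

```python
from statistics import mean


def passes(scores: dict, min_mean: float = 4.0, min_score: float = 3) -> bool:
    """Pass when the mean is >= min_mean and no criterion is below min_score."""
    if not scores:
        raise ValueError("at least one criterion score is required")
    values = list(scores.values())
    return mean(values) >= min_mean and min(values) >= min_score


# Mean is 4.0 and the lowest score is 3, so both conditions hold.
print(passes({"deploy_type": 5, "explanation": 4, "cost_tradeoff": 3}))

# Mean is still 4.0, but one criterion is below 3, so the test fails.
print(passes({"deploy_type": 5, "explanation": 5, "cost_tradeoff": 2}))
```

The floor on individual criteria keeps one strong score from masking a critical miss, such as recommending the wrong deployment type.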

Rubric criteria

  • Replit
  • Code Assistant
  • Deployments

Recommended for

Replit Agent · Replit customers
