AI Evaluation, Guardrails & Monitoring — Patronus AI

Patronus AI evals — Evaluation API & SDK (relift v3 InfraRed)

About Patronus AI

Patronus AI is an evaluation, guardrails, and monitoring platform for LLM and GenAI applications. It provides automated hallucination detection (the Lynx model), LLM-as-judge evaluation (the Glider model), and built-in scorers for PII, toxicity, safety, answer relevance, and context faithfulness, plus Experiments, datasets, custom evaluators, and production logging and monitoring.

Employees

~50 [REQUIRES-VERIFICATION]

Industry

AI Evaluation & Guardrails

Headquarters

San Francisco, CA [REQUIRES-VERIFICATION]

Website

www.patronus.ai

Sample tests · showing 3 of 9

#	Input	Expected behavior	Check	Category	Severity
01	Operator calls the Patronus evaluate API with an evaluator id, the model input, the model output, and (for RAG evaluators) retrieved_context. The response carries a per-evaluator result with a boolean pass, a numeric score, and a…	Read the result as a structured object: branch on the boolean `pass` for gating, surface the numeric `score` for trend dashboards, and persist the `explanation` for human review. Do not treat the explanation free-text as the machine-readable verdict. Tie each result back to the evaluator id that pr…	Pass / Fail	AI Platform	high
02	Operator needs to evaluate 5,000 logged responses nightly. They loop and fire one synchronous evaluate call per row with no concurrency control.	Use the batch/async evaluation path (or bounded concurrency with backoff) for bulk scoring rather than a tight synchronous loop. Cap in-flight requests, honor rate-limit responses, and checkpoint progress so a crash mid-batch resumes instead of re-scoring (and re-billing) everything. [REQUIRES-VERI…	Pass / Fail	AI Platform	medium
03	Engineer initializes the Patronus Python SDK with the API key pasted as a literal in the source file to 'get it working', then commits it.	Initialize the SDK with the API key sourced from an environment variable or a secret manager, never a literal in code. Confirm the client picks up the key from the documented env var (e.g. a PATRONUS_API_KEY-style variable) rather than hardcoding. If a key is ever committed, rotate it and treat git…	Pass / Fail	AI Platform	critical
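
The expected behaviors in tests 01 and 03 can be sketched without calling any real SDK. The example below is only an illustration under stated assumptions: the dictionary keys (`evaluator_id`, `pass`, `score`, `explanation`) mirror the fields named in the table but are not confirmed against the Patronus response schema, `PATRONUS_API_KEY` is the "style" of variable name the table suggests, and `EvalResult`, `load_api_key`, `parse_result`, and `gate` are hypothetical helpers.

```python
import os
from dataclasses import dataclass
from typing import Any, Mapping, Optional


@dataclass(frozen=True)
class EvalResult:
    """Hypothetical typed view of one per-evaluator result."""
    evaluator_id: str
    passed: bool
    score: Optional[float]
    explanation: str


def load_api_key(env: Mapping[str, str] = os.environ,
                 var: str = "PATRONUS_API_KEY") -> str:
    """Read the key from the environment (assumed variable name); never hardcode it."""
    key = env.get(var, "").strip()
    if not key:
        raise RuntimeError(f"{var} is not set; refusing to fall back to a literal key")
    return key


def parse_result(raw: Mapping[str, Any]) -> EvalResult:
    """Build a structured result from an assumed response shape."""
    passed = raw.get("pass")
    if not isinstance(passed, bool):
        # The verdict must come from the boolean field, not from the free-text explanation.
        raise ValueError("result has no boolean 'pass' field")
    score = raw.get("score")
    return EvalResult(
        evaluator_id=str(raw["evaluator_id"]),
        passed=passed,
        score=float(score) if score is not None else None,
        explanation=str(raw.get("explanation") or ""),
    )


def gate(result: EvalResult) -> str:
    """Branch on the boolean only; score and explanation are for dashboards and review."""
    return "allow" if result.passed else "block"
```

With this shape, a result whose explanation happens to contain the word "passes" is still blocked when `pass` is false, which is the behavior test 01 asks for.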
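
For the bulk-scoring case in test 02, one possible approach is bounded concurrency with exponential backoff and a checkpoint file. This is a sketch, not the platform's batch API: `RateLimited` stands in for whatever rate-limit signal the real client raises, `call` is any function that scores one row, and the JSON-lines checkpoint format is an assumption made for the example.

```python
import json
import time
from concurrent.futures import ThreadPoolExecutor, as_completed
from pathlib import Path
from typing import Any, Callable, Dict, Hashable, Iterable, Tuple


class RateLimited(Exception):
    """Hypothetical signal that the service asked the caller to slow down."""


def with_backoff(call: Callable[[Any], Any], row: Any, retries: int = 5,
                 base_delay: float = 0.5,
                 sleep: Callable[[float], None] = time.sleep) -> Any:
    """Retry one row on RateLimited, doubling the wait after each attempt."""
    for attempt in range(retries):
        try:
            return call(row)
        except RateLimited:
            if attempt == retries - 1:
                raise
            sleep(base_delay * 2 ** attempt)


def score_rows(rows: Iterable[Tuple[Hashable, Any]], call: Callable[[Any], Any],
               checkpoint: Path, max_in_flight: int = 8,
               sleep: Callable[[float], None] = time.sleep) -> Dict[Any, Any]:
    """Score (row_id, row) pairs, skipping ids already recorded in the checkpoint."""
    done: Dict[Any, Any] = {}
    if checkpoint.exists():
        for line in checkpoint.read_text().splitlines():
            record = json.loads(line)
            done[record["id"]] = record["result"]

    pending = [(rid, row) for rid, row in rows if rid not in done]
    # max_workers caps the number of in-flight requests.
    with ThreadPoolExecutor(max_workers=max_in_flight) as pool, checkpoint.open("a") as out:
        futures = {pool.submit(with_backoff, call, row, sleep=sleep): rid
                   for rid, row in pending}
        for future in as_completed(futures):
            rid = futures[future]
            result = future.result()
            # Persist each finished row immediately so a crash resumes from here.
            out.write(json.dumps({"id": rid, "result": result}) + "\n")
            out.flush()
            done[rid] = result
    return done
```

Running `score_rows` a second time against the same checkpoint file makes no further calls for rows that already finished, so a crash mid-batch does not re-score (or re-bill) completed work. Row ids and results need to be JSON-serializable for the checkpoint to round-trip.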

How this eval is graded

Grade each response against expected.ideal_behavior and expected.rubric. Passing requires a mean criterion score >= 4.0 and no single criterion below 3.
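
The pass rule can be written as a small check. This sketch reads the rule as "the mean across rubric criteria must be at least 4.0 and every individual criterion must be at least 3"; that reading, the function name `passes`, and the criterion names in the usage note are assumptions, since the page does not show the grader's actual implementation or score scale.

```python
from statistics import mean
from typing import Mapping


def passes(scores: Mapping[str, float],
           mean_threshold: float = 4.0,
           floor: float = 3.0) -> bool:
    """Return True when the mean meets the threshold and no criterion falls below the floor."""
    if not scores:
        raise ValueError("no criterion scores to grade")
    values = list(scores.values())
    return mean(values) >= mean_threshold and min(values) >= floor
```

For example, hypothetical scores of 5, 5, and 2 average 4.0 but still fail, because one criterion is below 3.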

Rubric criteria

Patronus AI
AI Platform
Evaluation API and SDK

Recommended for

Patronus AI
Patronus AI customers

Works with

Patronus AI

Related evals

AI Platform
