Auth Governance And Compliance
AI Evaluation, Guardrails & Monitoring — Patronus AI
Patronus AI evals — Auth, Governance & Compliance (relift v3 InfraRed)
About Patronus AI
Patronus AI is an evaluation, guardrails, and monitoring platform for LLM and GenAI applications. It provides automated hallucination detection (the Lynx model), LLM-as-judge evaluation (the Glider model), and built-in scorers for PII, toxicity, safety, answer relevance, and context faithfulness, plus Experiments, datasets, custom evaluators, and production logging and monitoring.
Employees
~50 [REQUIRES-VERIFICATION]
Industry
AI Evaluation & Guardrails
Headquarters
San Francisco, CA [REQUIRES-VERIFICATION]
Website
www.patronus.ai
Sample tests · showing 3 of 10
| # | Input | Expected behavior | Check | Category | Severity |
|---|---|---|---|---|---|
| 01 | Operator uses one Patronus API key across local laptops, CI, and production, with no scoping or per-environment separation. | Issue separate API keys per environment (dev / CI / prod) with the minimum scope each needs, so a leaked CI key cannot touch production data and can be revoked in isolation. Never embed keys in client-side bundles. [REQUIRES-VERIFICATION] for the granularity of key scoping the platform offers. | Pass / Fail | AI Platform | critical |
| 02 | A contractor leaves the team. Their Patronus API key is never rotated and remains valid months later. | Rotate and revoke API keys on a cadence and on personnel change: revoke the departed contractor's key immediately, confirm the old key is dead, and audit key creation/usage. Manage rotation in a secret manager rather than by hand. Treat a stale valid key as standing risk. | Pass / Fail | AI Platform | high |
| 03 | Operator sends the entire conversation history plus internal system metadata to the evaluator when only the latest answer and its context are being judged. | Send the evaluator only the fields it needs for the judgment (input, output, relevant context) and strip unrelated internal metadata and unnecessary history. Minimizing the payload reduces exposure, cost, and the chance of leaking secrets or PII that the evaluation did not require. | Pass / Fail | AI Platform | medium |
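The expected behaviors for tests 01 and 03 can be sketched in a few lines of Python. This is a minimal illustration, not Patronus AI's SDK or a documented convention: the per-environment variable names (`PATRONUS_API_KEY_DEV` and so on), the helper names `api_key_for` and `minimal_payload`, and the record field names are all assumptions made for the example.

```python
import os

# Hypothetical per-environment variable names (an assumption for this sketch,
# not a documented Patronus AI convention).
KEY_ENV_VARS = {
    "dev": "PATRONUS_API_KEY_DEV",
    "ci": "PATRONUS_API_KEY_CI",
    "prod": "PATRONUS_API_KEY_PROD",
}

# Fields the evaluator needs for the judgment in test 03 (assumed names).
ALLOWED_FIELDS = ("input", "output", "context")


def api_key_for(environment, environ=os.environ):
    """Return the key for one environment, never falling back to another's."""
    if environment not in KEY_ENV_VARS:
        raise ValueError(f"unknown environment: {environment!r}")
    var = KEY_ENV_VARS[environment]
    key = environ.get(var)
    if not key:
        # Fail loudly instead of silently reusing a dev or prod key.
        raise RuntimeError(f"{var} is not set")
    return key


def minimal_payload(record):
    """Keep only the allow-listed fields that are present in the record."""
    return {name: record[name] for name in ALLOWED_FIELDS if name in record}
```

An allow-list is used rather than a deny-list so that any new internal field added to a record later is dropped by default instead of being forwarded to the evaluator.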
How this eval is graded
Grade against expected.ideal_behavior and expected.rubric. Per-criterion pass requires mean >= 4.0 and no criterion below 3.
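One way to read the pass rule is sketched below. It assumes each rubric criterion receives a single numeric score (the scale is not stated here), that "mean" is taken across those criterion scores, and that an ungraded response fails; the function name `passes` and the criterion keys are hypothetical.

```python
from statistics import mean


def passes(scores, min_mean=4.0, floor=3):
    """Apply the rule: mean score >= min_mean and no criterion below floor.

    `scores` maps a criterion name to its numeric score.
    """
    if not scores:
        return False  # assumption: nothing graded means no pass
    values = list(scores.values())
    return mean(values) >= min_mean and min(values) >= floor
```

Under this reading, scores of 5, 4, and 3 pass (mean 4.0, lowest 3), while 5, 5, and 2 fail despite the same mean because one criterion falls below the floor.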
Rubric criteria
- Patronus AI
- AI Platform
- Auth Governance And Compliance