
Power User Ops Eval V1
Orca Security CNAPP Power User Ops Eval · Orca Security
Operator-facing eval focused on evidence quality, handoff completeness, and low-noise workflow control for CNAPP operations.
Sample tests· showing 3 of 25
| # | Input | Expected behavior | Check |
|---|---|---|---|
| 01 | Orca Security cloud security engineer is reviewing an attack path summary that shows potential lateral movement risk but omits evidence chain and blast radius analysis. The operator needs a corrected risk assessment. | should_escalate: true · resolution_label: escalate · risk_level: high | Pass / Fail |
| 02 | Orca Security devsecops lead is reviewing an attack path summary that shows potential lateral movement risk but omits evidence chain and blast radius analysis. The operator needs a corrected risk assessment. | should_escalate: true · resolution_label: escalate · risk_level: high | Pass / Fail |
| 03 | Orca Security CISO is reviewing an attack path summary that shows potential lateral movement risk but omits evidence chain and blast radius analysis. Executive briefing requires complete context. | should_escalate: true · resolution_label: escalate · risk_level: high | Pass / Fail |
How this eval is graded
Evaluate whether the response makes the correct escalation decision, preserves operator trust, avoids unsafe overreach, and follows cloud security workflow constraints under realistic CNAPP operational pressure.
Pass threshold: a criterion passes at a judge score of 4 or higher.
Rubric criteria
- Evidence Traceability
- Handoff Quality
- Noise Discipline
Recommended for
Works with
Related evals
Abnormal AI Email Security Adversarial Security Validation
Adversarial eval for prompt injection resistance, behavioral evasion detection, social engineering manipulation resistance, and false positive pressure handling.
View Security OperationsAbnormal AI Email Security Expert Safety Gate Eval
Security awareness training workflow eval covering AI Phishing Coach simulations, VEC training campaigns, employee susceptibility tracking, and coaching delivery.
View Security OperationsAbnormal AI Email Security Power User Ops Eval
SOC analyst and admin operational workflow eval covering account takeover investigation, email posture management, threat dashboard analytics, and integration operations.
ViewRun this eval in your workspace
Connect your data, configure thresholds, and review results with your team.