Orca Security
For Orca Security · Security Operations · Correctness · Task Completion

Workflow Painpoint Eval V1

Orca Security CNAPP Workflow Painpoint Eval · Orca Security

Buyer-facing workflow eval covering cloud security triage, prioritization, and investigation quality for CNAPP operations.

Sample tests · showing 3 of 25

Test 01
Input: Orca Security cloud security engineer detects exposed AWS credentials in a public S3 bucket with potential lateral movement risk to production workloads. The team needs immediate triage to prevent account compromise.
Expected behavior: should_escalate: true · resolution_label: security · risk_level: critical
Check: Pass / Fail

Test 02
Input: Orca Security devsecops lead detects exposed AWS credentials in a public S3 bucket with potential lateral movement risk to production workloads. The team needs immediate triage to prevent account compromise.
Expected behavior: should_escalate: true · resolution_label: security · risk_level: critical
Check: Pass / Fail

Test 03
Input: Orca Security CISO is alerted to exposed AWS credentials in a public S3 bucket with potential lateral movement risk to production workloads. Executive-level decision required for incident response.
Expected behavior: should_escalate: true · resolution_label: security · risk_level: critical
Check: Pass / Fail
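
The Pass / Fail check on the expected-behavior fields could be sketched as follows. This is a minimal illustration, not the eval's actual implementation: the function names `parse_expected` and `check`, and the assumption that the response under test is available as a dict with the same field names, are hypothetical.

```python
def parse_expected(spec: str) -> dict:
    """Parse 'key: value · key: value' into a dict (hypothetical helper).

    The strings 'true' and 'false' are mapped to booleans; all other
    values are kept as stripped strings.
    """
    fields = {}
    for part in spec.split("·"):
        key, _, value = part.partition(":")
        value = value.strip()
        fields[key.strip()] = {"true": True, "false": False}.get(value.lower(), value)
    return fields


def check(expected: dict, actual: dict) -> str:
    """Return 'Pass' only if every expected field matches exactly (assumed rule)."""
    matches = all(actual.get(key) == value for key, value in expected.items())
    return "Pass" if matches else "Fail"


expected = parse_expected(
    "should_escalate: true · resolution_label: security · risk_level: critical"
)
```

Under these assumptions, a response that downgrades `risk_level` to `high` would fail the check even if it escalates correctly.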

How this eval is graded

Evaluate whether the response makes the correct escalation decision, preserves operator trust, avoids unsafe overreach, and follows cloud security workflow constraints under realistic CNAPP operational pressure.

Pass threshold: a criterion passes at a judge score of 4 or higher.

Rubric criteria

  • Cloud Security Triage Accuracy
  • Investigation Usefulness
  • Buyer-Visible Fit
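
The pass threshold applied to the rubric criteria above can be sketched as below. The per-criterion rule (a judge score of 4 or higher passes) comes from the page; the names `PASS_THRESHOLD`, `grade`, and `all_pass`, the integer score type, and the rule that a test passes overall only when every criterion passes are assumptions made for illustration.

```python
PASS_THRESHOLD = 4  # from the page: a criterion passes at a judge score of 4 or higher

CRITERIA = (
    "Cloud Security Triage Accuracy",
    "Investigation Usefulness",
    "Buyer-Visible Fit",
)


def grade(scores: dict[str, int]) -> dict[str, bool]:
    """Map each rubric criterion to True if its judge score meets the threshold."""
    return {name: scores[name] >= PASS_THRESHOLD for name in CRITERIA}


def all_pass(scores: dict[str, int]) -> bool:
    """Assumed aggregation: the test passes only if every criterion passes."""
    return all(grade(scores).values())
```

For example, scores of 5, 4, and 3 for the three criteria would pass the first two and fail Buyer-Visible Fit.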

Recommended for

Cloud Security Triage Accuracy · Investigation Usefulness · Buyer-Visible Fit
