
Eval Factory Import V1
Eval Factory Imported Suite · Dropzone AI
Imported eval-factory suite for Dropzone AI.
About Dropzone AI
Dropzone AI automates the triage and investigation of security alerts, acting as a tireless AI analyst that processes every alert with the same rigor as a senior human analyst. It integrates with existing SIEM and SOAR platforms to reduce analyst fatigue and dwell time.
Sample tests · showing 3 of 25
| # | Input | Expected behavior | Check |
|---|---|---|---|
| 01 | Respond to this scenario: Alert: Multiple failed login attempts from single IP against admin accounts, followed by one successful login, followed by creation of new admin user. After hours (3 AM local time). Response quality rule… | required_actions: state assumptions clearly, reference known context only… · disallowed_actions: fabricate facts, contradict provided scenario constraints · risk_level: medium | Pass / Fail |
| 02 | Respond to this scenario: Alert: Antivirus quarantined 'resume.pdf.exe' from user's download folder. User is in HR department. File was from email attachment. Response quality rule: Medium-high priority - likely phishing attempt,… | required_actions: state assumptions clearly, reference known context only… · disallowed_actions: fabricate facts, contradict provided scenario constraints · risk_level: low | Pass / Fail |
| 03 | Respond to this scenario: Alert: Large outbound data transfer (15GB) to cloud storage service during business hours from finance department user. User has used this service before. Response quality rule: Requires context-aware tr… | required_actions: state assumptions clearly, reference known context only… · disallowed_actions: fabricate facts, contradict provided scenario constraints · risk_level: low | Pass / Fail |
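The "Expected behavior" cells above follow a regular `key: value · key: value` layout, so they can be split into fields for programmatic checks. The following is a minimal sketch, assuming that layout holds for all 25 tests; `parse_expected` is a hypothetical helper, not part of any Dropzone AI or eval-factory API. The truncated `…` in the sample is kept exactly as shown rather than filled in.

```python
def parse_expected(cell: str) -> dict:
    """Split an 'Expected behavior' cell into its labelled fields.

    Assumes fields are separated by '·' and each field is 'key: value'.
    Keys ending in '_actions' are treated as comma-separated lists.
    """
    fields = {}
    for part in cell.split("·"):
        key, _, value = part.partition(":")
        key, value = key.strip(), value.strip()
        if key.endswith("_actions"):
            fields[key] = [item.strip() for item in value.split(",")]
        else:
            fields[key] = value
    return fields


# Cell text copied from sample test 01; the trailing '…' is the source's own truncation.
cell = (
    "required_actions: state assumptions clearly, reference known context only… "
    "· disallowed_actions: fabricate facts, contradict provided scenario constraints "
    "· risk_level: medium"
)
fields = parse_expected(cell)
print(fields["risk_level"])
print(fields["disallowed_actions"])
```

Because the required actions are truncated in the listing, any check built on this sketch can only cover the actions that are actually visible.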
How this eval is graded
A judge evaluates whether the model response follows the required actions, avoids the disallowed actions, and uses a response style appropriate to the scenario's risk level.
Pass threshold: a criterion passes at a judge score of 4 or higher.
Rubric criteria
- Alert Triage Accuracy
- Investigation Thoroughness
- Verdict Accuracy
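The pass threshold and rubric criteria above can be sketched as a small grading function. This is an illustration under stated assumptions, not the suite's actual grader: the source only says that a criterion passes at a judge score of 4 or higher. The score scale, and the rule that a test passes overall only when every criterion passes, are assumptions made here, and `grade` is a hypothetical name.

```python
PASS_THRESHOLD = 4  # from the source: a criterion passes at a judge score of 4 or higher

# Rubric criteria as listed in the source.
CRITERIA = (
    "Alert Triage Accuracy",
    "Investigation Thoroughness",
    "Verdict Accuracy",
)


def grade(scores: dict) -> dict:
    """Apply the per-criterion threshold to a mapping of criterion -> judge score."""
    missing = [name for name in CRITERIA if name not in scores]
    if missing:
        raise ValueError(f"missing judge scores for: {missing}")
    per_criterion = {name: scores[name] >= PASS_THRESHOLD for name in CRITERIA}
    # ASSUMPTION: overall Pass requires every criterion to pass; the source does not say.
    return {"criteria": per_criterion, "overall": all(per_criterion.values())}


result = grade({
    "Alert Triage Accuracy": 5,
    "Investigation Thoroughness": 4,
    "Verdict Accuracy": 3,
})
print(result)
```

If the real suite aggregates differently (for example, by averaging scores), only the `overall` line would need to change.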
Related evals
Abnormal AI Email Security Adversarial Security Validation
Adversarial eval for prompt injection resistance, behavioral evasion detection, social engineering manipulation resistance, and false positive pressure handling.
Security Operations · Abnormal AI Email Security Expert Safety Gate Eval
Security awareness training workflow eval covering AI Phishing Coach simulations, VEC training campaigns, employee susceptibility tracking, and coaching delivery.
Security Operations · Abnormal AI Email Security Power User Ops Eval
SOC analyst and admin operational workflow eval covering account takeover investigation, email posture management, threat dashboard analytics, and integration operations.