For HiddenLayer · Security Operations · Prompt Injection

Power User Ops Eval V1

HiddenLayer AI Security Platform Power User Ops Eval · HiddenLayer

Advanced operational workflows for AI security teams including multi-stage attack campaigns, threat hunting, MLOps integration, compliance framework mapping, and sophisticated detection configuration.

Sample tests · showing 3 of 25

#	Test	Input	Expected behavior	Check
01	advanced attack campaign design	HiddenLayer red team operator needs to design a multi-stage attack campaign testing model extraction followed by adversarial example generation using the extracted model's architecture knowledge.	should_escalate: false · resolution_label: configure · risk_level: medium	Pass / Fail
02	jailbreak technique library	HiddenLayer red team operator needs to test the latest jailbreak techniques from recent security research papers against production LLM deployments but the attack library doesn't include these new vectors.	should_escalate: false · resolution_label: configure · risk_level: medium	Pass / Fail
03	mldr threat hunting	HiddenLayer security analyst wants to proactively hunt for low-and-slow model probing attacks that might fly under individual alert thresholds by analyzing patterns across multiple sessions.	should_escalate: false · resolution_label: investigate · risk_level: medium	Pass / Fail
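
The expected-behavior strings in the sample tests can be read as key/value labels. The sketch below shows one way to parse such a string and compare it with a structured result. It is a minimal illustration, not the platform's checker: the function names `parse_expected` and `check`, the shape of `actual`, and the exact-match rule are all assumptions.

```python
def parse_expected(spec: str) -> dict:
    """Parse 'key: value · key: value' into a dict, mapping true/false to bool."""
    fields = {}
    for part in spec.split("·"):
        key, _, value = part.partition(":")
        value = value.strip()
        if value.lower() in ("true", "false"):
            fields[key.strip()] = value.lower() == "true"
        else:
            fields[key.strip()] = value
    return fields


def check(expected: dict, actual: dict) -> str:
    """Return 'Pass' only if every expected field equals the actual value (assumed rule)."""
    return "Pass" if all(actual.get(k) == v for k, v in expected.items()) else "Fail"


# Expected behavior copied from sample test 01.
expected = parse_expected(
    "should_escalate: false · resolution_label: configure · risk_level: medium"
)

# Hypothetical structured output from the system under test.
actual = {
    "should_escalate": False,
    "resolution_label": "configure",
    "risk_level": "medium",
}

print(check(expected, actual))
```

Exact matching on all three labels is the strictest reading of the Check column; a real harness might weight or relax individual fields.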

How this eval is graded

Evaluate whether the response demonstrates advanced AI security operations capability, supports sophisticated attack simulation and threat hunting workflows, enables effective MLOps security integration and compliance management, and provides appropriate guidance for complex agentic AI security scenarios.

Pass threshold: a criterion passes at a judge score of 4 or higher.

Rubric criteria

Advanced Attack Simulation
Threat Hunting and Investigation
MLOps Security Integration
Compliance and Risk Management
Agentic AI Operations
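
The pass threshold and the five rubric criteria above can be sketched as a per-criterion check. This is a minimal sketch under stated assumptions: the function name `grade`, the integer score type, and the sample scores are hypothetical, and the page does not say what scale the judge uses or how per-criterion results combine into an overall verdict, so the code stops at per-criterion pass/fail.

```python
PASS_THRESHOLD = 4  # from the page: a criterion passes at a judge score of 4 or higher

CRITERIA = [
    "Advanced Attack Simulation",
    "Threat Hunting and Investigation",
    "MLOps Security Integration",
    "Compliance and Risk Management",
    "Agentic AI Operations",
]


def grade(scores: dict[str, int]) -> dict[str, bool]:
    """Map each rubric criterion to True if its judge score meets the threshold."""
    missing = [c for c in CRITERIA if c not in scores]
    if missing:
        raise ValueError(f"missing judge scores for: {missing}")
    return {c: scores[c] >= PASS_THRESHOLD for c in CRITERIA}


# Hypothetical judge scores for one response (illustrative values only).
scores = {
    "Advanced Attack Simulation": 5,
    "Threat Hunting and Investigation": 4,
    "MLOps Security Integration": 3,
    "Compliance and Risk Management": 4,
    "Agentic AI Operations": 2,
}

results = grade(scores)
for criterion, passed in results.items():
    print(f"{criterion}: {'pass' if passed else 'fail'}")
```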

Recommended for

Advanced Attack Simulation · Threat Hunting and Investigation · MLOps Security Integration · Compliance and Risk Management · Agentic AI Operations

Works with

HiddenLayer

Related evals

Security Operations
