Security Policies
Build layered policy rules for injection prevention, data loss protection, excessive agency detection, and tool drift monitoring.
1. Prompt Injection Prevention
ShieldAgent uses an ML classifier to detect prompt injection in tool call arguments. Block calls above a confidence threshold:
{
  "tenantId": "<tenant-id>",
  "agentId": null,
  "toolName": "*",
  "action": "deny",
  "conditions": [
    {
      "type": "ml_injection_score_above",
      "threshold": "<your-threshold>"
    }
  ]
}
Set a threshold that matches your security and usability requirements. Lower values increase detection sensitivity; higher values reduce false positives.
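One way to choose a threshold is to replay it against a small set of manually labeled calls and count what each candidate value would block. The sketch below is illustrative only: the `ScoredCall` shape, the 0 to 1 score range, the strict "greater than" comparison, and the sample scores are all assumptions, not documented ShieldAgent behavior.

```typescript
// Hypothetical labeled sample: a classifier score plus a manual ground-truth label.
interface ScoredCall {
  score: number;        // assumed to be a confidence value in [0, 1]
  isInjection: boolean; // label assigned during manual review
}

interface ThresholdOutcome {
  threshold: number;
  blockedAttacks: number; // injections that would be denied
  missedAttacks: number;  // injections that would pass
  blockedBenign: number;  // legitimate calls that would be denied
}

// Mirrors an "ml_injection_score_above" style check:
// a call is denied when its score is strictly above the threshold (assumed).
function evaluateThreshold(sample: ScoredCall[], threshold: number): ThresholdOutcome {
  const outcome: ThresholdOutcome = {
    threshold,
    blockedAttacks: 0,
    missedAttacks: 0,
    blockedBenign: 0,
  };
  for (const call of sample) {
    const denied = call.score > threshold;
    if (call.isInjection && denied) outcome.blockedAttacks++;
    else if (call.isInjection) outcome.missedAttacks++;
    else if (denied) outcome.blockedBenign++;
  }
  return outcome;
}

// Made-up scores for illustration only.
const sample: ScoredCall[] = [
  { score: 0.97, isInjection: true },
  { score: 0.81, isInjection: true },
  { score: 0.55, isInjection: true },
  { score: 0.62, isInjection: false },
  { score: 0.30, isInjection: false },
  { score: 0.05, isInjection: false },
];

// Compare a low, medium, and high candidate threshold side by side.
const sweep = [0.5, 0.7, 0.9].map((t) => evaluateThreshold(sample, t));
console.table(sweep);
```

On this sample, 0.5 blocks every injection but also one legitimate call, while 0.9 blocks no legitimate calls and misses two injections, which is the trade-off described above.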
2. Data Loss Prevention
Prevent agents from exfiltrating secrets, PII, or financial data through tool calls:
{
  "tenantId": "<tenant-id>",
  "toolName": "send_email",
  "action": "deny",
  "conditions": [
    {
      "type": "param_matches_pattern",
      "param": "arguments.body",
      "pattern": "-----BEGIN (RSA|EC|OPENSSH) PRIVATE KEY-----"
    }
  ]
}
{
  "tenantId": "<tenant-id>",
  "toolName": "write_file",
  "action": "shadow",
  "conditions": [
    {
      "type": "param_contains_pii",
      "sensitivity": "high"
    }
  ]
}
3. Excessive Agency Detection
Catch agents that call destructive tools at abnormal frequency within a session:
{
  "tenantId": "<tenant-id>",
  "toolName": "bash",
  "action": "deny",
  "conditions": [
    {
      "type": "session_call_count_above",
      "threshold": 50,
      "window": "1h"
    }
  ]
}
Agency detection also fires automatically when an agent's excessive agency score exceeds the configured risk threshold, so no per-tool policy is required.
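To make the `session_call_count_above` condition concrete, here is a minimal sketch of a per-session counter with a threshold of 50 and a one-hour window. The `SessionCallCounter` class is hypothetical, and it assumes a sliding window that counts the current call; ShieldAgent's actual windowing may differ.

```typescript
// Hypothetical sliding-window counter for one session and one tool.
// This is a sketch of the idea, not ShieldAgent's implementation.
class SessionCallCounter {
  private timestamps: number[] = [];
  private readonly threshold: number;
  private readonly windowMs: number;

  constructor(threshold: number, windowMs: number) {
    this.threshold = threshold;
    this.windowMs = windowMs;
  }

  // Records a call at `nowMs` and returns true when that call should be denied,
  // i.e. when the number of calls inside the window is above the threshold.
  recordAndCheck(nowMs: number): boolean {
    const cutoff = nowMs - this.windowMs;
    // Drop calls that have aged out of the window.
    while (this.timestamps.length > 0 && this.timestamps[0] <= cutoff) {
      this.timestamps.shift();
    }
    this.timestamps.push(nowMs);
    return this.timestamps.length > this.threshold;
  }
}

const ONE_HOUR_MS = 60 * 60 * 1000;
const counter = new SessionCallCounter(50, ONE_HOUR_MS);

// Simulate 51 calls, one second apart.
const decisions: boolean[] = [];
for (let i = 0; i < 51; i++) {
  decisions.push(counter.recordAndCheck(i * 1000));
}

// The 50th call is still allowed; the 51st exceeds the threshold.
console.log(decisions[49], decisions[50]); // prints: false true
```

Once earlier calls age out of the window, the count drops and the tool becomes callable again, which is why the window length matters as much as the threshold.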
4. Tool Allowlist Pattern
The safest policy pattern is to explicitly allow only the tools an agent needs and deny everything else. Because implicit deny is the default, this only requires per-agent allow rules:
import ShieldAgent from '@shieldagent/sdk';
const client = new ShieldAgent();
// Implicit deny is already the default, so no deny-all rule is needed.
// Allow specific tools per agent:
for (const tool of ["read_file", "list_files", "search_web"]) {
  await client.policies.create({
    agentId: "<agent-id>",
    toolName: tool,
    action: "allow",
  });
}
5. Tool Drift Monitoring
Tool drift occurs when an agent starts calling tools outside its established baseline. Review drift events and configure how ShieldAgent responds to new tools:
// View tool drift events for an agent
const driftEvents = await client.auditEvents.list({
  agentId: "<agent-id>",
  eventType: "tool_drift",
  limit: 20,
});
// Block new tools until explicitly approved
await client.agents.update("<agent-id>", {
  blockToolDiscovery: true,
});
Example drift event:
{
  "eventType": "tool_drift_detected",
  "agentId": "agt_01HXYZ...",
  "toolName": "delete_database",
  "baselineCallCount": 0,
  "sessionCallCount": 3,
  "driftScore": 0.94,
  "action": "blocked",
  "timestamp": "2026-04-16T14:23:00Z"
}
Policy Templates
ShieldAgent ships with pre-built policy templates for common security scenarios. Apply them via the dashboard or API:
// List available templates
const templates = await client.policyTemplates.list();
// Apply the "EU AI Act high-risk agent" template
await client.policyTemplates.apply("<template-id>", {
  agentId: "<agent-id>",
});
OWASP Top 10 for LLMs: Covers all OWASP LLM risks, including injection, data poisoning, and supply chain.
EU AI Act High-Risk: Full Annex IV evidence collection and human review triggers.
Minimal footprint: Read-only tool allowlist and strict session limits.
DevOps agent: Controlled bash/git access with a destructive command blocklist.
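Applying a template requires its id, so a small lookup by name can bridge the gap between the list above and the `apply` call. This is a sketch under assumptions: the `PolicyTemplateSummary` shape (an `id` and a `name` field), the `findTemplateId` helper, and the example ids are hypothetical, since the fields returned by `policyTemplates.list()` are not documented here.

```typescript
// Hypothetical template shape: `id` and `name` are assumed fields,
// not a documented part of the ShieldAgent SDK response.
interface PolicyTemplateSummary {
  id: string;
  name: string;
}

// Returns the id of the template whose name matches (case-insensitive),
// or undefined when no template has that name.
function findTemplateId(
  templates: PolicyTemplateSummary[],
  name: string,
): string | undefined {
  const wanted = name.trim().toLowerCase();
  return templates.find((t) => t.name.trim().toLowerCase() === wanted)?.id;
}

// Made-up ids for illustration; the names are the template names listed above.
const exampleTemplates: PolicyTemplateSummary[] = [
  { id: "tpl_example_1", name: "OWASP Top 10 for LLMs" },
  { id: "tpl_example_2", name: "EU AI Act High-Risk" },
  { id: "tpl_example_3", name: "Minimal footprint" },
  { id: "tpl_example_4", name: "DevOps agent" },
];

const templateId = findTemplateId(exampleTemplates, "devops agent");
console.log(templateId); // prints: tpl_example_4
```

If the real response matches this shape, the returned id could be passed as the first argument to `client.policyTemplates.apply`; handling the `undefined` case first avoids applying a template that does not exist.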