Security Policies

Build layered policy rules for injection prevention, data loss prevention, excessive agency detection, tool allowlisting, and tool drift monitoring.

1. Prompt Injection Prevention

ShieldAgent uses an ML classifier to detect prompt injection in tool call arguments. Block calls above a confidence threshold:

policy — injection block

{
  "tenantId": "<tenant-id>",
  "agentId": null,
  "toolName": "*",
  "action": "deny",
  "conditions": [
    {
      "type": "ml_injection_score_above",
      "threshold": "<your-threshold>"
    }
  ]
}

Set a threshold that matches your security and usability requirements. Lower values increase detection sensitivity; higher values reduce false positives.
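To make the trade-off concrete, the following sketch applies two candidate thresholds to a handful of made-up scores. The sample scores, the `shouldBlock` and `blockedCount` helpers, and the assumptions that scores lie between 0 and 1 and that "above" means strictly greater are illustrative only; none of this is part of the ShieldAgent API.

```typescript
// Hypothetical classifier scores; assumed here to lie in [0, 1].
const sampleScores: number[] = [0.12, 0.48, 0.67, 0.91];

// Mirrors the "ml_injection_score_above" condition under the assumption
// that a call is blocked when its score is strictly above the threshold.
function shouldBlock(score: number, threshold: number): boolean {
  return score > threshold;
}

// Counts how many of the sample calls a given threshold would block.
function blockedCount(scores: number[], threshold: number): number {
  return scores.filter((s) => shouldBlock(s, threshold)).length;
}

const strict = blockedCount(sampleScores, 0.4);  // lower threshold: 3 of 4 blocked
const lenient = blockedCount(sampleScores, 0.8); // higher threshold: 1 of 4 blocked
console.log({ strict, lenient });
```

Replaying recorded scores from your own traffic through a check like this is one way to pick a starting threshold before enforcing it.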

2. Data Loss Prevention

Prevent agents from exfiltrating secrets, PII, or financial data through tool calls:

policy — block SSH key exfiltration

{
  "tenantId": "<tenant-id>",
  "toolName": "send_email",
  "action": "deny",
  "conditions": [
    {
      "type": "param_matches_pattern",
      "param": "arguments.body",
      "pattern": "-----BEGIN (RSA|EC|OPENSSH) PRIVATE KEY-----"
    }
  ]
}
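Before relying on a pattern like the one above, it can help to check it locally against representative strings. The sketch below compiles the same expression as a JavaScript `RegExp`; it assumes ShieldAgent's pattern engine behaves like a standard regex engine, and the sample strings are invented for illustration.

```typescript
// Same pattern as in the policy above, compiled as a JavaScript RegExp.
// Assumption: the policy engine interprets it like a standard regex.
const privateKeyPattern = /-----BEGIN (RSA|EC|OPENSSH) PRIVATE KEY-----/;

// Invented sample bodies; the key material is deliberately truncated.
const samples: Record<string, string> = {
  rsa: "-----BEGIN RSA PRIVATE KEY-----\nMIIE...",
  openssh: "here is my key: -----BEGIN OPENSSH PRIVATE KEY-----",
  pkcs8: "-----BEGIN PRIVATE KEY-----\nMIIE...",
  publicKey: "-----BEGIN PUBLIC KEY-----",
  plain: "Quarterly report attached.",
};

// Record which samples the pattern would flag.
const matches: Record<string, boolean> = {};
for (const name of Object.keys(samples)) {
  matches[name] = privateKeyPattern.test(samples[name]);
}
console.log(matches);
```

In this check the `rsa` and `openssh` samples match, while the unlabeled PKCS#8 header (`-----BEGIN PRIVATE KEY-----`) does not. If that format matters in your environment, making the algorithm label optional, for example `-----BEGIN ((RSA|EC|OPENSSH) )?PRIVATE KEY-----`, is one way to cover it.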

policy — shadow mode for PII in write_file

{
  "tenantId": "<tenant-id>",
  "toolName": "write_file",
  "action": "shadow",
  "conditions": [
    {
      "type": "param_contains_pii",
      "sensitivity": "high"
    }
  ]
}

3. Excessive Agency Detection

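Catch agents that call potentially destructive tools, such as bash, at an abnormal frequency within a session. The policy below applies to every bash call and denies further calls once a session exceeds 50 calls within a one-hour window: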

policy — rate limit destructive bash commands

{
  "tenantId": "<tenant-id>",
  "toolName": "bash",
  "action": "deny",
  "conditions": [
    {
      "type": "session_call_count_above",
      "threshold": 50,
      "window": "1h"
    }
  ]
}
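The behavior of a count-above-threshold condition over a time window can be sketched with a small sliding-window counter. This is an illustration of the general technique, not ShieldAgent's implementation: the `createSessionCounter` helper is hypothetical, and whether the product uses a sliding or fixed window is an assumption here.

```typescript
type Decision = "allow" | "deny";

// Returns a function that records one call per invocation and reports
// "deny" once the number of calls inside the window exceeds the threshold.
function createSessionCounter(threshold: number, windowMs: number) {
  let timestamps: number[] = [];
  return function record(nowMs: number): Decision {
    const cutoff = nowMs - windowMs;
    timestamps = timestamps.filter((t) => t > cutoff); // drop calls outside the window
    timestamps.push(nowMs);
    return timestamps.length > threshold ? "deny" : "allow";
  };
}

const ONE_HOUR_MS = 60 * 60 * 1000;
const record = createSessionCounter(50, ONE_HOUR_MS);

// Simulate 51 bash calls, one per second: calls 1 to 50 pass, call 51 is denied.
const decisions: Decision[] = [];
for (let i = 0; i < 51; i++) {
  decisions.push(record(i * 1000));
}

// A call made two hours later falls into a fresh window and is allowed again.
const later = record(51 * 1000 + 2 * ONE_HOUR_MS);
console.log(decisions[49], decisions[50], later);
```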

Excessive agency detection also fires automatically when an agent's excessive agency score exceeds the configured risk threshold; no per-tool policy is required.

4. Tool Allowlist Pattern

The safest policy pattern is to explicitly allow only the tools an agent needs and deny everything else. Because implicit deny is already the default, no tenant-wide deny-all rule is needed; create per-agent allow rules only:

typescript

import ShieldAgent from '@shieldagent/sdk';

const client = new ShieldAgent();

// Implicit deny is already the default — no deny-all rule needed.
// Allow specific tools per agent:
for (const tool of ["read_file", "list_files", "search_web"]) {
  await client.policies.create({
    agentId: "<agent-id>",
    toolName: tool,
    action: "allow",
  });
}
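The effect of allow rules combined with implicit deny can be modeled locally. The sketch below is a simplified stand-in for policy evaluation, not the SDK's logic: the `AllowRule` shape, the `decide` helper, and the `agent-a` and `agent-b` identifiers are hypothetical.

```typescript
// Hypothetical rule shape, loosely modeled on the policy fields used above.
interface AllowRule {
  agentId: string;
  toolName: string;
  action: "allow";
}

// One allow rule per tool for a single hypothetical agent.
const rules: AllowRule[] = ["read_file", "list_files", "search_web"].map(
  (toolName): AllowRule => ({ agentId: "agent-a", toolName, action: "allow" })
);

// Implicit deny: a call is allowed only if some rule matches both agent and tool.
function decide(agentId: string, toolName: string): "allow" | "deny" {
  const matched = rules.some(
    (r) => r.agentId === agentId && r.toolName === toolName
  );
  return matched ? "allow" : "deny";
}

console.log(decide("agent-a", "read_file")); // allowlisted tool for this agent
console.log(decide("agent-a", "bash"));      // tool not on the allowlist
console.log(decide("agent-b", "read_file")); // rule belongs to a different agent
```

The point of the pattern is that a newly introduced tool is denied until someone adds a rule for it, rather than allowed until someone notices it.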

5. Tool Drift Monitoring

Tool drift occurs when an agent starts calling tools outside its established baseline. The example below lists recent drift events for an agent and blocks newly discovered tools until they are explicitly approved:

typescript

import ShieldAgent from '@shieldagent/sdk';

const client = new ShieldAgent();

// View tool drift events for an agent
const driftEvents = await client.auditEvents.list({
  agentId: "<agent-id>",
  eventType: "tool_drift",
  limit: 20,
});

// Block new tools until explicitly approved
await client.agents.update("<agent-id>", {
  blockToolDiscovery: true,
});

Tool drift event

{
  "eventType": "tool_drift_detected",
  "agentId": "agt_01HXYZ...",
  "toolName": "delete_database",
  "baselineCallCount": 0,
  "sessionCallCount": 3,
  "driftScore": 0.94,
  "action": "blocked",
  "timestamp": "2026-04-16T14:23:00Z"
}
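Events of this shape lend themselves to simple triage. The sketch below types the sample event and flags tools that were never seen in the baseline and carry a high drift score. The field names follow the sample above; the `ToolDriftEvent` interface, the `needsReview` helper, and the 0.9 cutoff are assumptions for illustration.

```typescript
// Field names follow the sample event above; the interface itself is an assumption.
interface ToolDriftEvent {
  eventType: string;
  agentId: string;
  toolName: string;
  baselineCallCount: number;
  sessionCallCount: number;
  driftScore: number;
  action: string;
  timestamp: string;
}

// Hypothetical triage rule: flag tools absent from the baseline
// whose drift score meets a chosen cutoff.
function needsReview(event: ToolDriftEvent, minScore: number): boolean {
  return event.baselineCallCount === 0 && event.driftScore >= minScore;
}

const sample: ToolDriftEvent = JSON.parse(`{
  "eventType": "tool_drift_detected",
  "agentId": "agt_01HXYZ...",
  "toolName": "delete_database",
  "baselineCallCount": 0,
  "sessionCallCount": 3,
  "driftScore": 0.94,
  "action": "blocked",
  "timestamp": "2026-04-16T14:23:00Z"
}`);

const flagged = needsReview(sample, 0.9);
console.log(sample.toolName, flagged);
```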

Policy Templates

ShieldAgent ships with pre-built policy templates for common security scenarios. Apply them via the dashboard or API:

typescript

import ShieldAgent from '@shieldagent/sdk';

const client = new ShieldAgent();

// List available templates
const templates = await client.policyTemplates.list();

// Apply the "EU AI Act high-risk agent" template
await client.policyTemplates.apply("<template-id>", {
  agentId: "<agent-id>",
});

OWASP Top 10 for LLMs: covers all OWASP LLM risks, including injection, data poisoning, and supply chain.

EU AI Act High-Risk: full Annex IV evidence collection plus human review triggers.

Minimal footprint: read-only tool allowlist plus strict session limits.

DevOps agent: controlled bash/git access with a destructive command blocklist.