Risk Scoring

Understand how ShieldAgent computes per-agent risk scores and configure enforcement thresholds.

How Risk Scores Work

ShieldAgent computes a continuous risk score (0–100) for each agent based on behavioral signals. Scores update within 100 ms of each tool call and are calculated over a rolling 7-day window with time-weighted decay, so recent behavior counts more than older behavior.
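The exact decay function is not specified here. As a rough illustration only, the sketch below assumes exponential decay with a hypothetical 48-hour half-life applied inside the 7-day window. `Signal`, `decayedScore`, and `HALF_LIFE_HOURS` are illustrative names and are not part of the ShieldAgent SDK.

```typescript
// Hypothetical sketch: time-weighted average of signal values over a rolling window.
// Exponential decay and the 48-hour half-life are assumptions, not documented behavior.
interface Signal {
  value: number;     // risk contribution of one event, 0–100
  timestamp: number; // epoch milliseconds
}

const WINDOW_MS = 7 * 24 * 60 * 60 * 1000; // rolling 7-day window
const HALF_LIFE_HOURS = 48;                 // assumed half-life

function decayedScore(signals: Signal[], now: number): number {
  let weightedSum = 0;
  let weightTotal = 0;
  for (const s of signals) {
    const ageMs = now - s.timestamp;
    if (ageMs < 0 || ageMs > WINDOW_MS) continue; // skip events outside the window
    const ageHours = ageMs / 3600000;
    // Weight halves every HALF_LIFE_HOURS, so newer events dominate.
    const weight = Math.pow(0.5, ageHours / HALF_LIFE_HOURS);
    weightedSum += s.value * weight;
    weightTotal += weight;
  }
  return weightTotal === 0 ? 0 : weightedSum / weightTotal;
}
```

Under these assumptions, an event from two days ago carries half the weight of one that just happened, and anything older than seven days is ignored.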

Score Composition

Injection score: ML classifier confidence that inputs contain prompt injection

Tool drift score: Deviation from the agent's established tool-use baseline

Excessive agency score: Frequency and scope of high-impact tool calls per session

Anomaly score: Statistical deviation from peer agent behavior

Policy violation rate: Proportion of requests that triggered a deny rule
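How the five components are combined into the overall score is not documented in this section. One plausible shape is a weighted sum, sketched below. The weights in `ASSUMED_WEIGHTS` are invented for illustration, and `ComponentScores` and `combineScores` are hypothetical names; the sketch also assumes the first four components are already on a 0–100 scale and the policy violation rate is a proportion between 0 and 1.

```typescript
// Hypothetical sketch of a weighted composite score. The weights are assumptions.
interface ComponentScores {
  injection: number;           // 0–100
  toolDrift: number;           // 0–100
  excessiveAgency: number;     // 0–100
  anomaly: number;             // 0–100
  policyViolationRate: number; // proportion, 0–1
}

// Illustrative weights only; they sum to 1 so the result stays within 0–100.
const ASSUMED_WEIGHTS = {
  injection: 0.3,
  toolDrift: 0.2,
  excessiveAgency: 0.2,
  anomaly: 0.15,
  policyViolationRate: 0.15,
};

function combineScores(c: ComponentScores): number {
  const raw =
    ASSUMED_WEIGHTS.injection * c.injection +
    ASSUMED_WEIGHTS.toolDrift * c.toolDrift +
    ASSUMED_WEIGHTS.excessiveAgency * c.excessiveAgency +
    ASSUMED_WEIGHTS.anomaly * c.anomaly +
    ASSUMED_WEIGHTS.policyViolationRate * c.policyViolationRate * 100; // scale rate to 0–100
  // Round to an integer and clamp to the documented 0–100 range.
  return Math.min(100, Math.max(0, Math.round(raw)));
}
```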

Score Tiers

Score Range	Tier	Default Action
0 – 59	Normal	No restrictions. Full throughput.
60 – 79	Elevated	Alert on threshold crossings.
80 – 89	High	Rate-limited. Forced into monitoring mode.
90 – 100	Critical	Auto-block + immediate alert. Manual release required.
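The table above can be expressed as a small lookup, which is handy for client-side display or tests. This is a sketch, not SDK code: `tierFor` and `DEFAULT_THRESHOLDS` are hypothetical names, and the lowercase tier strings other than "elevated" are assumed by analogy with the sample response.

```typescript
// Sketch: map a 0–100 score to a tier using the default boundaries from the table.
type Tier = "normal" | "elevated" | "high" | "critical";

interface TierThresholds {
  elevatedThreshold: number;
  highThreshold: number;
  criticalThreshold: number;
}

const DEFAULT_THRESHOLDS: TierThresholds = {
  elevatedThreshold: 60,
  highThreshold: 80,
  criticalThreshold: 90,
};

function tierFor(score: number, t: TierThresholds = DEFAULT_THRESHOLDS): Tier {
  if (isNaN(score) || score < 0 || score > 100) {
    throw new RangeError("score must be between 0 and 100, got " + score);
  }
  // Check from the highest tier down so each lower bound is inclusive.
  if (score >= t.criticalThreshold) return "critical";
  if (score >= t.highThreshold) return "high";
  if (score >= t.elevatedThreshold) return "elevated";
  return "normal";
}
```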

Query Risk Scores

typescript

import ShieldAgent from '@shieldagent/sdk';

const client = new ShieldAgent();

// Get current risk scores for all agents
const scores = await client.risk.list();

// Get risk history for a specific agent (last 24h)
const history = await client.agents.getRiskHistory("agt_...", { window: "24h" });

Risk score response

{
  "agentId": "agt_01HXYZ...",
  "score": 67,
  "tier": "elevated",
  "securityScore": 71,
  "complianceScore": 63,
  "trend": "increasing",
  "trendDelta": 14,
  "updatedAt": "2026-04-16T14:23:00Z"
}
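If you consume this response outside the SDK, for example from a webhook or a cached payload, a runtime check can catch malformed data early. The interface below is inferred from the sample alone and may not match the SDK's own types. `RiskScore` and `isRiskScore` are hypothetical names, and `tier` and `trend` are typed as plain strings because only one value of each appears in the sample.

```typescript
// Sketch: shape of the sample response, inferred from the example above.
interface RiskScore {
  agentId: string;
  score: number;
  tier: string;
  securityScore: number;
  complianceScore: number;
  trend: string;
  trendDelta: number;
  updatedAt: string; // ISO 8601 timestamp
}

// Runtime type guard: checks field presence, primitive types, and a parseable timestamp.
function isRiskScore(value: unknown): value is RiskScore {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  const isStr = (k: string) => typeof v[k] === "string";
  const isNum = (k: string) => typeof v[k] === "number" && isFinite(v[k] as number);
  return (
    ["agentId", "tier", "trend", "updatedAt"].every(isStr) &&
    ["score", "securityScore", "complianceScore", "trendDelta"].every(isNum) &&
    !isNaN(Date.parse(v["updatedAt"] as string))
  );
}
```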

Configure Enforcement Thresholds

Override global thresholds per tenant or per agent. When a score crosses a threshold, ShieldAgent switches from monitor to block mode automatically.

typescript

// Set per-tenant risk thresholds
await client.risk.updateConfig({
  elevatedThreshold: 60,
  highThreshold: 80,
  criticalThreshold: 90,
});

autoLockThreshold: when a score crosses this threshold, the agent is suspended until a human reviewer unlocks it. These events are written to the audit trail.
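Whether the API rejects inconsistent thresholds is not stated here, so it may be worth checking a config locally before sending it. The sketch below assumes thresholds should be strictly ascending and within 0–100, mirroring the default tiers. `validateThresholds` and `RiskThresholdConfig` are hypothetical helpers, not SDK exports.

```typescript
// Sketch: local sanity check for a threshold config before calling updateConfig.
interface RiskThresholdConfig {
  elevatedThreshold: number;
  highThreshold: number;
  criticalThreshold: number;
}

// Returns a list of problems; an empty list means the config looks consistent.
function validateThresholds(cfg: RiskThresholdConfig): string[] {
  const errors: string[] = [];
  const entries: [string, number][] = [
    ["elevatedThreshold", cfg.elevatedThreshold],
    ["highThreshold", cfg.highThreshold],
    ["criticalThreshold", cfg.criticalThreshold],
  ];
  for (const [name, value] of entries) {
    // The negated range check also catches NaN.
    if (!(value >= 0 && value <= 100)) {
      errors.push(name + " must be between 0 and 100");
    }
  }
  // Assumed invariant: tiers must not overlap or invert.
  if (!(cfg.elevatedThreshold < cfg.highThreshold && cfg.highThreshold < cfg.criticalThreshold)) {
    errors.push("thresholds must satisfy elevated < high < critical");
  }
  return errors;
}
```

A caller could run `validateThresholds(config)` and only pass `config` to `client.risk.updateConfig` when the returned list is empty.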

Behavior Baselines

ShieldAgent learns each agent's normal tool-use pattern over a configurable warm-up period. Deviations from the baseline increase the tool drift score.

typescript

// View baseline for an agent
const baseline = await client.risk.getBaseline("agt_...");

// Reset baseline (e.g., after an agent update changes its behavior)
await client.risk.resetBaseline("agt_...");

Baseline warm-up period

Configure with BASELINE_WARMUP_DAYS. During warm-up, tool drift scoring is adjusted to avoid false positives on new agents.
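For tooling that needs to know whether an agent is still warming up, the setting can be read and applied as sketched below. This assumes BASELINE_WARMUP_DAYS holds a whole number of days and that warm-up is measured from when the agent was first seen; neither detail is confirmed here. `parseWarmupDays` and `isInWarmup` are hypothetical helpers.

```typescript
// Sketch: read BASELINE_WARMUP_DAYS from an environment-like object.
// Returns undefined when the variable is unset, rather than guessing a default.
function parseWarmupDays(env: Record<string, string | undefined>): number | undefined {
  const raw = env["BASELINE_WARMUP_DAYS"];
  if (raw === undefined || raw.trim() === "") return undefined;
  const days = Number(raw);
  if (!isFinite(days) || days < 0 || Math.floor(days) !== days) {
    throw new Error('BASELINE_WARMUP_DAYS must be a non-negative integer, got "' + raw + '"');
  }
  return days;
}

// Assumption: warm-up starts when the agent is first seen and lasts warmupDays days.
function isInWarmup(firstSeen: Date, now: Date, warmupDays: number): boolean {
  const elapsedDays = (now.getTime() - firstSeen.getTime()) / 86400000;
  return elapsedDays < warmupDays;
}
```

In Node.js, `parseWarmupDays(process.env)` reads the value from the process environment.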

Risk Alerts

Configure alert rules to trigger on risk score thresholds:

typescript

// Create an alert rule for high-risk agents
await client.alerts.createRule({
  name: "High risk agent alert",
  condition: "riskScore >= 80", // High tier starts at 80 with default thresholds
  severity: "high",
  channels: ["webhook", "email"],
  webhookUrl: "https://hooks.slack.com/...",
});
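A level-based condition of the form `riskScore >= N` stays true for as long as the score remains high, while the tier table refers to alerting on threshold crossings. Whether ShieldAgent deduplicates repeated alerts is not stated here. If you process risk history yourself, upward crossings can be detected as sketched below; `crossedUpward` and `findCrossings` are hypothetical helpers.

```typescript
// Sketch: detect the moments a score moves from below a threshold to at or above it.
function crossedUpward(previous: number, current: number, threshold: number): boolean {
  return previous < threshold && current >= threshold;
}

// Returns the indices in `scores` at which an upward crossing occurred.
function findCrossings(scores: number[], threshold: number): number[] {
  const indices: number[] = [];
  let previous: number | undefined;
  scores.forEach((current, i) => {
    if (previous !== undefined && crossedUpward(previous, current, threshold)) {
      indices.push(i);
    }
    previous = current;
  });
  return indices;
}
```

Alerting only on the returned indices yields one notification per crossing instead of one per evaluation.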