Risk Scoring

Understand how ShieldAgent computes per-agent risk scores and configure enforcement thresholds.

How Risk Scores Work

ShieldAgent computes a continuous risk score (0–100) for each agent from behavioral signals. Scores update within 100 ms of each tool call and are calculated over a rolling 7-day window with time-weighted decay.
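To make "rolling 7-day window with time-weighted decay" concrete, here is a minimal sketch of one common way to weight recent signals more heavily: an exponentially decayed average. The formula, the 24-hour half-life, and the names `Signal` and `decayedScore` are assumptions for illustration only; ShieldAgent's actual scoring formula is not documented here.

```typescript
// Hypothetical sketch: NOT ShieldAgent's documented algorithm.
interface Signal {
  value: number;       // signal strength on a 0-100 scale (assumed)
  timestampMs: number; // when the signal was observed, in epoch milliseconds
}

const WINDOW_MS = 7 * 24 * 60 * 60 * 1000; // rolling 7-day window
const HALF_LIFE_MS = 24 * 60 * 60 * 1000;  // assumed: a signal's weight halves every 24 h

function decayedScore(signals: Signal[], nowMs: number): number {
  let weightedSum = 0;
  let weightTotal = 0;
  for (const s of signals) {
    const ageMs = nowMs - s.timestampMs;
    // Ignore signals from the future or older than the window.
    if (ageMs < 0 || ageMs > WINDOW_MS) continue;
    const weight = Math.pow(0.5, ageMs / HALF_LIFE_MS);
    weightedSum += s.value * weight;
    weightTotal += weight;
  }
  // With no signals inside the window, fall back to a score of 0.
  return weightTotal === 0 ? 0 : weightedSum / weightTotal;
}
```

Under these assumptions, a signal from a day ago counts half as much as one observed just now, and anything older than seven days drops out entirely.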

Score Composition

Injection score: ML classifier confidence that inputs contain prompt injection
Tool drift score: Deviation from the agent's established tool-use baseline
Excessive agency score: Frequency and scope of high-impact tool calls per session
Anomaly score: Statistical deviation from peer agent behavior
Policy violation rate: Proportion of requests that triggered a deny rule

Score Tiers

| Score Range | Tier | Default Action |
| --- | --- | --- |
| 0 – 59 | Normal | No restrictions. Full throughput. |
| 60 – 79 | Elevated | Alert on threshold crossings. |
| 80 – 89 | High | Rate-limited. Forced into monitoring mode. |
| 90 – 100 | Critical | Auto-block + immediate alert. Manual release required. |
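The default tier boundaries can be expressed as a small client-side lookup, which may be handy for dashboards or tests. This is a sketch only: `tierForScore` is a hypothetical helper, not part of the SDK, and the lowercase tier strings other than `"elevated"` are assumed by analogy with the API response.

```typescript
// Hypothetical helper mirroring the default tier boundaries.
type Tier = "normal" | "elevated" | "high" | "critical";

function tierForScore(score: number): Tier {
  if (!Number.isFinite(score) || score < 0 || score > 100) {
    throw new RangeError(`score must be between 0 and 100, got ${score}`);
  }
  if (score >= 90) return "critical"; // 90-100: auto-block, manual release
  if (score >= 80) return "high";     // 80-89: rate-limited, monitoring mode
  if (score >= 60) return "elevated"; // 60-79: alert on threshold crossings
  return "normal";                    // 0-59: no restrictions
}
```

If thresholds are overridden per tenant or per agent, the boundaries in a helper like this would need to be read from that configuration instead of being hard-coded.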

Query Risk Scores

typescript
import ShieldAgent from '@shieldagent/sdk';

const client = new ShieldAgent();

// Get current risk scores for all agents
const scores = await client.risk.list();

// Get risk history for a specific agent (last 24h)
const history = await client.agents.getRiskHistory("agt_...", { window: "24h" });

Risk score response:
{
  "agentId": "agt_01HXYZ...",
  "score": 67,
  "tier": "elevated",
  "securityScore": 71,
  "complianceScore": 63,
  "trend": "increasing",
  "trendDelta": 14,
  "updatedAt": "2026-04-16T14:23:00Z"
}
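When consuming this response in application code, it can help to give it a type and check its shape at runtime. The sketch below takes the field names from the example response; the `RiskScore` interface, the `isRiskScore` guard, and the `needsAttention` helper (with its `deltaLimit` of 10) are illustrative assumptions rather than SDK exports, and the exact meaning of `trendDelta` is not specified here.

```typescript
// Shape inferred from the example response; not an official SDK type.
interface RiskScore {
  agentId: string;
  score: number;
  tier: string;
  securityScore: number;
  complianceScore: number;
  trend: string;
  trendDelta: number;
  updatedAt: string; // ISO 8601 timestamp
}

// Runtime guard: checks that every expected field is present with the right primitive type.
function isRiskScore(value: unknown): value is RiskScore {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  const stringFields = ["agentId", "tier", "trend", "updatedAt"];
  const numberFields = ["score", "securityScore", "complianceScore", "trendDelta"];
  return (
    stringFields.every((k) => typeof v[k] === "string") &&
    numberFields.every((k) => typeof v[k] === "number")
  );
}

// Hypothetical policy: flag agents whose score is rising by at least `deltaLimit`.
function needsAttention(r: RiskScore, deltaLimit = 10): boolean {
  return r.trend === "increasing" && r.trendDelta >= deltaLimit;
}
```

With the example response above, `isRiskScore` accepts the payload and `needsAttention` returns `true`, because the trend is `"increasing"` and `trendDelta` is 14.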

Configure Enforcement Thresholds

Override global thresholds per tenant or per agent. When a score crosses a threshold, ShieldAgent switches from monitor to block mode automatically.

typescript
// Set per-tenant risk thresholds
await client.risk.updateConfig({
  elevatedThreshold: 60,
  highThreshold: 80,
  criticalThreshold: 90,
});

autoLockThreshold: when a score crosses this threshold, the agent is suspended until a human reviewer unlocks it. Events are written to the audit trail.
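Before sending custom thresholds, it may be worth checking locally that they are in range and correctly ordered. The following is a minimal sketch: `ThresholdConfig` mirrors the three fields passed to `client.risk.updateConfig` above, while `validateThresholds` is a hypothetical helper, and the ordering rule (elevated < high < critical) is an assumption drawn from the tier table rather than a documented server-side constraint.

```typescript
// Mirrors the fields used in the updateConfig example; not an official SDK type.
interface ThresholdConfig {
  elevatedThreshold: number;
  highThreshold: number;
  criticalThreshold: number;
}

// Returns a list of human-readable problems; an empty list means the config looks sane.
function validateThresholds(c: ThresholdConfig): string[] {
  const errors: string[] = [];
  const fields: Array<[string, number]> = [
    ["elevatedThreshold", c.elevatedThreshold],
    ["highThreshold", c.highThreshold],
    ["criticalThreshold", c.criticalThreshold],
  ];
  for (const [name, value] of fields) {
    if (!Number.isFinite(value) || value < 0 || value > 100) {
      errors.push(`${name} must be a number between 0 and 100`);
    }
  }
  // Assumed rule: each tier must start above the previous one.
  if (!(c.elevatedThreshold < c.highThreshold && c.highThreshold < c.criticalThreshold)) {
    errors.push("thresholds must satisfy elevated < high < critical");
  }
  return errors;
}
```

A caller could run this check first and only call `client.risk.updateConfig` when the returned list is empty.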

Behavior Baselines

ShieldAgent learns each agent's normal tool-use pattern over a configurable warm-up period. Deviations from the baseline increase the tool drift score.

typescript
// View baseline for an agent
const baseline = await client.risk.getBaseline("agt_...");

// Reset baseline (e.g., after an agent update changes its behavior)
await client.risk.resetBaseline("agt_...");

Baseline warm-up period

Configure the warm-up period with BASELINE_WARMUP_DAYS. During warm-up, tool drift scoring is adjusted to avoid false positives on new agents.

Risk Alerts

Configure alert rules to trigger on risk score thresholds:

typescript
// Create an alert rule that fires when an agent's risk score reaches 70 or higher
await client.alerts.createRule({
  name: "High risk agent alert",
  condition: "riskScore >= 70",
  severity: "high",
  channels: ["webhook", "email"],
  webhookUrl: "https://hooks.slack.com/...",
});