Tool Drift Detection

Detect when MCP tool definitions change unexpectedly after registration, addressing OWASP MCP Top 10 #6 (Tool Definition Integrity). ShieldAgent hashes the full tool manifest at connection time and compares it against every subsequent tools/list response, blocking or alerting when tool drift is detected.

The Threat

OWASP MCP Top 10 #6 (Tool Definition Integrity) describes attacks where a malicious or compromised MCP server silently alters its tool definitions after an agent has already approved and cached them. Because most MCP clients trust the tool manifest they received at startup, the agent continues calling the tool — now executing a different, attacker-controlled operation.

Schema substitution

An attacker modifies a tool's JSON schema so the agent sends data to a different endpoint or with different parameters than intended.

Description poisoning

The tool's description is changed post-registration to include prompt injection payloads that are fed back to the agent on the next capability discovery call.

Supply-chain drift

A legitimate MCP server is updated by its vendor and its tool manifest changes without the security team's knowledge, breaking security assumptions.

How ShieldAgent Detects It

When an agent first connects through the proxy, ShieldAgent records the tools available on the MCP server as an approved baseline. This baseline captures the tool names, schemas, and descriptions the agent was originally authorized to use.

On every subsequent tools/list response, the current set of tools is compared against the approved baseline. Any change triggers a drift event. Depending on configuration, the response is blocked or passed through with an alert.

tools/list response → Re-hash manifest → Compare to baseline → Match: pass through | Mismatch: drift event
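The hash-and-compare flow above can be sketched as follows. This is a minimal illustration, not ShieldAgent's actual implementation: the ToolDefinition shape, the function names, the choice of SHA-256, and the decision to ignore key order and tool order are all assumptions made for the example.

```typescript
import { createHash } from "node:crypto";

// Hypothetical shape of one entry in a tools/list response.
interface ToolDefinition {
  name: string;
  description?: string;
  inputSchema: unknown;
  annotations?: Record<string, unknown>;
}

// Serialize with object keys sorted at every level, so key order alone
// never changes the hash.
function canonicalize(value: unknown): string {
  if (Array.isArray(value)) {
    return "[" + value.map((item) => canonicalize(item)).join(",") + "]";
  }
  if (value !== null && typeof value === "object") {
    const obj = value as Record<string, unknown>;
    const keys = Object.keys(obj).filter((k) => obj[k] !== undefined).sort();
    return "{" + keys.map((k) => JSON.stringify(k) + ":" + canonicalize(obj[k])).join(",") + "}";
  }
  return JSON.stringify(value ?? null);
}

// Hash the whole manifest. Tools are sorted by name first (an assumption),
// so a reordered but otherwise identical list is not reported as drift.
function hashManifest(tools: ToolDefinition[]): string {
  const sorted = tools.slice().sort((a, b) => (a.name < b.name ? -1 : a.name > b.name ? 1 : 0));
  return createHash("sha256").update(canonicalize(sorted)).digest("hex");
}

// Compare a fresh tools/list payload against the stored baseline hash.
function compareToBaseline(baselineHash: string, tools: ToolDefinition[]): "match" | "drift" {
  return hashManifest(tools) === baselineHash ? "match" : "drift";
}
```

Canonicalizing before hashing matters here: without it, a server that merely serializes the same schema with a different key order would produce a spurious drift event.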

What is monitored

ShieldAgent monitors the following aspects of each tool definition for changes:

Field	Why it matters
name	Identifies the tool. A rename indicates either a new tool or a deceptive substitution.
description	Fed to the agent as a capability hint, which makes it a prompt-injection target.
inputSchema (full JSON Schema)	Determines what data the agent sends to the tool.
annotations (if present)	Capability hints such as readOnlyHint and destructiveHint; a change here is policy-relevant.
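To illustrate how the monitored fields could translate into a per-tool diff, here is a hedged sketch. The ToolDefinition and ToolDiff types and the fingerprint and diffTools helpers are hypothetical names for this example; the changedTools, addedTools, and removedTools keys mirror the audit event format, and includeAnnotations stands in for the "Include annotations" setting.

```typescript
// Hypothetical shape of one tool definition.
interface ToolDefinition {
  name: string;
  description?: string;
  inputSchema: unknown;
  annotations?: Record<string, unknown>;
}

interface ToolDiff {
  changedTools: string[];
  addedTools: string[];
  removedTools: string[];
}

// Stable JSON: object keys sorted at every level.
function stable(value: unknown): string {
  if (Array.isArray(value)) {
    return "[" + value.map((item) => stable(item)).join(",") + "]";
  }
  if (value !== null && typeof value === "object") {
    const obj = value as Record<string, unknown>;
    const keys = Object.keys(obj).filter((k) => obj[k] !== undefined).sort();
    return "{" + keys.map((k) => JSON.stringify(k) + ":" + stable(obj[k])).join(",") + "}";
  }
  return JSON.stringify(value ?? null);
}

// Reduce a tool to the monitored fields only, so unrelated fields never count.
function fingerprint(tool: ToolDefinition, includeAnnotations: boolean): string {
  return stable({
    name: tool.name,
    description: tool.description ?? "",
    inputSchema: tool.inputSchema,
    annotations: includeAnnotations ? (tool.annotations ?? {}) : {},
  });
}

// Classify each tool name as changed, added, or removed relative to the baseline.
function diffTools(baseline: ToolDefinition[], current: ToolDefinition[], includeAnnotations = true): ToolDiff {
  const before = new Map<string, string>();
  for (const tool of baseline) before.set(tool.name, fingerprint(tool, includeAnnotations));

  const seen = new Set<string>();
  const diff: ToolDiff = { changedTools: [], addedTools: [], removedTools: [] };
  for (const tool of current) {
    seen.add(tool.name);
    const previous = before.get(tool.name);
    if (previous === undefined) diff.addedTools.push(tool.name);
    else if (previous !== fingerprint(tool, includeAnnotations)) diff.changedTools.push(tool.name);
  }
  for (const tool of baseline) {
    if (!seen.has(tool.name)) diff.removedTools.push(tool.name);
  }
  return diff;
}
```

In this sketch a rename surfaces as one removed tool plus one added tool, which is consistent with treating a renamed tool as a new tool.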

Version Tracking

ShieldAgent maintains a version history for every server's tool definitions. When you acknowledge and approve a drift event (via the dashboard or API), the updated tools become the new approved baseline. Previous baselines are preserved for audit purposes.

Version field	Description
baselineVersion	Version number. 1 = initial registration, increments with each approved change.
capturedAt	When this tool definition version was first observed.
approvedAt	When a team member approved the change. Empty if still pending review.
approvedBy	Who approved the change.
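The version lifecycle can be modelled roughly as below. This is a sketch under several assumptions: the manifestHash field and the recordDrift, approvePending, and currentBaseline helpers are hypothetical, a pending version is assumed to replace any earlier unapproved one, and the initial registration is assumed to be stored as an already approved version 1.

```typescript
// One entry in a server's baseline history. Field names follow the table
// above; manifestHash is a hypothetical extra field for this example.
interface BaselineVersion {
  baselineVersion: number;
  capturedAt: string;
  approvedAt: string | null; // null while pending review
  approvedBy: string | null;
  manifestHash: string;
}

// Record a newly observed manifest as a pending version. Any earlier
// pending (unapproved) version is dropped, so numbers only advance
// with approved changes.
function recordDrift(history: BaselineVersion[], manifestHash: string, now: Date): BaselineVersion[] {
  const approved = history.filter((v) => v.approvedAt !== null);
  const pending: BaselineVersion = {
    baselineVersion: approved.length + 1,
    capturedAt: now.toISOString(),
    approvedAt: null,
    approvedBy: null,
    manifestHash,
  };
  return approved.concat([pending]);
}

// Approve the pending version; earlier baselines stay in the history for audit.
function approvePending(history: BaselineVersion[], approver: string, now: Date): BaselineVersion[] {
  const last = history[history.length - 1];
  if (last === undefined || last.approvedAt !== null) {
    throw new Error("no pending version to approve");
  }
  const approvedVersion: BaselineVersion = {
    baselineVersion: last.baselineVersion,
    capturedAt: last.capturedAt,
    approvedAt: now.toISOString(),
    approvedBy: approver,
    manifestHash: last.manifestHash,
  };
  return history.slice(0, -1).concat([approvedVersion]);
}

// The baseline used for comparison is the most recent approved version.
function currentBaseline(history: BaselineVersion[]): BaselineVersion | undefined {
  const approved = history.filter((v) => v.approvedAt !== null);
  return approved[approved.length - 1];
}
```

Until a pending version is approved, comparisons in this model keep running against the last approved baseline, so an unreviewed change never silently becomes trusted.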

Configuration

Setting	Default	Description
Tool drift detection	true	Enable tool definition integrity checks.
Action on drift	alert	What happens when drift is detected: alert (log and audit event, response passed through) or block (reject the response).
Include annotations	true	Include MCP tool annotations in the baseline. Set to false to ignore annotation-only changes.
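As a rough illustration of how these settings combine, the sketch below maps a configuration and a detection result to a decision. The key names (enabled, actionOnDrift, includeAnnotations) and the Decision type are hypothetical; the table above gives only the setting labels and defaults.

```typescript
type DriftAction = "alert" | "block";

// Hypothetical config keys; only the defaults are taken from the table above.
interface DriftConfig {
  enabled: boolean;
  actionOnDrift: DriftAction;
  includeAnnotations: boolean;
}

const DEFAULT_CONFIG: DriftConfig = {
  enabled: true,
  actionOnDrift: "alert",
  includeAnnotations: true,
};

interface Decision {
  forwardResponse: boolean; // pass the tools/list response on to the agent?
  emitAuditEvent: boolean;  // persist a tool_drift audit event?
}

// Decide what to do with a tools/list response given the config and
// whether the comparison against the baseline found drift.
function decide(config: DriftConfig, driftDetected: boolean): Decision {
  if (!config.enabled || !driftDetected) {
    return { forwardResponse: true, emitAuditEvent: false };
  }
  // Both actions record the event; only "block" withholds the response.
  return { forwardResponse: config.actionOnDrift !== "block", emitAuditEvent: true };
}
```

With the defaults, drift produces an audit event while the response still reaches the agent; switching the action to block withholds the response as well.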

Audit Events & API

Every drift detection is persisted as a tool_drift audit event that records the previous and current manifest hashes, the names of the changed, added, and removed tools, and the action taken.

json

{
  "id": "aev_...",
  "agentId": "agt_...",
  "tenantId": "ten_...",
  "eventType": "tool_drift",
  "action": "alert",
  "riskScore": 75,
  "details": {
    "serverId": "srv_...",
    "serverName": "my-mcp-server",
    "previousHash": "a3f8...",
    "currentHash": "c912...",
    "baselineVersion": 2,
    "changedTools": ["read_file"],
    "addedTools": [],
    "removedTools": []
  },
  "timestamp": "2026-04-25T10:00:00.000Z"
}
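For consumers of these events, a typed view of the payload can help. The sketch below derives a TypeScript interface from the sample above and adds a small parser and summary helper; the interface and function names are hypothetical, and the field types are inferred from the single sample rather than from a published schema.

```typescript
// Shape inferred from the sample event above (an assumption, not a schema).
interface ToolDriftEvent {
  id: string;
  agentId: string;
  tenantId: string;
  eventType: "tool_drift";
  action: "alert" | "block";
  riskScore: number;
  details: {
    serverId: string;
    serverName: string;
    previousHash: string;
    currentHash: string;
    baselineVersion: number;
    changedTools: string[];
    addedTools: string[];
    removedTools: string[];
  };
  timestamp: string;
}

// Parse a JSON string and check just enough to reject other event types.
function parseToolDriftEvent(json: string): ToolDriftEvent {
  const raw = JSON.parse(json) as Partial<ToolDriftEvent> | null;
  if (!raw || raw.eventType !== "tool_drift" || raw.details === undefined) {
    throw new Error("not a tool_drift audit event");
  }
  return raw as ToolDriftEvent;
}

// Build a one-line summary suitable for a log line or notification.
function summarize(event: ToolDriftEvent): string {
  const d = event.details;
  const parts: string[] = [];
  if (d.changedTools.length > 0) parts.push("changed: " + d.changedTools.join(", "));
  if (d.addedTools.length > 0) parts.push("added: " + d.addedTools.join(", "));
  if (d.removedTools.length > 0) parts.push("removed: " + d.removedTools.join(", "));
  const changes = parts.length > 0 ? parts.join("; ") : "no tool-level changes";
  return `${d.serverName} v${d.baselineVersion} [${event.action}] ${changes}`;
}
```

Applied to the sample event, summarize returns "my-mcp-server v2 [alert] changed: read_file".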

SDK & Dashboard

View tool drift events

Dashboard: Audit Trail → filter by event type "tool_drift"
SDK: client.auditEvents.list({ eventType: "tool_drift" })

View tool manifest baseline

Dashboard: MCP Servers → [server] → Tool Manifest tab

Approve a drifted manifest

Dashboard: MCP Servers → [server] → Tool Manifest → Approve
SDK: client.servers.approveManifest(serverId)

Policy Integration

Use security.toolDrift.detected as a policy condition to automatically block requests when a drift event is active on any tool the agent is attempting to use:

json

{
  "name": "Block on active tool drift",
  "priority": 5,
  "conditions": [
    { "field": "security.toolDrift.detected", "op": "eq", "value": true }
  ],
  "action": "block",
  "response": {
    "code": 403,
    "message": "Tool definition has changed since last approval. Request blocked pending review."
  }
}
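To make the condition semantics concrete, here is a minimal sketch of how such a policy might be evaluated. It assumes that field is a dot-separated path into a request context object, that all conditions must hold for the policy to match, and that only the eq operator is supported; the Policy and Condition types and the lookup and matches helpers are hypothetical, and priority ordering across multiple policies is not modelled.

```typescript
interface Condition {
  field: string; // assumed to be a dot-separated path into the request context
  op: string;
  value: unknown;
}

interface Policy {
  name: string;
  priority: number;
  conditions: Condition[];
  action: string;
  response?: { code: number; message: string };
}

// Resolve a dot path such as "security.toolDrift.detected" against a context.
function lookup(context: unknown, path: string): unknown {
  let node: unknown = context;
  for (const key of path.split(".")) {
    if (node === null || typeof node !== "object") return undefined;
    node = (node as Record<string, unknown>)[key];
  }
  return node;
}

// A policy matches when every condition holds (assumed AND semantics).
// Unknown operators never match, so a typo cannot accidentally allow or block.
function matches(policy: Policy, context: unknown): boolean {
  return policy.conditions.every(
    (c) => c.op === "eq" && lookup(context, c.field) === c.value
  );
}

// The policy from the example above, expressed as a typed value.
const blockOnDrift: Policy = {
  name: "Block on active tool drift",
  priority: 5,
  conditions: [{ field: "security.toolDrift.detected", op: "eq", value: true }],
  action: "block",
  response: {
    code: 403,
    message: "Tool definition has changed since last approval. Request blocked pending review.",
  },
};
```

Under these assumptions, a request context of { security: { toolDrift: { detected: true } } } matches the policy, while a context with detected set to false, or with no security section at all, does not.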