How to Use SafeClaw with the Claude Agent SDK
SafeClaw by Authensor integrates directly with the Claude Agent SDK to gate every tool call before execution, enforcing deny-by-default policies defined in YAML. The Claude Agent SDK manages the full agent loop — tool definitions, message handling, and execution — and SafeClaw intercepts at the tool execution step, evaluating each invocation against your policy in sub-millisecond time.
How the Claude Agent SDK Works
The Claude Agent SDK provides a structured way to build Claude-powered agents with tool use. You define tools as Python or TypeScript functions, the SDK handles the message loop (sending tool definitions to Claude, receiving tool_use blocks, executing tools, and returning tool_result blocks), and Claude iterates until it produces a final response. The SDK abstracts the API mechanics but does not include a safety layer — every tool call the model proposes is executed.
Claude → tool_use block → Agent SDK dispatch → [SafeClaw Policy Check] → tool function or Deny
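In practice, Claude's proposal arrives as a tool_use content block and the agent loop answers with a matching tool_result block. The shapes below are a minimal sketch of those two blocks (the id value is illustrative):

# What Claude returns when it wants to call a tool
tool_use_block = {
    "type": "tool_use",
    "id": "toolu_example123",          # illustrative id
    "name": "read_file",
    "input": {"path": "src/index.ts"},
}

# What the agent loop sends back after running (or denying) the tool
tool_result_block = {
    "type": "tool_result",
    "tool_use_id": "toolu_example123",
    "content": "...file contents or a denial message...",
}

SafeClaw sits between these two steps: it inspects the tool_use block's name and input and decides whether the tool function is allowed to run at all.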
Quick Start
npx @authensor/safeclaw
Creates a safeclaw.yaml in your project. SafeClaw maps the Claude Agent SDK's tool names directly to policy rules.
Step 1: Define Policies for Agent SDK Tools
# safeclaw.yaml
version: 1
default: deny
policies:
  - name: "claude-sdk-file-tools"
    description: "Control file operations"
    actions:
      - tool: "read_file"
        effect: allow
        constraints:
          path_pattern: "src/**|docs/**|data/**"
      - tool: "write_file"
        effect: allow
        constraints:
          path_pattern: "src/**|output/**"
          max_size_bytes: 100000
      - tool: "list_directory"
        effect: allow
      - tool: "delete_file"
        effect: deny
  - name: "claude-sdk-shell-tools"
    description: "Restrict shell access"
    actions:
      - tool: "bash"
        effect: allow
        constraints:
          command_pattern: "npm test|npm run *|git status|git diff|npx tsc"
      - tool: "bash"
        effect: deny
        constraints:
          command_pattern: "rm -rf|sudo |curl * | sh"
      - tool: "bash"
        effect: deny
  - name: "claude-sdk-web-tools"
    description: "Control web access"
    actions:
      - tool: "web_search"
        effect: allow
      - tool: "fetch_url"
        effect: allow
        constraints:
          url_pattern: "https://docs.*|https://api.*"
      - tool: "fetch_url"
        effect: deny
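With this policy in place, you can check how individual calls resolve before wiring up the full loop. Here is a minimal sketch using the evaluate() call and decision fields shown in the integration code below; the expected outcomes follow from the policy as written:

from safeclaw import SafeClaw

safeclaw = SafeClaw("./safeclaw.yaml")

# Matches the allowed command_pattern in claude-sdk-shell-tools
print(safeclaw.evaluate("bash", {"command": "npm test"}).allowed)            # expected: True

# Matches the deny rule for destructive commands
print(safeclaw.evaluate("bash", {"command": "rm -rf /tmp/build"}).allowed)   # expected: False

# delete_file is denied outright; default: deny covers anything unlisted
print(safeclaw.evaluate("delete_file", {"path": "old.cfg"}).allowed)         # expected: False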
Step 2: Integrate with the Python Agent SDK
import anthropic
from safeclaw import SafeClaw

safeclaw = SafeClaw("./safeclaw.yaml")
client = anthropic.Anthropic()

tools = [
    {
        "name": "read_file",
        "description": "Read a file from the filesystem",
        "input_schema": {
            "type": "object",
            "properties": {"path": {"type": "string"}},
            "required": ["path"],
        },
    },
    {
        "name": "bash",
        "description": "Execute a bash command",
        "input_schema": {
            "type": "object",
            "properties": {"command": {"type": "string"}},
            "required": ["command"],
        },
    },
]

def run_agent(task: str):
    messages = [{"role": "user", "content": task}]
    while True:
        response = client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=4096,
            tools=tools,
            messages=messages,
        )
        if response.stop_reason == "end_turn":
            return response.content

        tool_results = []
        for block in response.content:
            if block.type == "tool_use":
                # Gate every proposed tool call before it runs
                decision = safeclaw.evaluate(block.name, block.input)
                if decision.allowed:
                    result = execute_tool(block.name, block.input)  # your own tool dispatcher
                else:
                    result = f"Action denied by SafeClaw: {decision.reason}"
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": result,
                })

        messages.append({"role": "assistant", "content": response.content})
        messages.append({"role": "user", "content": tool_results})
Step 3: TypeScript Agent SDK Integration
import Anthropic from "@anthropic-ai/sdk";
import { SafeClaw } from "@authensor/safeclaw";

const client = new Anthropic();
const safeclaw = new SafeClaw("./safeclaw.yaml");

// `tools` uses the same definitions as in the Python example above
async function runAgent(task: string) {
  const messages: Anthropic.MessageParam[] = [{ role: "user", content: task }];
  while (true) {
    const response = await client.messages.create({
      model: "claude-sonnet-4-20250514",
      max_tokens: 4096,
      tools,
      messages,
    });
    if (response.stop_reason === "end_turn") {
      return response.content;
    }

    const toolResults: Anthropic.ToolResultBlockParam[] = [];
    for (const block of response.content) {
      if (block.type === "tool_use") {
        // Gate every proposed tool call before it runs
        const decision = safeclaw.evaluate(block.name, block.input as Record<string, any>);
        toolResults.push({
          type: "tool_result",
          tool_use_id: block.id,
          content: decision.allowed
            ? await executeTool(block.name, block.input) // your own tool dispatcher
            : `Denied by SafeClaw: ${decision.reason}`,
        });
      }
    }

    messages.push({ role: "assistant", content: response.content });
    messages.push({ role: "user", content: toolResults });
  }
}
Step 4: Handle Claude's Adaptive Behavior
When SafeClaw denies a tool call, the denial message is returned to Claude as a tool_result. Claude can then adapt its approach — choosing a different tool, adjusting arguments, or asking the user for guidance. This creates a feedback loop where the agent respects policy boundaries:
Claude: "I'll delete the old config file" → tool_use: delete_file
SafeClaw: DENY → "Action denied: delete_file not permitted"
Claude: "I can't delete files. Instead, I'll create a new config alongside the old one."
Step 5: Audit the Full Agent Session
npx @authensor/safeclaw audit --session latest
Every tool call in the agent loop — across all iterations — is recorded with hash-chaining. The audit log shows the complete reasoning chain: what Claude proposed, what was allowed, what was denied, and how Claude adapted.
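Hash-chaining means each audit entry commits to the entry before it, so altering an earlier record breaks every later hash. The sketch below illustrates the general technique only; the field names and hashing details are assumptions, not SafeClaw's actual log format:

import hashlib
import json

def entry_hash(prev_hash: str, entry: dict) -> str:
    # Each record's hash covers the previous hash plus the record itself.
    payload = prev_hash + json.dumps(entry, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()

def verify_chain(entries: list[dict]) -> bool:
    prev = "0" * 64
    for e in entries:
        record = {k: v for k, v in e.items() if k != "hash"}
        if entry_hash(prev, record) != e["hash"]:
            return False  # tampering detected
        prev = e["hash"]
    return True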
Why SafeClaw
- 446 tests covering policy evaluation, edge cases, and audit integrity
- Deny-by-default — every tool in the Agent SDK must be explicitly allowed
- Sub-millisecond evaluation — no delay in Claude's tool use loop
- Hash-chained audit log — tamper-evident record of the complete agent session
- Works with Claude AND OpenAI — same policies apply across SDK backends
Related Pages
- How to Secure Your Claude Agent with SafeClaw
- How to Secure MCP Servers
- How to Secure Vercel AI SDK Tool Calls
- How to Add Safety Gating to LangChain Agents
Try SafeClaw
Action-level gating for AI agents. Set it up in your browser in 60 seconds.
$ npx @authensor/safeclaw