How to Use SafeClaw with the Claude Agent SDK
SafeClaw by Authensor integrates directly with the Claude Agent SDK to gate every tool call before execution, enforcing deny-by-default policies defined in YAML. The Claude Agent SDK manages the full agent loop — tool definitions, message handling, and execution — and SafeClaw intercepts at the tool execution step, evaluating each invocation against your policy in sub-millisecond time.
How the Claude Agent SDK Works
The Claude Agent SDK provides a structured way to build Claude-powered agents with tool use. You define tools as Python or TypeScript functions, the SDK handles the message loop (sending tool definitions to Claude, receiving tool_use blocks, executing tools, and returning tool_result blocks), and Claude iterates until it produces a final response. The SDK abstracts the API mechanics but does not include a safety layer — every tool call the model proposes is executed.
Claude → tool_use block → Agent SDK dispatch → [SafeClaw Policy Check] → tool function or Deny
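In practice, Claude's proposal arrives as a tool_use content block and the agent loop answers with a matching tool_result block. The shapes below are a minimal sketch of those two blocks (the id value is illustrative):

# What Claude returns when it wants to call a tool
tool_use_block = {
    "type": "tool_use",
    "id": "toolu_example123",          # illustrative id
    "name": "read_file",
    "input": {"path": "src/index.ts"},
}

# What the agent loop sends back after running (or denying) the tool
tool_result_block = {
    "type": "tool_result",
    "tool_use_id": "toolu_example123",
    "content": "...file contents or a denial message...",
}

SafeClaw sits between these two steps: it inspects the tool_use block's name and input and decides whether the tool function is allowed to run at all.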
Quick Start
npx @authensor/safeclaw
Creates a safeclaw.yaml in your project. SafeClaw maps the Claude Agent SDK's tool names directly to policy rules.
Step 1: Define Policies for Agent SDK Tools
# safeclaw.yaml
version: 1
default: deny
policies:
  - name: "claude-sdk-file-tools"
    description: "Control file operations"
    actions:
      - tool: "read_file"
        effect: allow
        constraints:
          path_pattern: "src/**|docs/**|data/**"
      - tool: "write_file"
        effect: allow
        constraints:
          path_pattern: "src/**|output/**"
          max_size_bytes: 100000
      - tool: "list_directory"
        effect: allow
      - tool: "delete_file"
        effect: deny
  - name: "claude-sdk-shell-tools"
    description: "Restrict shell access"
    actions:
      - tool: "bash"
        effect: allow
        constraints:
          command_pattern: "npm test|npm run *|git status|git diff|npx tsc"
      - tool: "bash"
        effect: deny
        constraints:
          command_pattern: "rm -rf|sudo |curl * | sh"
      - tool: "bash"
        effect: deny
  - name: "claude-sdk-web-tools"
    description: "Control web access"
    actions:
      - tool: "web_search"
        effect: allow
      - tool: "fetch_url"
        effect: allow
        constraints:
          url_pattern: "https://docs.*|https://api.*"
      - tool: "fetch_url"
        effect: deny
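With this policy in place, you can check how individual calls resolve before wiring up the full loop. Here is a minimal sketch using the evaluate() call and decision fields shown in the integration code below; the expected outcomes follow from the policy as written:

from safeclaw import SafeClaw

safeclaw = SafeClaw("./safeclaw.yaml")

# Matches the allowed command_pattern in claude-sdk-shell-tools
print(safeclaw.evaluate("bash", {"command": "npm test"}).allowed)            # expected: True

# Matches the deny rule for destructive commands
print(safeclaw.evaluate("bash", {"command": "rm -rf /tmp/build"}).allowed)   # expected: False

# delete_file is denied outright; default: deny covers anything unlisted
print(safeclaw.evaluate("delete_file", {"path": "old.cfg"}).allowed)         # expected: False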
Step 2: Integrate with the Python Agent SDK
import anthropic
from safeclaw import SafeClaw

safeclaw = SafeClaw("./safeclaw.yaml")
client = anthropic.Anthropic()

tools = [
    {
        "name": "read_file",
        "description": "Read a file from the filesystem",
        "input_schema": {
            "type": "object",
            "properties": {"path": {"type": "string"}},
            "required": ["path"],
        },
    },
    {
        "name": "bash",
        "description": "Execute a bash command",
        "input_schema": {
            "type": "object",
            "properties": {"command": {"type": "string"}},
            "required": ["command"],
        },
    },
]

def run_agent(task: str):
    messages = [{"role": "user", "content": task}]
    while True:
        response = client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=4096,
            tools=tools,
            messages=messages,
        )
        if response.stop_reason == "end_turn":
            return response.content

        tool_results = []
        for block in response.content:
            if block.type == "tool_use":
                # Gate every proposed tool call before it runs
                decision = safeclaw.evaluate(block.name, block.input)
                if decision.allowed:
                    result = execute_tool(block.name, block.input)  # your own tool dispatcher
                else:
                    result = f"Action denied by SafeClaw: {decision.reason}"
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": result,
                })

        messages.append({"role": "assistant", "content": response.content})
        messages.append({"role": "user", "content": tool_results})
Step 3: TypeScript Agent SDK Integration
import Anthropic from "@anthropic-ai/sdk";
import { SafeClaw } from "@authensor/safeclaw";

const client = new Anthropic();
const safeclaw = new SafeClaw("./safeclaw.yaml");

// `tools` uses the same definitions as in the Python example above
async function runAgent(task: string) {
  const messages: Anthropic.MessageParam[] = [{ role: "user", content: task }];
  while (true) {
    const response = await client.messages.create({
      model: "claude-sonnet-4-20250514",
      max_tokens: 4096,
      tools,
      messages,
    });
    if (response.stop_reason === "end_turn") {
      return response.content;
    }

    const toolResults: Anthropic.ToolResultBlockParam[] = [];
    for (const block of response.content) {
      if (block.type === "tool_use") {
        // Gate every proposed tool call before it runs
        const decision = safeclaw.evaluate(block.name, block.input as Record<string, any>);
        toolResults.push({
          type: "tool_result",
          tool_use_id: block.id,
          content: decision.allowed
            ? await executeTool(block.name, block.input) // your own tool dispatcher
            : `Denied by SafeClaw: ${decision.reason}`,
        });
      }
    }

    messages.push({ role: "assistant", content: response.content });
    messages.push({ role: "user", content: toolResults });
  }
}
Step 4: Handle Claude's Adaptive Behavior
When SafeClaw denies a tool call, the denial message is returned to Claude as a tool_result. Claude can then adapt its approach — choosing a different tool, adjusting arguments, or asking the user for guidance. This creates a feedback loop where the agent respects policy boundaries:
Claude: "I'll delete the old config file" → tool_use: delete_file
SafeClaw: DENY → "Action denied: delete_file not permitted"
Claude: "I can't delete files. Instead, I'll create a new config alongside the old one."
Step 5: Audit the Full Agent Session
npx @authensor/safeclaw audit --session latest
Every tool call in the agent loop — across all iterations — is recorded with hash-chaining. The audit log shows the complete reasoning chain: what Claude proposed, what was allowed, what was denied, and how Claude adapted.
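Hash-chaining means each audit entry commits to the entry before it, so altering an earlier record breaks every later hash. The sketch below illustrates the general technique only; the field names and hashing details are assumptions, not SafeClaw's actual log format:

import hashlib
import json

def entry_hash(prev_hash: str, entry: dict) -> str:
    # Each record's hash covers the previous hash plus the record itself.
    payload = prev_hash + json.dumps(entry, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()

def verify_chain(entries: list[dict]) -> bool:
    prev = "0" * 64
    for e in entries:
        record = {k: v for k, v in e.items() if k != "hash"}
        if entry_hash(prev, record) != e["hash"]:
            return False  # tampering detected
        prev = e["hash"]
    return True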
Why SafeClaw
- 446 tests covering policy evaluation, edge cases, and audit integrity
- Deny-by-default — every tool in the Agent SDK must be explicitly allowed
- Sub-millisecond evaluation — no delay in Claude's tool use loop
- Hash-chained audit log — tamper-evident record of the complete agent session
- Works with Claude AND OpenAI — same policies apply across SDK backends
Related Pages
- How to Secure Your Claude Agent with SafeClaw
- How to Secure MCP Servers
- How to Secure Vercel AI SDK Tool Calls
- How to Add Safety Gating to LangChain Agents
Try SafeClaw
Action-level gating for AI agents. Set it up in your browser in 60 seconds.
$ npx @authensor/safeclaw