Adding SafeClaw to an existing AI agent does not require rewriting the agent or changing its architecture. SafeClaw by Authensor wraps your agent's action execution path with deny-by-default gating, giving you immediate control over what the agent can do. Install it with npx @authensor/safeclaw and follow this guide to go from zero safety to production-grade action gating.
Before You Start
Take stock of your current setup:
- What actions does your agent perform? File reads/writes, shell commands, network requests, API calls, database queries. You need a complete list.
- What framework are you using? LangChain, CrewAI, Claude Agent SDK, OpenAI Assistants, or a custom framework. SafeClaw integrates with all of them.
- What is your risk tolerance? This determines whether you start in simulation mode (observe without blocking) or enforcement mode (block by default).
Step 1: Install SafeClaw
npx @authensor/safeclaw
This installs SafeClaw with zero external dependencies. No additional packages, no cloud services, no API keys required.
Step 2: Run in Simulation Mode
Start with simulation mode to understand your agent's action surface without disrupting its current behavior:
const safeclaw = require('@authensor/safeclaw');
safeclaw.init({
  mode: 'simulation',
  audit: true
});
In simulation mode, SafeClaw observes and logs every action your agent attempts but blocks nothing. Run your agent through its typical workloads long enough to exercise its full range of behavior. The audit log will show you exactly which actions your agent performs, how often, and with what parameters.
Step 3: Analyze the Audit Log
Review the simulation log to categorize your agent's actions:
- Required actions: Things the agent must do to function (reading source files, running tests, making API calls to approved endpoints)
- Nice-to-have actions: Things the agent does that are useful but not essential
- Unnecessary actions: Things the agent does that provide no value and increase risk
- Dangerous actions: Things the agent should never do (force-pushing to main, deleting system files, accessing credentials)
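The triage above is easier with a quick summary script. A minimal sketch, assuming each audit entry is a plain object with an `action` field and a `params` object; the exact record shape in SafeClaw's audit log may differ, so check your own log format:

```javascript
// Summarize audit entries by action type so the most frequent
// (likely "required") actions surface first.
// The entry shape here is an illustrative assumption.
function summarizeAudit(entries) {
  const counts = {};
  for (const { action } of entries) {
    counts[action] = (counts[action] || 0) + 1;
  }
  return Object.entries(counts).sort((a, b) => b[1] - a[1]);
}

const sample = [
  { action: 'file:read', params: { path: '/app/src/index.js' } },
  { action: 'file:read', params: { path: '/app/src/util.js' } },
  { action: 'shell:execute', params: { command: 'npm test' } },
];
console.log(summarizeAudit(sample));
// → [ [ 'file:read', 2 ], [ 'shell:execute', 1 ] ]
```

Frequency alone does not make an action "required," but it tells you where to look first when sorting actions into the four buckets.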
Step 4: Write Your Deny-by-Default Policy
SafeClaw policies follow a first-match-wins model. Start with explicit allows for required actions, then let the default deny handle everything else:
rules:
  - action: "file:read"
    path: "/app/src/**"
    effect: "allow"
  - action: "file:write"
    path: "/app/src/**"
    effect: "allow"
  - action: "shell:execute"
    command: "npm test"
    effect: "allow"
  - action: "shell:execute"
    command: "npm run build"
    effect: "allow"
  - action: "network:request"
    host: "api.example.com"
    effect: "allow"
  # Everything else is denied by default
You do not need to enumerate every dangerous action. The deny-by-default model blocks everything you did not explicitly allow.
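To make the first-match-wins model concrete, here is a minimal sketch of how such an evaluator behaves. This illustrates the model, not SafeClaw's internal engine; the rule shape mirrors the YAML policy above, and the glob handling is a deliberately simple assumption:

```javascript
// Translate a simple glob pattern: "**" matches anything, "*" stops at "/".
function globToRegExp(glob) {
  const escaped = glob.replace(/[.+^${}()|[\]\\]/g, '\\$&');
  return new RegExp(
    '^' +
      escaped
        .replace(/\*\*/g, '\u0000')   // protect "**" before handling "*"
        .replace(/\*/g, '[^/]*')
        .replace(/\u0000/g, '.*') +
      '$'
  );
}

// Walk the rules top to bottom; the first match decides.
// If nothing matches, deny -- that is the deny-by-default model.
function evaluate(rules, attempt) {
  for (const rule of rules) {
    if (rule.action !== attempt.action) continue;
    if (rule.path && !globToRegExp(rule.path).test(attempt.path)) continue;
    if (rule.command && rule.command !== attempt.command) continue;
    return rule.effect; // first matching rule wins
  }
  return 'deny';
}

const rules = [
  { action: 'file:read', path: '/app/src/**', effect: 'allow' },
  { action: 'shell:execute', command: 'npm test', effect: 'allow' },
];
console.log(evaluate(rules, { action: 'file:read', path: '/app/src/main.js' }));  // allow
console.log(evaluate(rules, { action: 'file:write', path: '/app/src/main.js' })); // deny
```

Note that file:write is denied without any explicit deny rule: it simply never matched an allow.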
Step 5: Test in Simulation Mode with Your Policy
Apply your policy in simulation mode to see what would be blocked:
safeclaw.init({
  mode: 'simulation',
  policy: './safeclaw-policy.yaml',
  audit: true
});
Run your agent through its workloads again. The audit log will now show which actions would have been allowed and which would have been blocked. If legitimate actions are being blocked, adjust your policy. If dangerous actions are being allowed, tighten your rules.
Step 6: Switch to Enforcement Mode
When your policy accurately reflects your intended permissions, switch to enforcement:
safeclaw.init({
  mode: 'enforce',
  policy: './safeclaw-policy.yaml',
  audit: true
});
Now SafeClaw actively blocks any action not permitted by your policy. The hash-chained audit trail records every decision for compliance and debugging.
Step 7: Configure Human Approval for Edge Cases
For actions that are sometimes appropriate but require judgment, configure human-in-the-loop approval:
rules:
  - action: "shell:execute"
    command: "npm publish"
    effect: "approve"
    approvers: ["team-lead"]
The agent pauses and waits for human approval before executing the action.
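The pause-until-approved control flow can be sketched with a Promise. The approval transport SafeClaw actually uses (CLI prompt, chat message, web UI) is not shown here; the names `requestApproval` and `decide` are illustrative, not SafeClaw API:

```javascript
// In-memory approval queue: the agent awaits a Promise that only
// resolves when an approver decides out of band.
const pending = new Map();
let nextId = 0;

function requestApproval(action) {
  const id = ++nextId;
  return new Promise((resolve) => {
    pending.set(id, { action, resolve });
    console.log(`approval #${id} requested for:`, action);
  });
}

// Called by the human approver (e.g., "team-lead").
function decide(id, approved) {
  const req = pending.get(id);
  pending.delete(id);
  req.resolve(approved ? 'allow' : 'deny');
}

async function main() {
  const verdict = requestApproval({ action: 'shell:execute', command: 'npm publish' });
  decide(1, true); // the team lead approves
  console.log(await verdict); // allow
}
main();
```

The key point: the agent's execution path simply awaits the verdict, so no agent logic has to know how approval is delivered.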
Common Migration Concerns
"Will this slow down my agent?" SafeClaw's policy engine evaluates actions with negligible latency. The performance impact is not measurable for most workloads.
"What if I block something my agent needs?" Start with simulation mode. Iterate on your policy before enforcing. SafeClaw's audit logs make it easy to identify false positives.
"Do I need to change my agent's code?" Minimal changes. SafeClaw wraps the action execution path, not the agent's logic.
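To show what "wrapping the action execution path" means in practice, here is a hedged sketch of a generic guard around an existing tool function. `checkPolicy` stands in for SafeClaw's decision call; the exact integration API may differ:

```javascript
// Wrap a tool function so every call is checked before it executes.
// The agent's logic and the tool itself are unchanged.
function guard(checkPolicy, toolName, toolFn) {
  return async (params) => {
    const decision = checkPolicy({ action: toolName, ...params });
    if (decision !== 'allow') {
      throw new Error(`blocked ${toolName}: ${decision}`);
    }
    return toolFn(params);
  };
}

// Toy policy check: only reads under /app/src are allowed.
const checkPolicy = (a) =>
  a.action === 'file:read' && a.path.startsWith('/app/src/') ? 'allow' : 'deny';

const readFile = async ({ path }) => `contents of ${path}`;
const guardedRead = guard(checkPolicy, 'file:read', readFile);

guardedRead({ path: '/app/src/index.js' }).then(console.log);            // allowed
guardedRead({ path: '/etc/passwd' }).catch((e) => console.log(e.message)); // blocked
```

Because the guard wraps the function boundary, swapping `readFile` for a guarded version is typically a one-line change per tool.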
"Does this work with my model provider?" SafeClaw works with Claude, OpenAI, and any provider. It is provider-agnostic.
Related reading:
- Get Started with SafeClaw in 5 Minutes
- How to Switch from Allow-by-Default to Deny-by-Default
- SafeClaw Features: Everything You Get Out of the Box
- Moving Beyond Prompt Engineering to Real Agent Safety
Try SafeClaw
Action-level gating for AI agents. Set it up in your browser in 60 seconds.
$ npx @authensor/safeclaw