2026-01-09 · Authensor

Step-by-Step Migration from Uncontrolled Agents to SafeClaw

Overview

This guide covers the process of migrating from AI agents operating without action-level safety controls to a fully enforced SafeClaw deployment. The migration follows a four-phase approach: audit current agent behavior, design policies based on observed actions, validate in simulation mode, and enable enforcement. The process is designed to be non-disruptive — agents continue operating normally during the audit and simulation phases.

Most teams complete the migration in 1-3 days for a single agent and 1-2 weeks for a multi-agent environment. No code changes to the agents themselves are required.

Step-by-Step Process

Step 1: Inventory All Active Agents

Identify every AI agent operating in your environment. For each agent, document:

Step 2: Install SafeClaw

Run npx @authensor/safeclaw to install. The setup wizard walks through initial configuration via the browser dashboard. No credit card is required — the free tier includes 7-day renewable keys. The installation adds zero third-party dependencies to your environment.

Step 3: Enable Audit-Only Mode

Before writing any policy rules, run SafeClaw in audit-only mode. In this mode, SafeClaw logs every action every agent attempts but does not block anything. This provides a baseline of actual agent behavior.
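As a rough illustration of what a baseline can look like, the sketch below counts attempted actions per agent. It assumes the audit trail is available as JSON Lines with `agent`, `action`, and `target` fields. That format, the field names, and the agent name `ci-bot` are hypothetical and are not taken from SafeClaw's documentation.

```python
import json
from collections import Counter

def baseline(audit_lines):
    """Count attempted actions per (agent, action) pair.

    Assumes each line is a JSON object with "agent" and "action" keys
    (a hypothetical export format, used here only for illustration).
    """
    counts = Counter()
    for line in audit_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        entry = json.loads(line)
        counts[(entry["agent"], entry["action"])] += 1
    return counts

# Hypothetical sample entries standing in for a real audit export.
sample = [
    '{"agent": "ci-bot", "action": "file_read", "target": "src/main.py"}',
    '{"agent": "ci-bot", "action": "file_read", "target": "README.md"}',
    '{"agent": "ci-bot", "action": "shell_exec", "target": "npm test"}',
]

counts = baseline(sample)
for (agent, action), n in counts.most_common():
    print(agent, action, n)
```

A frequency table like this makes it easier to see which actions dominate an agent's normal work before any rules are written.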

Run audit-only mode for a minimum of 24 hours, ideally covering a full work cycle. The audit trail records:

Step 4: Analyze the Audit Log

Export the audit log from the browser dashboard. Categorize every observed action into three groups:

| Category | Definition | Example |
|----------|-----------|---------|
| Expected and safe | Actions within the agent's intended purpose | Reading source files, running tests |
| Expected but risky | Intended actions that carry risk if unsupervised | Database migrations, production deployments |
| Unexpected | Actions outside the agent's intended scope | Reading credential files, network calls to unknown endpoints |

This categorization directly maps to SafeClaw policy decisions: ALLOW for expected-safe, REQUIRE_APPROVAL for expected-risky, DENY for unexpected.
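The mapping above can be sketched as a small lookup table. The category names and the example actions come from the table. The Python names (`Category`, `DECISION_FOR`, `draft_decisions`) are illustrative only and are not part of SafeClaw.

```python
from enum import Enum

class Category(Enum):
    EXPECTED_SAFE = "expected and safe"
    EXPECTED_RISKY = "expected but risky"
    UNEXPECTED = "unexpected"

# Mapping stated in the guide: safe -> ALLOW, risky -> REQUIRE_APPROVAL,
# unexpected -> DENY.
DECISION_FOR = {
    Category.EXPECTED_SAFE: "ALLOW",
    Category.EXPECTED_RISKY: "REQUIRE_APPROVAL",
    Category.UNEXPECTED: "DENY",
}

def draft_decisions(categorized):
    """Turn {observed action: Category} into {observed action: decision}."""
    return {action: DECISION_FOR[cat] for action, cat in categorized.items()}

# Examples taken from the categorization table.
observed = {
    "reading source files": Category.EXPECTED_SAFE,
    "running tests": Category.EXPECTED_SAFE,
    "database migrations": Category.EXPECTED_RISKY,
    "reading credential files": Category.UNEXPECTED,
}

drafts = draft_decisions(observed)
for action, decision in drafts.items():
    print(f"{decision:17} {action}")
```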

Step 5: Design Your Policy

Write policy rules based on the audit analysis. Start with deny-by-default (SafeClaw's default behavior) and add explicit rules:

  1. Write DENY rules for every action category you identified as unexpected
  2. Write REQUIRE_APPROVAL rules for every action category you identified as risky
  3. Write ALLOW rules for every action category you identified as safe
  4. Review the policy for completeness: any action not covered by an explicit rule will be denied by default

See the Policy Rule Syntax Reference for the full rule format.
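To make the deny-by-default idea concrete, here is a minimal Python model of rule evaluation. It is not SafeClaw's rule syntax or engine. The first-match ordering, the glob matching via `fnmatchcase`, and the action name `shell_exec` are assumptions made for illustration (`file_read` is the only action name that appears in this guide).

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase

@dataclass(frozen=True)
class Rule:
    action: str    # action type, e.g. "file_read"
    target: str    # glob pattern matched against the action's target
    decision: str  # "ALLOW", "DENY", or "REQUIRE_APPROVAL"

def evaluate(rules, action, target):
    """Return the decision of the first matching rule (assumed semantics).

    Falls back to DENY when no rule matches, which models deny-by-default.
    """
    for rule in rules:
        if rule.action == action and fnmatchcase(target, rule.target):
            return rule.decision
    return "DENY"

# Rules listed in the order the steps suggest: DENY, REQUIRE_APPROVAL, ALLOW.
policy = [
    Rule("file_read", "*.env", "DENY"),
    Rule("shell_exec", "npm run deploy*", "REQUIRE_APPROVAL"),
    Rule("file_read", "src/*", "ALLOW"),
]

print(evaluate(policy, "file_read", "src/app/main.py"))
print(evaluate(policy, "file_read", "config/.env"))
print(evaluate(policy, "shell_exec", "npm run deploy:prod"))
print(evaluate(policy, "file_read", "docs/notes.md"))  # no rule matches
```

The last call shows the completeness check in step 4: a read outside `src/` has no explicit rule, so the model denies it.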

Step 6: Validate in Simulation Mode

Enable simulation mode. In simulation mode, SafeClaw evaluates every action against your policy and logs what the decision would be, but does not enforce it. Agents continue operating without interruption.
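Conceptually, simulation mode records a would-be decision for each action while still letting the action run. The sketch below models that behavior with a toy policy function. The function names, the log fields, and the `shell_exec` action name are hypothetical and do not reflect SafeClaw's internals or log format.

```python
from collections import Counter

def simulate(decide, attempted):
    """Evaluate each (action, target) pair but never block it.

    Returns a log of would-be decisions. "executed" is always True
    because simulation does not enforce anything.
    """
    log = []
    for action, target in attempted:
        log.append({
            "action": action,
            "target": target,
            "would_be": decide(action, target),
            "executed": True,
        })
    return log

def toy_policy(action, target):
    """Stand-in policy for illustration; deny-by-default at the end."""
    if action == "file_read" and target.startswith("src/"):
        return "ALLOW"
    if action == "shell_exec" and target.startswith("npm run deploy"):
        return "REQUIRE_APPROVAL"
    return "DENY"

attempted = [
    ("file_read", "src/index.ts"),
    ("shell_exec", "npm run deploy"),
    ("file_read", "docs/guide.md"),
]

sim_log = simulate(toy_policy, attempted)
summary = Counter(entry["would_be"] for entry in sim_log)
would_deny = [entry for entry in sim_log if entry["would_be"] == "DENY"]

print(dict(summary))
for entry in would_deny:
    print("would deny:", entry["action"], entry["target"])
```

Entries in `would_deny` are the ones worth reviewing before enforcement: each is either a legitimate action that needs a rule or an unexpected action the policy is correctly catching.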

Run simulation mode for at least one full work cycle (24-48 hours). Review the simulation log for:

Step 7: Enable Enforcement

Switch from simulation mode to enforcement mode. SafeClaw now evaluates every action and enforces the decision (ALLOW, DENY, or REQUIRE_APPROVAL) before the action executes.

Monitor the audit trail closely for the first 48 hours after enforcement. Watch for:

Step 8: Iterate and Refine

Policy tuning is ongoing. After the first week of enforcement, review the audit trail and adjust:

Checklist

Common Mistakes

1. Skipping the audit phase. Writing policies based on assumptions about agent behavior, rather than observed behavior, leads to false denials that disrupt workflows or gaps that miss real risks. Always audit first.

2. Starting with allow-by-default. Some teams try to start permissive and add restrictions later. This defeats the purpose of safety controls. SafeClaw's deny-by-default architecture exists because unknown actions are the highest risk. Start restrictive and open up selectively.

3. Writing overly broad ALLOW rules. Rules like action: file_read, target: "**" (allow all file reads) eliminate the protection file_read gating provides. Use specific directory paths and file patterns.

4. Skipping simulation mode. Going directly from no policy to enforcement causes agent failures that frustrate developers and lead to SafeClaw being disabled. Simulation mode catches policy errors before they affect productivity.

5. Not reviewing REQUIRE_APPROVAL patterns. If the same action is approved every time, it should be an ALLOW rule. If it is denied every time, it should be a DENY rule. REQUIRE_APPROVAL rules that are not genuinely decision points create approval fatigue.
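The review described in mistake 5 can be approximated with a short script. This sketch assumes approval outcomes can be obtained as `(rule_id, outcome)` pairs. That shape, the rule identifiers, and the `min_samples` threshold are illustrative assumptions, not a SafeClaw export format.

```python
from collections import defaultdict

def review_approval_rules(approval_log, min_samples=5):
    """Suggest replacements for REQUIRE_APPROVAL rules with uniform outcomes.

    approval_log: iterable of (rule_id, outcome), where outcome is
    "approved" or "denied". Rules with fewer than min_samples decisions
    are skipped because there is too little evidence.
    """
    outcomes = defaultdict(list)
    for rule_id, outcome in approval_log:
        outcomes[rule_id].append(outcome)

    suggestions = {}
    for rule_id, results in outcomes.items():
        if len(results) < min_samples:
            continue
        if all(r == "approved" for r in results):
            suggestions[rule_id] = "ALLOW"
        elif all(r == "denied" for r in results):
            suggestions[rule_id] = "DENY"
    return suggestions

# Hypothetical approval history for three rules.
history = (
    [("run-migrations", "approved")] * 6
    + [("delete-branch", "denied")] * 5
    + [("deploy-prod", "approved")] * 4
    + [("deploy-prod", "denied")] * 2
)

suggestions = review_approval_rules(history)
print(suggestions)
```

Rules with mixed outcomes, such as `deploy-prod` here, are left alone because they are real decision points.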

Success Criteria

Migration is successful when the following conditions are met:

Cross-References

Try SafeClaw

Action-level gating for AI agents. Set it up in your browser in 60 seconds.

$ npx @authensor/safeclaw