2025-11-10 · Authensor

SafeClaw Simulation Mode Reference

Overview

Simulation mode is a non-enforcing evaluation mode in SafeClaw where the policy engine evaluates every action request against the active policy ruleset but does not block any actions. Instead, it logs what the engine would decide — recording "would allow" or "would deny" outcomes — while permitting all actions to proceed.

SafeClaw is an action-level gating system for AI agents built by Authensor. Simulation mode enables operators to tune policies using real agent behavior before activating enforcement.

Purpose

Simulation mode addresses a critical challenge in policy deployment: writing effective rules without disrupting active agents. It provides three capabilities:

  1. Policy validation — Verify that rules produce expected outcomes for real action patterns
  2. Coverage analysis — Identify which actions are matched by rules and which fall through to deny-by-default
  3. Impact assessment — Quantify how many actions would be blocked before enabling enforcement

Enabling and Disabling Simulation Mode

Via the Dashboard

The browser dashboard provides a simulation mode toggle on the policy management page. Toggling simulation mode takes effect immediately — no restart or redeployment required.

Via Configuration

Simulation mode can be set in the SafeClaw configuration:

{
  "mode": "simulation"
}

Valid mode values:

| Value | Description |
|-------|-------------|
| "enforcement" | Normal operation — actions are allowed or blocked per policy |
| "simulation" | Non-enforcing — all actions proceed, decisions are logged only |

Via the Setup Wizard

The setup wizard (npx @authensor/safeclaw) offers simulation mode as the recommended starting mode during initial deployment:

? Select initial operating mode:
  > Simulation (recommended for new deployments)
    Enforcement

What Happens in Simulation Mode

Evaluation Pipeline

The policy engine follows the identical evaluation pipeline in both modes:

  1. Action request parsing and validation
  2. Rule matching (first-match-wins)
  3. Effect resolution
  4. Deny-by-default fallback

The only difference occurs at the enforcement boundary:

| Mode | Evaluation | Enforcement | Logging |
|------|-----------|-------------|---------|
| Enforcement | Full pipeline | Effect applied (ALLOW/DENY/REQUIRE_APPROVAL) | Actual outcome |
| Simulation | Full pipeline | All actions permitted | Simulated outcome |
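
The boundary can be pictured as a single branch after evaluation. The TypeScript sketch below is illustrative only; the type and function names are hypothetical, not SafeClaw's actual API:

// Illustrative sketch of the enforcement boundary. Type and function
// names here are hypothetical, not SafeClaw's actual API.
type Effect = "ALLOW" | "DENY" | "REQUIRE_APPROVAL";
type Mode = "enforcement" | "simulation";

interface Decision {
  matchedRule: string | null; // null means deny-by-default applied
  effect: Effect;
}

function applyBoundary(mode: Mode, decision: Decision): Effect {
  if (mode === "simulation") {
    // Record what would have happened, then let the action proceed.
    console.log(`simulated_effect=${decision.effect}, actual_outcome=ALLOWED`);
    return "ALLOW";
  }
  // Enforcement mode: the evaluated effect is applied as-is.
  return decision.effect;
}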

Logged Information

Every evaluation in simulation mode produces an audit trail entry with additional simulation fields:

{
  "id": "entry-uuid",
  "sequence": 42,
  "timestamp": "2026-02-13T14:30:00.000Z",
  "action": {
    "type": "shell_exec",
    "agent": "claude-code",
    "command": "rm -rf /tmp/build"
  },
  "evaluation": {
    "matched_rule": "deny-destructive-commands",
    "effect": "DENY",
    "evaluation_time_us": 92
  },
  "simulation": true,
  "simulated_effect": "DENY",
  "actual_outcome": "ALLOWED",
  "previous_hash": "...",
  "hash": "..."
}

Simulation-Specific Fields

| Field | Type | Description |
|-------|------|-------------|
| simulation | boolean | true when the entry was recorded in simulation mode |
| simulated_effect | string | What the engine would have decided: ALLOW, DENY, or REQUIRE_APPROVAL |
| actual_outcome | string | Always ALLOWED in simulation mode (actions are never blocked) |
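
For reference, a simulation-mode entry can be modeled with a type like the one below. The shape is transcribed from the example entry above; it is a sketch, not an official SafeClaw type definition, and the null convention for deny-by-default is an assumption:

// Sketch of an entry's shape, transcribed from the example above; this
// is not an official SafeClaw type definition.
interface SimulationAuditEntry {
  id: string;
  sequence: number;
  timestamp: string; // ISO 8601
  action: {
    type: string; // e.g. "shell_exec", "file_write"
    agent: string;
    command?: string;
  };
  evaluation: {
    matched_rule: string | null; // null assumed for deny-by-default
    effect: "ALLOW" | "DENY" | "REQUIRE_APPROVAL";
    evaluation_time_us: number;
  };
  simulation: true;          // always true for simulation entries
  simulated_effect: "ALLOW" | "DENY" | "REQUIRE_APPROVAL";
  actual_outcome: "ALLOWED"; // actions are never blocked in simulation
  previous_hash: string;
  hash: string;
}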

Would-Allow and Would-Deny Outcomes

Would-Allow

The engine evaluated the action, found a matching rule with effect ALLOW, and would have permitted the action in enforcement mode.

Action: file_write /home/user/project/src/index.ts
Matched Rule: allow-project-writes
Simulated Effect: ALLOW (would allow)
Actual Outcome: ALLOWED

Would-Deny

The engine evaluated the action, found a matching DENY rule (or no matching rule, triggering deny-by-default), and would have blocked the action in enforcement mode.

Action: file_read /etc/shadow
Matched Rule: (none — deny-by-default)
Simulated Effect: DENY (would deny)
Actual Outcome: ALLOWED (simulation mode)

Would-Require-Approval

The engine evaluated the action and found a matching rule with effect REQUIRE_APPROVAL. In enforcement mode, this action would have been held for human review.

Action: shell_exec "npm publish"
Matched Rule: approve-publish-commands
Simulated Effect: REQUIRE_APPROVAL (would require approval)
Actual Outcome: ALLOWED (simulation mode)

Using Simulation for Policy Tuning

Recommended Workflow

  1. Deploy in simulation mode — Start SafeClaw with mode: "simulation"
  2. Run agents normally — Let AI agents perform their typical workloads
  3. Review simulation logs — Use the dashboard to analyze would-allow and would-deny patterns (or script the analysis, as sketched after this list)
  4. Adjust rules — Add ALLOW rules for legitimate actions that were would-denied; add DENY rules for dangerous actions that were would-allowed
  5. Repeat — Run another simulation cycle to verify adjustments
  6. Switch to enforcement — Once the policy produces expected outcomes, enable enforcement mode
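
Step 3 can also be scripted if the audit trail is exported. The sketch below assumes entries are available as newline-delimited JSON; that export format is an assumption, not a documented SafeClaw feature:

// Sketch for step 3. Assumes the audit trail has been exported as
// newline-delimited JSON ("entries.jsonl"); the export format and the
// null matched_rule convention are assumptions.
import { readFileSync } from "node:fs";

const counts: Record<string, number> = {};
const unmatched: string[] = [];

for (const line of readFileSync("entries.jsonl", "utf8").split("\n")) {
  if (!line.trim()) continue;
  const entry = JSON.parse(line);
  if (!entry.simulation) continue; // only simulation-mode entries
  counts[entry.simulated_effect] = (counts[entry.simulated_effect] ?? 0) + 1;
  if (entry.evaluation.matched_rule === null) {
    // Fell through to deny-by-default: candidate for a new ALLOW rule.
    unmatched.push(`${entry.action.type} (${entry.action.agent})`);
  }
}

console.log(counts);    // e.g. { ALLOW: 120, DENY: 7 }
console.log(unmatched); // actions no rule matched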

Dashboard Visualization

The dashboard provides simulation-specific views:

| View | Description |
|------|-------------|
| Simulation Summary | Count of would-allow, would-deny, and would-require-approval decisions |
| Action Breakdown | Simulated outcomes grouped by action type (file_write, file_read, shell_exec, network) |
| Agent Breakdown | Simulated outcomes grouped by agent identity |
| Unmatched Actions | Actions that fell through to deny-by-default (candidates for new ALLOW rules) |
| Timeline | Chronological view of simulated decisions with filtering |

Identifying Policy Gaps

The most valuable simulation insight is the unmatched actions view. These are actions that no rule matched, meaning they would be denied by default. For each unmatched action, operators decide:

  - Legitimate: add an ALLOW rule so the action passes under enforcement
  - Dangerous: leave it to deny-by-default, or add an explicit DENY rule to make the intent visible
  - High-impact but sometimes necessary: add a REQUIRE_APPROVAL rule to route it to human review
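
As an illustration, promoting an unmatched action to an ALLOW rule might look like the following. The rule shape is hypothetical; only the effect vocabulary and the rule-naming convention come from the examples in this document:

// Hypothetical rule shape, shown for illustration only. The effect
// vocabulary (ALLOW/DENY/REQUIRE_APPROVAL) and the rule-naming style
// come from this document; the field names are assumptions.
const allowProjectWrites = {
  id: "allow-project-writes",
  action: "file_write",
  path: "/home/user/project/**", // hypothetical glob pattern
  effect: "ALLOW",
};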

Transitioning from Simulation to Enforcement

Pre-Transition Checklist

Before switching to enforcement mode, verify:

| Check | Method |
|-------|--------|
| All legitimate agent actions have matching ALLOW rules | Review unmatched actions — none should be legitimate |
| Dangerous actions are correctly denied | Verify would-deny outcomes for known-dangerous patterns |
| REQUIRE_APPROVAL rules are configured for high-impact actions | Confirm approval queue rules match operational needs |
| Policy rule ordering is correct | Review first-match-wins priority in the dashboard |
| Simulation has run for sufficient duration | Cover typical agent workloads (recommended: 24-48 hours minimum) |

Switching Modes

Change the mode in the dashboard or configuration:

{
  "mode": "enforcement"
}

The switch is immediate. From the moment enforcement mode activates, the policy engine blocks DENY actions and holds REQUIRE_APPROVAL actions. There is no gradual rollout — the transition is atomic.

Rollback

If enforcement causes unexpected disruption, switch back to simulation mode immediately via the dashboard toggle. This is also immediate and restores non-enforcing behavior.

Simulation Mode and the Audit Trail

Simulation entries are stored in the same audit trail as enforcement entries. They participate in the same SHA-256 hash chain, preserving the trail's tamper-evident integrity. The simulation: true flag distinguishes simulation entries from enforcement entries.

This means the audit trail provides a complete history across both modes — operators can trace the transition from simulation to enforcement and review the decision history on both sides.
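
Because every entry carries previous_hash and hash, chain linkage can be checked across the simulation-to-enforcement transition. A minimal sketch follows; it verifies only that each entry points at its predecessor's hash, since recomputing the hashes themselves would require SafeClaw's exact serialization, which is not specified here:

// Minimal linkage check over exported audit entries. Verifies only that
// each entry's previous_hash matches its predecessor's hash; SafeClaw's
// exact hash input/serialization is not specified in this document.
interface ChainedEntry {
  previous_hash: string;
  hash: string;
}

function chainIsLinked(entries: ChainedEntry[]): boolean {
  return entries.every(
    (entry, i) => i === 0 || entry.previous_hash === entries[i - 1].hash
  );
}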

Performance in Simulation Mode

Simulation mode has near-identical performance characteristics to enforcement mode:

| Metric | Value |
|--------|-------|
| Evaluation latency | Sub-millisecond (< 1ms) |
| Network round-trips | Zero |
| Overhead vs. enforcement | Negligible (one additional boolean field per entry) |

The only additional cost is the slightly larger audit entry due to the simulation, simulated_effect, and actual_outcome fields.

Try SafeClaw

Action-level gating for AI agents. Set it up in your browser in 60 seconds.

$ npx @authensor/safeclaw