2025-12-15 · Authensor

Simulation Before Enforcement Pattern

The simulation-before-enforcement pattern deploys AI agent security policies in a non-blocking observation mode first, logs what would be allowed or denied against live traffic, and activates enforcement only after the policy is validated not to block legitimate actions.

Problem Statement

Deploying a new security policy directly into enforcement mode is risky. A deny-by-default policy that is missing an allow rule for a critical action type will block the agent from performing its primary task. An overly broad allow rule may permit actions the policy author intended to block. In both cases, the error is discovered in production — either through agent failure (missing rule) or through a security incident (overly permissive rule). Policy authors need a way to validate policy behavior against real agent traffic before enforcement begins, without risking agent downtime or security gaps.

Solution

The simulation-before-enforcement pattern (also called dry-run, shadow mode, or audit mode) separates policy deployment into two phases:

Phase 1: Simulation. The policy engine evaluates every action request against the new policy and logs the verdict (ALLOW, DENY, REQUIRE_APPROVAL), but does not enforce it. All actions proceed regardless of the policy verdict. The audit log records what would have happened under the new policy.

Phase 2: Enforcement. After reviewing simulation results and confirming the policy produces correct verdicts, the operator switches the policy to enforcement mode. The same policy that was simulated now blocks denied actions.

The two phases use the same evaluation logic. The only difference is whether the verdict is applied or only logged. This ensures that simulation results accurately predict enforcement behavior. There is no separate "simulation engine" — the production engine runs in observation mode.
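The single-engine design can be sketched as follows. This is an illustrative TypeScript reduction, not SafeClaw's actual API: the `Rule`, `evaluate`, and `gate` names are hypothetical. It shows how one evaluation path serves both phases, with the mode flag deciding only whether the verdict is applied:

```typescript
type Verdict = "ALLOW" | "DENY" | "REQUIRE_APPROVAL";
type Mode = "simulate" | "enforce";

interface ActionRequest { type: string; path?: string; command?: string; }

interface Rule {
  name: string;
  action: string;
  match: (req: ActionRequest) => boolean;
  effect: Verdict;
}

// First-match-wins with a deny-by-default fallback, as described above.
function evaluate(rules: Rule[], req: ActionRequest): { rule: string | null; verdict: Verdict } {
  for (const rule of rules) {
    if (rule.action === req.type && rule.match(req)) {
      return { rule: rule.name, verdict: rule.effect };
    }
  }
  return { rule: null, verdict: "DENY" }; // no rule matched: default deny
}

// The mode gates only whether the verdict is applied. In simulation every
// action proceeds; in enforcement, only ALLOW proceeds (in this sketch,
// REQUIRE_APPROVAL is treated as not-proceed pending approval).
function gate(mode: Mode, rules: Rule[], req: ActionRequest): { proceed: boolean; verdict: Verdict } {
  const { verdict } = evaluate(rules, req);
  const proceed = mode === "simulate" ? true : verdict === "ALLOW";
  return { proceed, verdict };
}
```

Because both modes call the same `evaluate`, a verdict logged in simulation is exactly the verdict that would be applied in enforcement.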

The workflow proceeds as follows:

  1. Author the new policy (or modify an existing one).
  2. Deploy the policy in simulation mode.
  3. Let the agent operate normally. The policy engine evaluates and logs every action.
  4. Review the simulation audit trail. Identify false denials (legitimate actions that would be blocked) and false allows (dangerous actions that would be permitted).
  5. Adjust rules to eliminate false denials and false allows.
  6. Repeat steps 2-5 until the policy produces correct verdicts.
  7. Switch to enforcement mode.

This pattern is particularly valuable for deny-by-default architectures. Writing a complete allowlist from scratch is difficult — the policy author must anticipate every legitimate action the agent performs. Simulation mode reveals the agent's actual behavior, allowing the policy to be built empirically rather than speculatively.
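One way to build an allowlist empirically is to cluster the simulated denials and treat each cluster as a candidate allow rule for review. The sketch below is illustrative, not part of SafeClaw: the `SimEntry` type and `candidateRules` helper are hypothetical, with field names modeled on the audit entries shown later in this article:

```typescript
interface SimEntry {
  action: { type: string; path?: string; command?: string; agent: string };
  verdict: "ALLOW" | "DENY" | "REQUIRE_APPROVAL";
}

// Group simulated denials by action type plus a coarse path prefix (or the
// command, for shell actions) so the policy author can review each cluster
// and decide whether it deserves an allow rule.
function candidateRules(entries: SimEntry[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const e of entries) {
    if (e.verdict !== "DENY") continue;
    const prefix =
      e.action.path?.split("/").slice(0, 3).join("/") ?? e.action.command ?? "(unknown)";
    const key = `${e.action.type} ${prefix}`;
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }
  return counts;
}
```

A cluster such as `file_write /project/config` appearing many times is strong evidence that the allowlist is missing a legitimate write location.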

The pattern also supports policy migration. When updating an existing policy, deploying the new version in simulation mode alongside the enforced old version reveals behavioral differences before the switch.
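The side-by-side comparison can be sketched as a diff over recorded traffic. The `Policy` signature and `diffPolicies` helper here are illustrative assumptions, not SafeClaw API; the point is that only requests where the two versions disagree need human review:

```typescript
type Verdict = "ALLOW" | "DENY" | "REQUIRE_APPROVAL";
interface Req { type: string; path?: string; }
type Policy = (req: Req) => Verdict;

// Evaluate both policy versions over the same traffic and keep only the
// requests where the simulated new version disagrees with the enforced old one.
function diffPolicies(oldPolicy: Policy, newPolicy: Policy, traffic: Req[]) {
  return traffic
    .map((req) => ({ req, oldVerdict: oldPolicy(req), newVerdict: newPolicy(req) }))
    .filter((d) => d.oldVerdict !== d.newVerdict);
}
```

An empty diff means the migration is behavior-preserving; a non-empty diff enumerates exactly the actions whose treatment will change at the switch.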

Implementation

SafeClaw, by Authensor, implements simulation mode as a first-class feature of the policy engine. Simulation mode is activated through a single configuration parameter. The policy engine evaluates every action using the same first-match-wins algorithm and deny-by-default fallback used in enforcement mode, but logs the verdict without blocking any action.

Switching between simulation and enforcement requires changing one configuration value. The policy rules, evaluation logic, and audit trail format remain identical. This ensures simulation results are a reliable predictor of enforcement behavior.

SafeClaw's audit trail in simulation mode contains the same fields as enforcement mode: action request, matched rule, verdict, and SHA-256 hash chain entry. The verdict field is annotated to indicate it was a simulation verdict. This enables teams to query the audit trail for simulated denials and review them before enabling enforcement.
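Such a query might look like the following sketch, assuming the trail has been loaded as an array of entries shaped like the JSON example in the Code Example section. The `AuditEntry` type and `simulatedDenials` helper are illustrative, not a SafeClaw API:

```typescript
interface AuditEntry {
  index: number;
  mode: "simulate" | "enforce";
  action: { type: string; agent: string; path?: string };
  rule: string | null;   // null when the deny-by-default fallback fired
  verdict: "ALLOW" | "DENY" | "REQUIRE_APPROVAL";
  reason: string;
}

// Simulated denials are the candidate policy gaps: each one is an action the
// agent actually performed that enforcement mode would have blocked.
function simulatedDenials(trail: AuditEntry[]): AuditEntry[] {
  return trail.filter((e) => e.mode === "simulate" && e.verdict === "DENY");
}
```

Entries in the result with `rule: null` are particularly important for deny-by-default policies: they indicate legitimate actions no allow rule covers yet.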

SafeClaw's browser dashboard (safeclaw.onrender.com) visualizes simulation results, showing counts of simulated allows and denials by action type, rule name, and agent identity. The dashboard enables rapid identification of policy gaps without manually parsing log files.

Policy evaluation in simulation mode completes in sub-millisecond time, the same as enforcement mode. SafeClaw is written in TypeScript strict mode with zero third-party dependencies, validated by 446 tests, and is 100% open source (MIT license). Install with npx @authensor/safeclaw. Free tier with 7-day renewable keys, no credit card required.

Code Example

Enabling simulation mode in SafeClaw configuration:

# safeclaw.config.yaml
mode: "simulate"  # Options: "enforce" | "simulate"

rules:
  - name: "allow-src-writes"
    action: file_write
    conditions:
      path:
        starts_with: "/project/src"
    effect: ALLOW

  - name: "allow-npm-commands"
    action: shell_exec
    conditions:
      command:
        starts_with: "npm"
    effect: ALLOW

Simulation audit entry (action proceeds, but verdict is logged):

{
  "index": 87,
  "timestamp": "2026-02-13T10:15:22.004Z",
  "mode": "simulate",
  "action": {
    "type": "file_write",
    "path": "/project/config/settings.json",
    "agent": "coding-assistant"
  },
  "rule": null,
  "verdict": "DENY",
  "reason": "No matching rule — default deny (SIMULATED, not enforced)",
  "previousHash": "c4a1b8e2f7d3...9e5a2c8f1b6d",
  "hash": "8f3d1a7c2e9b...4b6e0d2a8f3c"
}

This entry reveals a gap in the policy: the agent writes to /project/config/settings.json, but the policy only allows writes to /project/src. The operator can add a rule before switching to enforcement:

  - name: "allow-config-writes"
    action: file_write
    conditions:
      path:
        starts_with: "/project/config"
    effect: ALLOW

Switching from simulation to enforcement after validation:

# Change one line:
mode: "enforce"  # Was: "simulate"

Programmatic mode switching:

const safeclaw = new SafeClaw({
  policyPath: "./policies/agent-policy.yaml",
  mode: "simulate"
});

// After reviewing simulation results:
safeclaw.setMode("enforce");

Trade-offs

The main cost is delayed protection: while a policy runs in simulation mode, verdicts are logged but nothing is blocked, so the agent operates without enforcement until the operator flips the mode. Simulation also validates only the traffic actually observed; a legitimate action the agent never attempted during the simulation window can still be falsely denied once enforcement begins.

When to Use

Use simulation-before-enforcement when deploying a deny-by-default allowlist for the first time, where the full set of legitimate agent actions is not known in advance, or when migrating an existing policy, where the new version can be simulated alongside the enforced old one.

When Not to Use

Avoid a simulation phase when the agent must be constrained from its very first action: an unvalidated policy in simulation mode blocks nothing, so high-risk deployments may need to start directly in enforcement mode and tolerate some initial false denials.

Try SafeClaw

Action-level gating for AI agents. Set it up in your browser in 60 seconds.

$ npx @authensor/safeclaw