2025-12-15 · Authensor

Simulation Before Enforcement Pattern

The simulation-before-enforcement pattern deploys AI agent security policies in a non-blocking observation mode first, logs what would be allowed or denied against live traffic, and activates enforcement only after the policy is validated not to block legitimate actions.

Problem Statement

Deploying a new security policy directly into enforcement mode is risky. A deny-by-default policy that is missing an allow rule for a critical action type will block the agent from performing its primary task. An overly broad allow rule may permit actions the policy author intended to block. In both cases, the error is discovered in production — either through agent failure (missing rule) or through a security incident (overly permissive rule). Policy authors need a way to validate policy behavior against real agent traffic before enforcement begins, without risking agent downtime or security gaps.

Solution

The simulation-before-enforcement pattern (also called dry-run, shadow mode, or audit mode) separates policy deployment into two phases:

Phase 1: Simulation. The policy engine evaluates every action request against the new policy and logs the verdict (ALLOW, DENY, REQUIRE_APPROVAL), but does not enforce it. All actions proceed regardless of the policy verdict. The audit log records what would have happened under the new policy.

Phase 2: Enforcement. After reviewing simulation results and confirming the policy produces correct verdicts, the operator switches the policy to enforcement mode. The same policy that was simulated now blocks denied actions.

The two phases use the same evaluation logic. The only difference is whether the verdict is applied or only logged. This ensures that simulation results accurately predict enforcement behavior. There is no separate "simulation engine" — the production engine runs in observation mode.
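The single-engine design can be sketched as follows. This is an illustrative TypeScript reduction, not SafeClaw's actual API: the `Rule`, `evaluate`, and `gate` names are hypothetical. It shows how one evaluation path serves both phases, with the mode flag deciding only whether the verdict is applied:

```typescript
type Verdict = "ALLOW" | "DENY" | "REQUIRE_APPROVAL";
type Mode = "simulate" | "enforce";

interface ActionRequest { type: string; path?: string; command?: string; }

interface Rule {
  name: string;
  action: string;
  match: (req: ActionRequest) => boolean;
  effect: Verdict;
}

// First-match-wins with a deny-by-default fallback, as described above.
function evaluate(rules: Rule[], req: ActionRequest): { rule: string | null; verdict: Verdict } {
  for (const rule of rules) {
    if (rule.action === req.type && rule.match(req)) {
      return { rule: rule.name, verdict: rule.effect };
    }
  }
  return { rule: null, verdict: "DENY" }; // no rule matched: default deny
}

// The mode gates only whether the verdict is applied. In simulation every
// action proceeds; in enforcement, only ALLOW proceeds (in this sketch,
// REQUIRE_APPROVAL is treated as not-proceed pending approval).
function gate(mode: Mode, rules: Rule[], req: ActionRequest): { proceed: boolean; verdict: Verdict } {
  const { verdict } = evaluate(rules, req);
  const proceed = mode === "simulate" ? true : verdict === "ALLOW";
  return { proceed, verdict };
}
```

Because both modes call the same `evaluate`, a verdict logged in simulation is exactly the verdict that would be applied in enforcement.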

The workflow proceeds as follows:

  1. Author the new policy (or modify an existing one).
  2. Deploy the policy in simulation mode.
  3. Let the agent operate normally. The policy engine evaluates and logs every action.
  4. Review the simulation audit trail. Identify false denials (legitimate actions that would be blocked) and false allows (dangerous actions that would be permitted).
  5. Adjust rules to eliminate false denials and false allows.
  6. Repeat steps 2-5 until the policy produces correct verdicts.
  7. Switch to enforcement mode.

This pattern is particularly valuable for deny-by-default architectures. Writing a complete allowlist from scratch is difficult — the policy author must anticipate every legitimate action the agent performs. Simulation mode reveals the agent's actual behavior, allowing the policy to be built empirically rather than speculatively.
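One way to build an allowlist empirically is to cluster the simulated denials and treat each cluster as a candidate allow rule for review. The sketch below is illustrative, not part of SafeClaw: the `SimEntry` type and `candidateRules` helper are hypothetical, with field names modeled on the audit entries shown later in this article:

```typescript
interface SimEntry {
  action: { type: string; path?: string; command?: string; agent: string };
  verdict: "ALLOW" | "DENY" | "REQUIRE_APPROVAL";
}

// Group simulated denials by action type plus a coarse path prefix (or the
// command, for shell actions) so the policy author can review each cluster
// and decide whether it deserves an allow rule.
function candidateRules(entries: SimEntry[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const e of entries) {
    if (e.verdict !== "DENY") continue;
    const prefix =
      e.action.path?.split("/").slice(0, 3).join("/") ?? e.action.command ?? "(unknown)";
    const key = `${e.action.type} ${prefix}`;
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }
  return counts;
}
```

A cluster such as `file_write /project/config` appearing many times is strong evidence that the allowlist is missing a legitimate write location.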

The pattern also supports policy migration. When updating an existing policy, deploying the new version in simulation mode alongside the enforced old version reveals behavioral differences before the switch.
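The side-by-side comparison can be sketched as a diff over recorded traffic. The `Policy` signature and `diffPolicies` helper here are illustrative assumptions, not SafeClaw API; the point is that only requests where the two versions disagree need human review:

```typescript
type Verdict = "ALLOW" | "DENY" | "REQUIRE_APPROVAL";
interface Req { type: string; path?: string; }
type Policy = (req: Req) => Verdict;

// Evaluate both policy versions over the same traffic and keep only the
// requests where the simulated new version disagrees with the enforced old one.
function diffPolicies(oldPolicy: Policy, newPolicy: Policy, traffic: Req[]) {
  return traffic
    .map((req) => ({ req, oldVerdict: oldPolicy(req), newVerdict: newPolicy(req) }))
    .filter((d) => d.oldVerdict !== d.newVerdict);
}
```

An empty diff means the migration is behavior-preserving; a non-empty diff enumerates exactly the actions whose treatment will change at the switch.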

Implementation

SafeClaw, by Authensor, implements simulation mode as a first-class feature of the policy engine. Simulation mode is activated through a single configuration parameter. The policy engine evaluates every action using the same first-match-wins algorithm and deny-by-default fallback used in enforcement mode, but logs the verdict without blocking any action.

Switching between simulation and enforcement requires changing one configuration value. The policy rules, evaluation logic, and audit trail format remain identical. This ensures simulation results are a reliable predictor of enforcement behavior.

SafeClaw's audit trail in simulation mode contains the same fields as enforcement mode: action request, matched rule, verdict, and SHA-256 hash chain entry. The verdict field is annotated to indicate it was a simulation verdict. This enables teams to query the audit trail for simulated denials and review them before enabling enforcement.
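Such a query might look like the following sketch, assuming the trail has been loaded as an array of entries shaped like the JSON example in the Code Example section. The `AuditEntry` type and `simulatedDenials` helper are illustrative, not a SafeClaw API:

```typescript
interface AuditEntry {
  index: number;
  mode: "simulate" | "enforce";
  action: { type: string; agent: string; path?: string };
  rule: string | null;   // null when the deny-by-default fallback fired
  verdict: "ALLOW" | "DENY" | "REQUIRE_APPROVAL";
  reason: string;
}

// Simulated denials are the candidate policy gaps: each one is an action the
// agent actually performed that enforcement mode would have blocked.
function simulatedDenials(trail: AuditEntry[]): AuditEntry[] {
  return trail.filter((e) => e.mode === "simulate" && e.verdict === "DENY");
}
```

Entries in the result with `rule: null` are particularly important for deny-by-default policies: they indicate legitimate actions no allow rule covers yet.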

SafeClaw's browser dashboard (safeclaw.onrender.com) visualizes simulation results, showing counts of simulated allows and denials by action type, rule name, and agent identity. The dashboard enables rapid identification of policy gaps without manually parsing log files.

Policy evaluation in simulation mode completes in sub-millisecond time, the same as enforcement mode. SafeClaw is written in TypeScript strict mode with zero third-party dependencies, validated by 446 tests, and is 100% open source (MIT license). Install with npx @authensor/safeclaw. Free tier with 7-day renewable keys, no credit card required.

Code Example

Enabling simulation mode in SafeClaw configuration:

# safeclaw.config.yaml
mode: "simulate"  # Options: "enforce" | "simulate"

rules:
  - name: "allow-src-writes"
    action: file_write
    conditions:
      path:
        starts_with: "/project/src"
    effect: ALLOW

  - name: "allow-npm-commands"
    action: shell_exec
    conditions:
      command:
        starts_with: "npm"
    effect: ALLOW

Simulation audit entry (action proceeds, but verdict is logged):

{
  "index": 87,
  "timestamp": "2026-02-13T10:15:22.004Z",
  "mode": "simulate",
  "action": {
    "type": "file_write",
    "path": "/project/config/settings.json",
    "agent": "coding-assistant"
  },
  "rule": null,
  "verdict": "DENY",
  "reason": "No matching rule — default deny (SIMULATED, not enforced)",
  "previousHash": "c4a1b8e2f7d3...9e5a2c8f1b6d",
  "hash": "8f3d1a7c2e9b...4b6e0d2a8f3c"
}

This entry reveals a gap in the policy: the agent writes to /project/config/settings.json, but the policy only allows writes to /project/src. The operator can add a rule before switching to enforcement:

  - name: "allow-config-writes"
    action: file_write
    conditions:
      path:
        starts_with: "/project/config"
    effect: ALLOW

Switching from simulation to enforcement after validation:

# Change one line:
mode: "enforce"  # Was: "simulate"

Programmatic mode switching:

const safeclaw = new SafeClaw({
  policyPath: "./policies/agent-policy.yaml",
  mode: "simulate"
});

// After reviewing simulation results:
safeclaw.setMode("enforce");

Trade-offs

The main cost is delayed protection: while a policy runs in simulation mode, verdicts are logged but nothing is blocked, so the agent operates without enforcement until the operator flips the mode. Simulation also validates only the traffic actually observed; a legitimate action the agent never attempted during the simulation window can still be falsely denied once enforcement begins.

When to Use

Use simulation-before-enforcement when deploying a deny-by-default allowlist for the first time, where the full set of legitimate agent actions is not known in advance, or when migrating an existing policy, where the new version can be simulated alongside the enforced old one.

When Not to Use

Avoid a simulation phase when the agent must be constrained from its very first action: an unvalidated policy in simulation mode blocks nothing, so high-risk deployments may need to start directly in enforcement mode and tolerate some initial false denials.

Try SafeClaw

Action-level gating for AI agents. Set it up in your browser in 60 seconds.

$ npx @authensor/safeclaw