2025-11-24 · Authensor

SafeClaw Policy Engine Architecture

Overview

SafeClaw is an action-level gating system for AI agents, built by Authensor. The policy engine is the core component responsible for evaluating every action an AI agent attempts to perform and returning an ALLOW, DENY, or REQUIRE_APPROVAL decision. The engine runs locally with zero network round-trips during evaluation, achieving sub-millisecond response times.

SafeClaw is 100% open source (MIT license), written in TypeScript strict mode, with zero third-party dependencies.

Engine Design

The policy engine follows a pipeline architecture with four sequential stages:

| Stage | Function | Output |
|-------|----------|--------|
| 1. Action Parsing | Validates and normalizes the incoming action request | Typed ActionRequest object |
| 2. Rule Matching | Iterates rules in priority order, evaluating conditions | First matching PolicyRule or null |
| 3. Effect Resolution | Extracts the effect from the matched rule | ALLOW, DENY, or REQUIRE_APPROVAL |
| 4. Default Fallback | Applies if no rule matched | DENY (deny-by-default) |

Each stage is a pure function with no side effects. The engine maintains no mutable state between evaluations.
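
For orientation, the sketch below shows one plausible TypeScript shape for the data moving through these stages. The Effect values and the effect and conditions fields come directly from this document; the remaining names (action_type, agent_id, resource_path, command, url, and the Condition shape) are illustrative assumptions drawn from the condition targets listed under Stage 2, not SafeClaw's actual type definitions.

enum Effect {
  ALLOW = "ALLOW",
  DENY = "DENY",
  REQUIRE_APPROVAL = "REQUIRE_APPROVAL",
}

// Illustrative request shape; the authoritative field set is in the
// Action Request Format Reference.
interface ActionRequest {
  action_type: string;
  agent_id: string;
  resource_path?: string;
  command?: string;
  url?: string;
}

interface Condition {
  field: keyof ActionRequest;                                  // which request field to inspect
  operator: "equals" | "starts_with" | "contains" | "regex";   // see the operator table below
  value: string;
}

interface PolicyRule {
  conditions: Condition[];   // all must match (logical AND)
  effect: Effect;            // ALLOW, DENY, or REQUIRE_APPROVAL
}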

Stage 1: Action Parsing

Every action request arrives as a JSON object describing the attempted action: the action type, the target (such as a resource path, command string, or URL), and the identity of the requesting agent.

The parser validates required fields, rejects malformed requests, and produces a normalized ActionRequest object. Invalid requests are rejected before rule evaluation begins. See the Action Request Format Reference for the complete specification.
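
A minimal sketch of the Stage 1 check, assuming the illustrative ActionRequest shape above; the actual required-field set and field names are defined in the Action Request Format Reference.

// Hypothetical raw input as it might arrive from an agent integration.
const exampleRequest: unknown = {
  action_type: "shell.exec",
  agent_id: "agent-7f3a",
  command: "rm -rf /tmp/build",
};

// Stage 1 sketch: validate required fields and reject malformed input
// before any rule is evaluated.
function parse(raw: unknown): ActionRequest {
  if (typeof raw !== "object" || raw === null) {
    throw new Error("malformed action request: not an object");
  }
  const record = raw as Record<string, unknown>;
  for (const field of ["action_type", "agent_id"]) {   // assumed required fields
    if (typeof record[field] !== "string") {
      throw new Error(`malformed action request: missing ${field}`);
    }
  }
  return record as unknown as ActionRequest;           // normalized, typed request
}

const parsed = parse(exampleRequest);   // succeeds; a request missing agent_id would throw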

Stage 2: Rule Matching (First-Match-Wins)

The engine evaluates rules in sequential order using a first-match-wins algorithm:

function evaluate(request: ActionRequest, rules: PolicyRule[]): Effect {
  for (const rule of rules) {
    if (matchesAllConditions(rule.conditions, request)) {
      return rule.effect;
    }
  }
  return Effect.DENY; // deny-by-default fallback
}

Rule ordering determines priority. The engine stops at the first rule whose conditions all match the action request. It does not evaluate remaining rules after a match.
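
For example, using the illustrative types sketched earlier, a narrow DENY placed ahead of a broader ALLOW wins for the paths it covers:

// Rule order is priority: the /etc/ DENY is checked before the general file.read ALLOW.
const orderedRules: PolicyRule[] = [
  { effect: Effect.DENY,  conditions: [{ field: "resource_path", operator: "starts_with", value: "/etc/" }] },
  { effect: Effect.ALLOW, conditions: [{ field: "action_type", operator: "equals", value: "file.read" }] },
];
// A read of /etc/passwd matches rule 1 and is denied; a read of /home/user/notes.txt
// skips rule 1, matches rule 2, and is allowed.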

Condition matching supports the following operators:

| Operator | Description |
|----------|-------------|
| equals | Exact string match |
| starts_with | Prefix match |
| contains | Substring match |
| regex | Regular expression match |

Conditions can target the action type, resource path, command string, URL, or agent identity. A rule matches only when all conditions in that rule evaluate to true (logical AND). See the Policy Rule Syntax Reference for the complete operator specification.
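
The matching logic implied by the table can be sketched as follows, again using the illustrative shapes above; the authoritative operator semantics are in the Policy Rule Syntax Reference.

function matchesCondition(cond: Condition, request: ActionRequest): boolean {
  const actual = request[cond.field];
  if (actual === undefined) return false;           // absent field never matches
  switch (cond.operator) {
    case "equals":      return actual === cond.value;
    case "starts_with": return actual.startsWith(cond.value);
    case "contains":    return actual.includes(cond.value);
    case "regex":       return new RegExp(cond.value).test(actual);
  }
}

// Logical AND: every condition in the rule must hold for the rule to match.
function matchesAllConditions(conditions: Condition[], request: ActionRequest): boolean {
  return conditions.every((cond) => matchesCondition(cond, request));
}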

Stage 3: Effect Resolution

When a matching rule is found, the engine returns the rule's declared effect:

| Effect | Meaning |
|--------|---------|
| ALLOW | Action proceeds without intervention |
| DENY | Action is blocked; the agent receives a denial response |
| REQUIRE_APPROVAL | Action is held pending human approval via the dashboard |

The effect is final and non-negotiable. There is no escalation or override mechanism at the engine level.

Stage 4: Deny-by-Default Fallback

If no rule matches, the engine returns DENY. This is the foundational security property of SafeClaw: every action is denied unless an explicit rule permits it.

The deny-by-default fallback is not configurable. It cannot be changed to ALLOW. This eliminates the risk of permissive misconfiguration — forgetting to write a rule never results in an unintended ALLOW.

Performance Characteristics

The policy engine is optimized for minimal latency:

| Metric | Value |
|--------|-------|
| Evaluation latency | Sub-millisecond (< 1ms) |
| Network round-trips | Zero |
| Third-party dependencies | Zero |
| Execution location | Local (in-process) |

Why Sub-Millisecond

Three design decisions enable sub-millisecond evaluation:

  1. Local execution — The engine runs in the same process as the agent. No HTTP calls, no RPC, no IPC. Policy rules are loaded into memory at startup.
  2. Zero dependencies — No external libraries are invoked during evaluation. The engine is pure TypeScript with no node_modules in the evaluation path.
  3. Linear scan with early exit — The first-match-wins algorithm stops at the first matching rule. For typical policy sets (10-50 rules), this completes in microseconds.
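
As a rough illustration (not an official benchmark), the evaluate() sketch above can be timed against a 50-rule policy set with Node's global performance API (Node 16+); on typical hardware this lands in the microsecond range, though exact numbers vary by machine.

// Build an illustrative 50-rule policy set and a request that matches rule 26.
const manyRules: PolicyRule[] = Array.from({ length: 50 }, (_, i): PolicyRule => ({
  effect: Effect.ALLOW,
  conditions: [{ field: "resource_path", operator: "starts_with", value: `/workspace/${i}/` }],
}));

const readRequest: ActionRequest = {
  action_type: "file.read",
  agent_id: "agent-7f3a",
  resource_path: "/workspace/25/report.txt",
};

const iterations = 100_000;
const start = performance.now();
for (let i = 0; i < iterations; i++) evaluate(readRequest, manyRules);
console.log(`${((performance.now() - start) / iterations).toFixed(4)} ms per evaluation`);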

No Network Round-Trips

Policy evaluation never contacts the control plane. The control plane (safeclaw.onrender.com) handles out-of-band concerns only and plays no part in per-action decisions.

During action evaluation, the engine operates entirely offline. If the control plane is unreachable, the engine continues to evaluate actions using its locally cached policy set.

Fail-Closed Behavior

The engine is designed to fail closed in all error scenarios:

| Scenario | Behavior |
|----------|----------|
| Malformed action request | DENY |
| No matching rule | DENY |
| Rule evaluation error | DENY |
| Policy file corrupted | DENY all actions |
| Engine initialization failure | All actions blocked |

There is no scenario in which an engine failure results in an ALLOW. This is verified by the test suite (446 tests across 24 files). See the Test Coverage Reference for details.
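
The invariant can be sketched as a fail-closed wrapper around the parse() and evaluate() sketches above; the real engine's error handling is more granular, but the property is the same: no failure path produces ALLOW.

// Invariant: no failure path can ever produce ALLOW.
function evaluateSafe(raw: unknown, rules: PolicyRule[]): Effect {
  try {
    const request = parse(raw);        // malformed request -> throws -> DENY
    return evaluate(request, rules);   // no matching rule -> DENY inside evaluate()
  } catch {
    return Effect.DENY;                // parsing or rule evaluation error -> DENY
  }
}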

Simulation Mode Integration

The policy engine supports a simulation mode where actions are evaluated but not enforced. In simulation mode, the engine records the decision it would have made (ALLOW, DENY, or REQUIRE_APPROVAL) without actually blocking, holding, or permitting the action. This enables policy tuning before enforcement.

Simulation mode uses the same evaluation pipeline — the only difference is that the final effect is logged rather than enforced. See the Simulation Mode Reference for the complete specification.
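
A minimal sketch of that difference, building on the evaluateSafe() sketch above; the log shape shown here is illustrative, not the actual simulation record format.

// Simulation mode: identical pipeline, but the decision is recorded, not enforced.
function simulate(raw: unknown, rules: PolicyRule[]): void {
  const decision = evaluateSafe(raw, rules);
  console.log(JSON.stringify({
    mode: "simulation",
    decision,                          // what enforcement mode would have returned
    at: new Date().toISOString(),
  }));
  // Nothing is blocked or held here; the action proceeds as it would without enforcement.
}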

Audit Trail Integration

Every evaluation produces an audit record containing the action request, the matched rule (or lack thereof), the resulting effect, and a timestamp. This record is appended to the tamper-proof audit trail secured by a SHA-256 hash chain. The audit write is synchronous with evaluation — no action result is returned without a corresponding audit entry.

See the Audit Trail Specification for the hash chain structure and verification algorithm.
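
One way such a hash-chained append can look is sketched below using Node's built-in node:crypto module; the record fields follow the paragraph above, while the exact serialization, genesis value, and verification rules are defined by the specification, not here.

import { createHash } from "node:crypto";

interface AuditRecord {
  request: ActionRequest;
  matchedRule: PolicyRule | null;   // null when the deny-by-default fallback applied
  effect: Effect;
  timestamp: string;
  prevHash: string;                 // hash of the previous record, forming the chain
  hash: string;                     // SHA-256 over this record's contents plus prevHash
}

function appendAudit(
  trail: AuditRecord[],
  request: ActionRequest,
  matchedRule: PolicyRule | null,
  effect: Effect,
): AuditRecord {
  const prevHash = trail.length > 0 ? trail[trail.length - 1].hash : "0".repeat(64); // assumed genesis value
  const timestamp = new Date().toISOString();
  const hash = createHash("sha256")
    .update(JSON.stringify({ request, matchedRule, effect, timestamp, prevHash }))
    .digest("hex");
  const record: AuditRecord = { request, matchedRule, effect, timestamp, prevHash, hash };
  trail.push(record);   // synchronous: no action result is returned before this completes
  return record;
}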

Architecture Diagram

┌─────────────────────────────────────────────────────┐
│                   AI Agent                          │
│         (Claude / OpenAI / LangChain)               │
└────────────────────┬────────────────────────────────┘
                     │ Action Request (JSON)
                     ▼
┌─────────────────────────────────────────────────────┐
│              SafeClaw Policy Engine                  │
│                                                     │
│  ┌──────────┐  ┌──────────┐  ┌────────────────┐   │
│  │  Action   │→│   Rule   │→│    Effect       │   │
│  │  Parser   │  │ Matching │  │   Resolution   │   │
│  └──────────┘  └──────────┘  └────────────────┘   │
│                                     │               │
│                              ┌──────┴──────┐       │
│                              │ Deny-by-    │       │
│                              │ Default     │       │
│                              └─────────────┘       │
│                                     │               │
│                              ┌──────┴──────┐       │
│                              │ Audit Trail │       │
│                              │ (SHA-256)   │       │
│                              └─────────────┘       │
└─────────────────────────────────────────────────────┘

Try SafeClaw

Action-level gating for AI agents. Set it up in your browser in 60 seconds.

$ npx @authensor/safeclaw