2025-12-10 · Authensor

How Do I Sandbox an AI Agent? Complete Isolation Guide

Sandboxing an AI agent means restricting its capabilities to only the actions required for its task — preventing it from accessing files, running commands, or making network requests outside a defined boundary. SafeClaw by Authensor provides policy-based sandboxing that works at the action level: every file read, file write, shell command, and network request is evaluated against your rules before execution, creating an effective sandbox without requiring Docker, VMs, or complex infrastructure.

Sandboxing Approaches Compared

| Approach | Isolation Level | Setup Complexity | Granularity | Performance |
|----------|----------------|-----------------|-------------|-------------|
| SafeClaw policies | Action-level | Low (one YAML file) | Per-action, per-path | Sub-millisecond |
| Docker container | Process-level | Medium | Volume mounts, network rules | Container startup overhead |
| Virtual machine | System-level | High | Full OS isolation | Significant resource overhead |
| OS user permissions | User-level | Medium | Coarse (per-directory) | Native |

Of these approaches, SafeClaw policies offer the finest granularity, and each policy check adds only sub-millisecond overhead. For most AI agent workflows, policy-based sandboxing is sufficient. For high-risk or untrusted agents, combine SafeClaw with container isolation for defense in depth.

How to Sandbox with SafeClaw

Quick Start

npx @authensor/safeclaw

Sandbox Policy: Strict Isolation

# safeclaw.config.yaml
rules:
  # FILESYSTEM SANDBOX
  # Agent can only read within the project
  - action: file.read
    path: "/home/dev/project/**"
    decision: allow

  # Agent can only write to src/ directory
  - action: file.write
    path: "/home/dev/project/src/**"
    decision: allow

  # No file deletion anywhere
  - action: file.delete
    path: "**"
    decision: deny
    reason: "Sandboxed agent cannot delete files"

  # Block reads outside the project
  - action: file.read
    path: "**"
    decision: deny
    reason: "Sandboxed agent cannot read outside project"

  # Block writes outside src/
  - action: file.write
    path: "**"
    decision: deny
    reason: "Sandboxed agent cannot write outside src/"

  # SHELL SANDBOX
  # Allow only test commands
  - action: shell.execute
    command_pattern: "npm test*"
    decision: allow

  - action: shell.execute
    command_pattern: "npm run lint*"
    decision: allow

  # Block all other shell commands
  - action: shell.execute
    command_pattern: "**"
    decision: deny
    reason: "Sandboxed agent cannot run arbitrary commands"

  # NETWORK SANDBOX
  # Block all outbound network
  - action: network.request
    host: "**"
    decision: deny
    reason: "Sandboxed agent has no network access"

This policy creates a tight sandbox:

The agent sees only the project directory

It can write only to src/

It can run only npm test and npm run lint

It has zero network access

It cannot delete anything
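How an ordered rule list like this resolves can be sketched in a few lines of Python. This is an illustration only, not SafeClaw's implementation: it assumes rules are checked top to bottom with the first match winning, and that anything unmatched is denied. The `Rule` class and `evaluate` function are hypothetical names. Glob matching uses the standard `fnmatch` module, in which `*` also matches `/`, so `**` behaves like a recursive wildcard here.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
import posixpath


@dataclass(frozen=True)
class Rule:
    action: str    # e.g. "file.read", "shell.execute"
    pattern: str   # glob applied to the path, command, or host
    decision: str  # "allow" or "deny"


# The strict-isolation policy, in the same order as the YAML.
RULES = [
    Rule("file.read", "/home/dev/project/**", "allow"),
    Rule("file.write", "/home/dev/project/src/**", "allow"),
    Rule("file.delete", "**", "deny"),
    Rule("file.read", "**", "deny"),
    Rule("file.write", "**", "deny"),
    Rule("shell.execute", "npm test*", "allow"),
    Rule("shell.execute", "npm run lint*", "allow"),
    Rule("shell.execute", "**", "deny"),
    Rule("network.request", "**", "deny"),
]


def evaluate(action: str, target: str, rules=RULES) -> str:
    """Return the first matching rule's decision, or "deny" if none match."""
    if action.startswith("file."):
        # Collapse ".." segments so traversal cannot slip past a prefix glob.
        # (This does not resolve symlinks; that would need a filesystem lookup.)
        target = posixpath.normpath(target)
    for rule in rules:
        if rule.action == action and fnmatchcase(target, rule.pattern):
            return rule.decision
    return "deny"


print(evaluate("file.read", "/home/dev/project/README.md"))
print(evaluate("file.read", "/home/dev/project/../../etc/passwd"))
print(evaluate("shell.execute", "rm -rf /"))
```

Under the first-match assumption, rule order matters: the broad `"**"` deny rules only work as a fallback because they come after the narrower allow rules for the same action.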

Building a Sandbox Step by Step

Step 1: Start with Total Deny

rules:
  - action: "**"
    decision: deny
    reason: "All actions denied — sandbox baseline"

Step 2: Run in Simulation Mode

mode: simulation
rules:
  - action: "**"
    decision: deny

Run your agent workflow normally. SafeClaw logs every action the agent attempts without blocking anything. Review the audit log to see exactly what the agent needs.
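Reviewing the log is easier once the attempts are grouped by action type. The sketch below assumes the audit log can be exported as JSON Lines with `action` and `target` fields; that format, the field names, and the `summarize` helper are assumptions for illustration, so adapt them to whatever your log actually contains.

```python
import json
from collections import defaultdict


def summarize(log_lines):
    """Group the distinct attempted targets by action type."""
    seen = defaultdict(set)
    for line in log_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        record = json.loads(line)
        seen[record["action"]].add(record["target"])
    return {action: sorted(targets) for action, targets in seen.items()}


# Hypothetical sample records; real entries will look different.
SAMPLE_LOG = [
    '{"action": "file.read", "target": "src/index.ts"}',
    '{"action": "file.read", "target": "src/utils/path.ts"}',
    '{"action": "file.write", "target": "src/utils/path.ts"}',
    '{"action": "shell.execute", "target": "npm test"}',
    '{"action": "file.read", "target": "src/index.ts"}',
]

for action, targets in summarize(SAMPLE_LOG).items():
    print(f"{action}: {len(targets)} distinct target(s)")
    for target in targets:
        print(f"  {target}")
```

Each group then maps naturally onto one allow rule, with the pattern kept as narrow as the observed targets permit.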

Step 3: Allow Only Observed Necessities

From the simulation log, you might see:

file.read on 15 source files

file.write on 3 source files

shell.execute for npm test

Write allow rules only for these patterns:

rules:
  - action: file.read
    path: "src/**/*.ts"
    decision: allow
  - action: file.write
    path: "src/utils/**/*.ts"
    decision: allow
  - action: shell.execute
    command_pattern: "npm test*"
    decision: allow

  - action: "**"
    decision: deny

Step 4: Switch to Enforcement

mode: enforcement

The sandbox is now active. The agent is restricted to exactly the permissions it demonstrated it needed.

Combining SafeClaw with Docker

For maximum isolation, run SafeClaw inside a Docker container:

FROM node:20-slim
WORKDIR /app
COPY . .
RUN npx @authensor/safeclaw

# docker-compose.yml
services:
  agent:
    build: .
    volumes:
      - ./src:/app/src:rw
      - ./tests:/app/tests:ro
    network_mode: none  # No network at container level

This gives you two isolation layers:

Docker restricts the container's view of the host filesystem and network

SafeClaw restricts what the agent can do within the container

Why SafeClaw

446 tests validate sandbox boundary enforcement including path traversal attempts, symlink escapes, and command injection that might break out of a policy-based sandbox
Deny-by-default is the foundation of sandboxing — everything is blocked unless you permit it
Sub-millisecond evaluation makes the sandbox invisible to the agent workflow
Hash-chained audit trail proves the sandbox was enforced, useful for compliance and security audits
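To make the hash-chaining idea concrete, here is a minimal sketch of the general technique, not SafeClaw's actual log format: each entry stores the hash of the previous entry, so editing any earlier record invalidates every hash after it. The record fields, the all-zero genesis value, and the function names are assumptions chosen for illustration.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def entry_hash(prev_hash: str, record: dict) -> str:
    """SHA-256 over the previous hash plus a canonical JSON form of the record."""
    payload = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append(chain: list, record: dict) -> None:
    """Add a record, linking it to the hash of the last entry."""
    prev = chain[-1]["hash"] if chain else GENESIS
    chain.append({"record": record, "prev": prev, "hash": entry_hash(prev, record)})


def verify(chain: list) -> bool:
    """Recompute every link; any edited or reordered entry breaks the chain."""
    prev = GENESIS
    for entry in chain:
        if entry["prev"] != prev or entry["hash"] != entry_hash(prev, entry["record"]):
            return False
        prev = entry["hash"]
    return True


chain = []
append(chain, {"action": "file.read", "target": "src/index.ts", "decision": "allow"})
append(chain, {"action": "network.request", "target": "example.com", "decision": "deny"})
print(verify(chain))  # intact chain

chain[0]["record"]["decision"] = "deny"  # tamper with an earlier entry
print(verify(chain))  # tampering is detected
```

A chain like this shows that entries were not altered after the fact; proving that nothing was dropped from the end additionally requires keeping the latest hash somewhere the agent cannot write.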
