2025-12-10 · Authensor

How Do I Sandbox an AI Agent? Complete Isolation Guide

Sandboxing an AI agent means restricting its capabilities to only the actions required for its task — preventing it from accessing files, running commands, or making network requests outside a defined boundary. SafeClaw by Authensor provides policy-based sandboxing that works at the action level: every file read, file write, shell command, and network request is evaluated against your rules before execution, creating an effective sandbox without requiring Docker, VMs, or complex infrastructure.

Sandboxing Approaches Compared

| Approach | Isolation Level | Setup Complexity | Granularity | Performance |
|----------|----------------|-----------------|-------------|-------------|
| SafeClaw policies | Action-level | Low (one YAML file) | Per-action, per-path | Sub-millisecond |
| Docker container | Process-level | Medium | Volume mounts, network rules | Container startup overhead |
| Virtual machine | System-level | High | Full OS isolation | Significant resource overhead |
| OS user permissions | User-level | Medium | Coarse (per-directory) | Native |

Of the four approaches, SafeClaw policies offer the finest granularity (per action and per path) with the least setup and sub-millisecond evaluation. For most AI agent workflows, policy-based sandboxing is sufficient. For high-risk or untrusted agents, combine SafeClaw with container isolation for defense in depth.

How to Sandbox with SafeClaw

Quick Start

npx @authensor/safeclaw

Sandbox Policy: Strict Isolation

# safeclaw.config.yaml
rules:
  # FILESYSTEM SANDBOX
  # Agent can only read within the project
  - action: file.read
    path: "/home/dev/project/**"
    decision: allow

  # Agent can only write to src/ directory
  - action: file.write
    path: "/home/dev/project/src/**"
    decision: allow

  # No file deletion anywhere
  - action: file.delete
    path: "**"
    decision: deny
    reason: "Sandboxed agent cannot delete files"

  # Block reads outside the project
  - action: file.read
    path: "**"
    decision: deny
    reason: "Sandboxed agent cannot read outside project"

  # Block writes outside src/
  - action: file.write
    path: "**"
    decision: deny
    reason: "Sandboxed agent cannot write outside src/"

  # SHELL SANDBOX
  # Allow only test commands
  - action: shell.execute
    command_pattern: "npm test*"
    decision: allow

  - action: shell.execute
    command_pattern: "npm run lint*"
    decision: allow

  # Block all other shell commands
  - action: shell.execute
    command_pattern: "**"
    decision: deny
    reason: "Sandboxed agent cannot run arbitrary commands"

  # NETWORK SANDBOX
  # Block all outbound network
  - action: network.request
    host: "**"
    decision: deny
    reason: "Sandboxed agent has no network access"

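This policy creates a tight sandbox: the agent can read anywhere inside /home/dev/project, write only under src/, delete nothing, run only commands matching npm test and npm run lint, and make no outbound network requests.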
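
The strict policy lists its allow rules before the catch-all deny rules, which suggests that rules are checked top to bottom and the first match decides. The TypeScript sketch below illustrates that idea only. The `Rule` shape, the `globToRegExp` helper, and the first-match, default-deny semantics are assumptions made for illustration and are not taken from SafeClaw's implementation.

```typescript
// Hypothetical model of a policy rule. Field names mirror the YAML above;
// "pattern" stands in for path, command_pattern, or host.
type Decision = "allow" | "deny";

interface Rule {
  action: string;
  pattern: string;
  decision: Decision;
}

// Convert a glob to a RegExp. "**" matches any characters. A single "*"
// stops at "/" for path-like targets and matches anything otherwise.
function globToRegExp(glob: string, pathLike: boolean): RegExp {
  let source = "";
  for (let i = 0; i < glob.length; i++) {
    const ch = glob.charAt(i);
    if (ch === "*") {
      if (glob.charAt(i + 1) === "*") {
        source += ".*";
        i++; // consume the second "*"
      } else {
        source += pathLike ? "[^/]*" : ".*";
      }
    } else {
      // Escape characters that are special in regular expressions.
      source += ch.replace(/[.+?^${}()|[\]\\]/g, "\\$&");
    }
  }
  return new RegExp(`^${source}$`);
}

// Assumed semantics: the first rule whose action and pattern both match
// decides the outcome; if nothing matches, the action is denied.
function evaluate(rules: Rule[], action: string, target: string): Decision {
  const pathLike = action.startsWith("file.");
  for (const rule of rules) {
    if (rule.action !== action) continue;
    if (globToRegExp(rule.pattern, pathLike).test(target)) return rule.decision;
  }
  return "deny";
}

// The strict isolation policy from above, transcribed by hand.
const strictPolicy: Rule[] = [
  { action: "file.read", pattern: "/home/dev/project/**", decision: "allow" },
  { action: "file.write", pattern: "/home/dev/project/src/**", decision: "allow" },
  { action: "file.delete", pattern: "**", decision: "deny" },
  { action: "file.read", pattern: "**", decision: "deny" },
  { action: "file.write", pattern: "**", decision: "deny" },
  { action: "shell.execute", pattern: "npm test*", decision: "allow" },
  { action: "shell.execute", pattern: "npm run lint*", decision: "allow" },
  { action: "shell.execute", pattern: "**", decision: "deny" },
  { action: "network.request", pattern: "**", decision: "deny" },
];

console.log(evaluate(strictPolicy, "file.read", "/home/dev/project/README.md"));
console.log(evaluate(strictPolicy, "file.write", "/home/dev/project/package.json"));
console.log(evaluate(strictPolicy, "shell.execute", "rm -rf /"));
```

Under this simple matcher, `npm test*` also matches a chained command such as `npm test && curl http://example.com`. The source does not say how SafeClaw treats such input, so prefix globs on shell commands are worth checking in simulation mode before relying on them.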


Building a Sandbox Step by Step

Step 1: Start with Total Deny

rules:
  - action: "**"
    decision: deny
    reason: "All actions denied — sandbox baseline"

Step 2: Run in Simulation Mode

mode: simulation
rules:
  - action: "**"
    decision: deny

Run your agent workflow normally. SafeClaw logs every action the agent attempts without blocking anything. Review the audit log to see exactly what the agent needs.
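
Reviewing a long audit log is easier once repeated attempts are collapsed into distinct action and target pairs. The sketch below shows one way to do that in TypeScript. SafeClaw's audit log format is not shown in this guide, so the `AttemptedAction` record shape and the sample entries are hypothetical; adapt the parsing to whatever the real log contains.

```typescript
// Hypothetical shape of one simulation-mode log entry.
interface AttemptedAction {
  action: string; // e.g. "file.read"
  target: string; // path, command, or host the agent tried to use
}

interface AttemptSummary extends AttemptedAction {
  count: number;
}

// Collapse repeated attempts into one row per (action, target) pair,
// sorted by action and then target so the output is stable to review.
function summarizeAttempts(entries: AttemptedAction[]): AttemptSummary[] {
  const counts = new Map<string, AttemptSummary>();
  for (const { action, target } of entries) {
    const key = `${action}\u0000${target}`;
    const existing = counts.get(key);
    if (existing) {
      existing.count += 1;
    } else {
      counts.set(key, { action, target, count: 1 });
    }
  }
  return Array.from(counts.values()).sort(
    (a, b) => a.action.localeCompare(b.action) || a.target.localeCompare(b.target),
  );
}

// Illustrative input only, not real SafeClaw output.
const sample: AttemptedAction[] = [
  { action: "file.read", target: "src/index.ts" },
  { action: "shell.execute", target: "npm test" },
  { action: "file.read", target: "src/index.ts" },
  { action: "file.write", target: "src/utils/date.ts" },
];

const summary = summarizeAttempts(sample);
for (const row of summary) {
  console.log(`${row.action}\t${row.target}\t${row.count}`);
}
```

Each row in the summary is a candidate for an allow rule; anything that looks unexpected is a reason to investigate the agent before granting access.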

Step 3: Allow Only Observed Necessities

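From the simulation log, identify the actions the agent actually attempted, for example reads of TypeScript files under src/, writes under src/utils/, and npm test runs. Write allow rules only for these patterns: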

rules:
  - action: file.read
    path: "src/**/*.ts"
    decision: allow

  - action: file.write
    path: "src/utils/**/*.ts"
    decision: allow

  - action: shell.execute
    command_pattern: "npm test*"
    decision: allow

  - action: "**"
    decision: deny

Step 4: Switch to Enforcement

mode: enforcement

The sandbox is now active. The agent is restricted to exactly the permissions it demonstrated it needed.

Combining SafeClaw with Docker

For maximum isolation, run SafeClaw inside a Docker container:

# Dockerfile
FROM node:20-slim
WORKDIR /app
COPY . .
RUN npx @authensor/safeclaw

# docker-compose.yml
services:
  agent:
    build: .
    volumes:
      - ./src:/app/src:rw
      - ./tests:/app/tests:ro
    network_mode: none  # No network at container level

This gives you two isolation layers:

  1. Docker restricts the container's view of the host filesystem and network.
  2. SafeClaw restricts what the agent can do within the container.
