How Do I Sandbox an AI Agent? Complete Isolation Guide
Sandboxing an AI agent means restricting its capabilities to only the actions required for its task, so that it cannot access files, run commands, or make network requests outside a defined boundary. SafeClaw by Authensor provides policy-based sandboxing at the action level: every file read, file write, shell command, and network request is evaluated against your rules before execution. This creates an effective sandbox without requiring Docker, VMs, or complex infrastructure.
Sandboxing Approaches Compared
| Approach | Isolation Level | Setup Complexity | Granularity | Performance |
|----------|----------------|-----------------|-------------|-------------|
| SafeClaw policies | Action-level | Low (one YAML file) | Per-action, per-path | Sub-millisecond |
| Docker container | Process-level | Medium | Volume mounts, network rules | Container startup overhead |
| Virtual machine | System-level | High | Full OS isolation | Significant resource overhead |
| OS user permissions | User-level | Medium | Coarse (per-directory) | Native |
Of the approaches compared here, SafeClaw policies offer the finest granularity with the least setup, at sub-millisecond evaluation cost. For most AI agent workflows, policy-based sandboxing is sufficient. For high-risk or untrusted agents, combine SafeClaw with container isolation for defense in depth.
How to Sandbox with SafeClaw
Quick Start
npx @authensor/safeclaw
Sandbox Policy: Strict Isolation
# safeclaw.config.yaml
rules:
  # FILESYSTEM SANDBOX
  # Agent can only read within the project
  - action: file.read
    path: "/home/dev/project/**"
    decision: allow
  # Agent can only write to src/ directory
  - action: file.write
    path: "/home/dev/project/src/**"
    decision: allow
  # No file deletion anywhere
  - action: file.delete
    path: "**"
    decision: deny
    reason: "Sandboxed agent cannot delete files"
  # Block reads outside the project
  - action: file.read
    path: "**"
    decision: deny
    reason: "Sandboxed agent cannot read outside project"
  # Block writes outside src/
  - action: file.write
    path: "**"
    decision: deny
    reason: "Sandboxed agent cannot write outside src/"
  # SHELL SANDBOX
  # Allow only test commands
  - action: shell.execute
    command_pattern: "npm test*"
    decision: allow
  - action: shell.execute
    command_pattern: "npm run lint*"
    decision: allow
  # Block all other shell commands
  - action: shell.execute
    command_pattern: "**"
    decision: deny
    reason: "Sandboxed agent cannot run arbitrary commands"
  # NETWORK SANDBOX
  # Block all outbound network
  - action: network.request
    host: "**"
    decision: deny
    reason: "Sandboxed agent has no network access"
This policy creates a tight sandbox:
- The agent can read only within the project directory
- It can write only to src/
- It can run only npm test and npm run lint
- It has zero network access
- It cannot delete anything
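To make the effect of rule ordering concrete, here is a minimal Python sketch of how a policy like this could be evaluated. It assumes rules are checked top to bottom and the first match wins, which the allow-then-deny ordering suggests but the policy file does not state. The `Rule` class and `evaluate` function are hypothetical and are not SafeClaw's API, and `fnmatchcase` stands in for whatever glob matcher SafeClaw actually uses.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
import posixpath


@dataclass(frozen=True)
class Rule:
    action: str    # e.g. "file.read"
    pattern: str   # glob applied to the path, command, or host
    decision: str  # "allow" or "deny"


# Hypothetical in-memory copy of the strict isolation policy.
RULES = [
    Rule("file.read", "/home/dev/project/**", "allow"),
    Rule("file.write", "/home/dev/project/src/**", "allow"),
    Rule("file.delete", "**", "deny"),
    Rule("file.read", "**", "deny"),
    Rule("file.write", "**", "deny"),
    Rule("shell.execute", "npm test*", "allow"),
    Rule("shell.execute", "npm run lint*", "allow"),
    Rule("shell.execute", "**", "deny"),
    Rule("network.request", "**", "deny"),
]


def evaluate(action: str, target: str, rules=RULES) -> str:
    """Return the decision of the first matching rule; deny if none match."""
    if action.startswith("file."):
        # Collapse "../" segments so traversal cannot slip past a prefix glob.
        # Symlinks are not resolved in this sketch.
        target = posixpath.normpath(target)
    for rule in rules:
        if rule.action == action and fnmatchcase(target, rule.pattern):
            return rule.decision
    return "deny"


print(evaluate("file.read", "/home/dev/project/src/index.ts"))
print(evaluate("file.write", "/home/dev/project/README.md"))
print(evaluate("file.read", "/home/dev/project/../../etc/passwd"))
```

Because the specific allow rules come before the catch-all deny rules, swapping their order in this sketch would deny everything. Note also that plain glob matching on a command string would accept something like `npm test; curl ...`, so a real matcher needs stricter command parsing than this illustration.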
Building a Sandbox Step by Step
Step 1: Start with Total Deny
rules:
  - action: "**"
    decision: deny
    reason: "All actions denied: sandbox baseline"
Step 2: Run in Simulation Mode
mode: simulation
rules:
  - action: "**"
    decision: deny
Run your agent workflow normally. SafeClaw logs every action the agent attempts without blocking anything. Review the audit log to see exactly what the agent needs.
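The difference between simulation and enforcement can be sketched in a few lines of Python. This is an illustration only: the `gate` function, the rule tuples, and the log entry fields are hypothetical, and it assumes simulation mode records the decision a rule would have produced while letting the action through, as described above.

```python
from fnmatch import fnmatchcase

# Hypothetical rule tuples: (action glob, decision). Mirrors the deny-all baseline.
RULES = [("**", "deny")]


def gate(mode: str, action: str, target: str, audit_log: list) -> bool:
    """Return True if the action may proceed; always record what was attempted."""
    decision = "deny"  # default when no rule matches
    for action_glob, rule_decision in RULES:
        if fnmatchcase(action, action_glob):
            decision = rule_decision
            break
    audit_log.append({"action": action, "target": target, "would_be": decision})
    if mode == "simulation":
        return True  # log only, never block
    return decision == "allow"


log = []
gate("simulation", "file.read", "src/index.ts", log)   # proceeds, logged as a would-be deny
gate("enforcement", "file.read", "src/index.ts", log)  # blocked by the deny-all rule
```

The same rule set drives both modes, so the log collected in simulation reflects exactly what enforcement would have blocked.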
Step 3: Allow Only Observed Necessities
From the simulation log, you might see:
- file.read on 15 source files
- file.write on 3 source files
- shell.execute for npm test
Write allow rules only for these patterns:
rules:
  - action: file.read
    path: "src/**/*.ts"
    decision: allow
  - action: file.write
    path: "src/utils/**/*.ts"
    decision: allow
  - action: shell.execute
    command_pattern: "npm test*"
    decision: allow
  # Everything not observed in simulation stays denied
  - action: "**"
    decision: deny
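Turning a simulation log into allow rules is mostly a grouping exercise. The Python sketch below counts attempts per action and directory (or per command) to suggest how narrowly each rule can be scoped. The log entries and their field names are made up for illustration; the real audit log format may differ.

```python
from collections import Counter
import posixpath

# Hypothetical simulation log entries; the real audit log format may differ.
observed = [
    {"action": "file.read", "target": "src/index.ts"},
    {"action": "file.read", "target": "src/utils/date.ts"},
    {"action": "file.write", "target": "src/utils/date.ts"},
    {"action": "shell.execute", "target": "npm test"},
]


def summarize(entries):
    """Count attempts per (action, directory-or-command) to suggest rule scopes."""
    counts = Counter()
    for entry in entries:
        if entry["action"].startswith("file."):
            scope = posixpath.dirname(entry["target"]) or "."
        else:
            scope = entry["target"]
        counts[(entry["action"], scope)] += 1
    return counts


for (action, scope), n in sorted(summarize(observed).items()):
    print(f"{action:14} {scope:12} x{n}")
```

Each distinct (action, scope) pair is a candidate for one allow rule; anything that looks unexpected in the summary is worth investigating before it is allowed.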
Step 4: Switch to Enforcement
mode: enforcement
The sandbox is now active. The agent is restricted to exactly the permissions it demonstrated it needed.
Combining SafeClaw with Docker
For maximum isolation, run SafeClaw inside a Docker container:
# Dockerfile
FROM node:20-slim
WORKDIR /app
COPY . .
RUN npx @authensor/safeclaw
# docker-compose.yml
services:
  agent:
    build: .
    volumes:
      - ./src:/app/src:rw
      - ./tests:/app/tests:ro
    network_mode: none  # No network at container level
This gives you two isolation layers:
- Docker restricts the container's view of the host filesystem and network
- SafeClaw restricts what the agent can do within the container
Why SafeClaw
- 446 tests validate sandbox boundary enforcement, including path traversal attempts, symlink escapes, and command injection that might otherwise break out of a policy-based sandbox
- Deny-by-default is the foundation of sandboxing — everything is blocked unless you permit it
- Sub-millisecond evaluation makes the sandbox invisible to the agent workflow
- Hash-chained audit trail proves the sandbox was enforced, useful for compliance and security audits
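For readers unfamiliar with hash-chained logs, the general technique can be sketched in Python as follows. This shows the idea only and is not SafeClaw's implementation: the entry layout, the `append_entry` and `verify` helpers, and the choice of SHA-256 over JSON are all assumptions made for the example.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash for the first entry's predecessor


def append_entry(chain: list, event: dict) -> dict:
    """Append an event whose hash covers both its content and the previous hash."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    payload = json.dumps(event, sort_keys=True)
    digest = hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()
    entry = {"event": event, "prev": prev_hash, "hash": digest}
    chain.append(entry)
    return entry


def verify(chain: list) -> bool:
    """Recompute every hash; an edited or reordered entry breaks the chain."""
    prev_hash = GENESIS
    for entry in chain:
        payload = json.dumps(entry["event"], sort_keys=True)
        expected = hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()
        if entry["prev"] != prev_hash or entry["hash"] != expected:
            return False
        prev_hash = entry["hash"]
    return True


trail = []
append_entry(trail, {"action": "file.read", "target": "src/index.ts", "decision": "allow"})
append_entry(trail, {"action": "shell.execute", "target": "rm -rf /", "decision": "deny"})
print(verify(trail))
```

Because each hash depends on the one before it, changing an earlier entry invalidates every later one. Truncating the tail of the log is not detected by this sketch; catching that requires storing the latest hash somewhere the agent cannot write.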