2026-02-02 · Authensor

How to Prevent AI Agents from Running rm -rf

To prevent AI agents from running rm -rf, use SafeClaw action-level gating with a deny-by-default policy. SafeClaw intercepts shell_exec actions and blocks commands matching destructive patterns like rm -rf, rm -r, and del /s /q before the agent executes them. Install with npx @authensor/safeclaw.

The Risk

rm -rf deletes its target recursively, with no confirmation prompt and no recycle bin. (Modern GNU coreutils refuse a bare rm -rf / without --no-preserve-root, but rm -rf /* and every other path get no such protection.) When an AI agent has shell access, a single hallucinated command or prompt injection can trigger this. The agent doesn't need root — rm -rf ~ wipes the entire home directory, including source code, SSH keys, credentials, and local databases.

This isn't theoretical. Agents with code execution capabilities routinely generate shell commands. A misinterpreted instruction like "clean up the project" can become rm -rf ./ in the wrong working directory. If the agent is running as your user, it has your permissions. Every file you can delete, it can delete.

Docker containers limit blast radius but don't prevent deletion of mounted volumes. File permissions don't help when the agent runs as your user. Prompt-level instructions ("never run rm") are trivially overridden by prompt injection.

The One-Minute Fix

Step 1: Install SafeClaw in your project.

npx @authensor/safeclaw

Step 2: Run the setup wizard at safeclaw.onrender.com to get your free API key (7-day renewable, no credit card).

Step 3: Add this policy rule to block destructive shell commands:

- action: shell_exec
  pattern: "rm -rf|rm -r|rmdir /s|del /s|find . -delete"
  effect: deny
  reason: "Destructive file deletion commands are blocked"

That's it. Any shell_exec action matching these patterns is denied before the agent can execute it.
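The gating step can be sketched in a few lines of TypeScript. This is an illustrative sketch of pattern-based denial, not SafeClaw's internal implementation; the `isBlocked` function and the pattern list are hypothetical.

```typescript
// Sketch: deny a shell_exec command when it matches any destructive pattern.
// Patterns mirror the policy rule above; this is not SafeClaw internals.
const denyPatterns: RegExp[] = [
  /rm -rf|rm -r\b|rmdir \/s|del \/s|find \. -delete/,
];

function isBlocked(command: string): boolean {
  // A command is blocked if any destructive pattern matches it.
  return denyPatterns.some((p) => p.test(command));
}

console.log(isBlocked("rm -rf /home/user/project")); // true
console.log(isBlocked("ls -la"));                    // false
```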

Full Policy

name: block-destructive-shell
version: "1.0"
defaultEffect: deny
rules:
  # Block recursive deletion commands
  - action: shell_exec
    pattern: "rm\\s+(-[a-zA-Z]*r[a-zA-Z]*\\b|--recursive)"
    effect: deny
    reason: "Recursive file deletion blocked"

  # Block force deletion
  - action: shell_exec
    pattern: "rm\\s+(-[a-zA-Z]*f[a-zA-Z]*\\b|--force)"
    effect: deny
    reason: "Forced file deletion blocked"

  # Block directory-wiping alternatives
  - action: shell_exec
    pattern: "find\\s+.*-delete|find\\s+.*-exec\\s+rm"
    effect: deny
    reason: "Recursive find-and-delete blocked"

  # Block Windows equivalents
  - action: shell_exec
    pattern: "rmdir\\s+/s|del\\s+/s|rd\\s+/s"
    effect: deny
    reason: "Windows recursive deletion blocked"

  # Allow safe shell commands, anchored to the start of the command
  - action: shell_exec
    pattern: "^(ls|cat|echo|pwd|whoami|git status|git diff|npm test|node)\\b"
    effect: allow
    reason: "Read-only and safe commands permitted"
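Deny-by-default evaluation means rules are checked in order and anything that matches no rule falls through to the default effect. A minimal sketch of that logic, assuming a simplified rule shape (the `Rule` type and `evaluate` function here are illustrative, not SafeClaw's real schema):

```typescript
// Sketch: first matching rule wins; no match falls back to defaultEffect.
type Rule = {
  action: string;
  pattern: string;
  effect: "allow" | "deny";
  reason: string;
};

const policy = {
  defaultEffect: "deny" as const,
  rules: [
    {
      action: "shell_exec",
      pattern: "rm\\s+(-[a-zA-Z]*r[a-zA-Z]*\\b|--recursive)",
      effect: "deny",
      reason: "Recursive file deletion blocked",
    },
    {
      action: "shell_exec",
      pattern: "^(ls|cat|pwd|git status|npm test)\\b",
      effect: "allow",
      reason: "Safe command permitted",
    },
  ] as Rule[],
};

function evaluate(action: string, command: string): { effect: string; reason: string } {
  for (const rule of policy.rules) {
    if (rule.action === action && new RegExp(rule.pattern).test(command)) {
      return { effect: rule.effect, reason: rule.reason };
    }
  }
  // Nothing matched: deny by default.
  return { effect: policy.defaultEffect, reason: "No matching rule (deny by default)" };
}

console.log(evaluate("shell_exec", "rm -rf ~").effect);          // "deny"
console.log(evaluate("shell_exec", "npm test").effect);          // "allow"
console.log(evaluate("shell_exec", "curl example.com").effect);  // "deny"
```

The last line is the important one: a command that matches no rule at all is still denied, which is what makes the posture deny-by-default rather than blocklist-only.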

What Gets Blocked

These action requests are DENIED:

{
  "action": "shell_exec",
  "command": "rm -rf /home/user/project",
  "agent": "code-assistant",
  "result": "DENIED — Recursive file deletion blocked"
}
{
  "action": "shell_exec",
  "command": "rm -rf .",
  "agent": "cleanup-agent",
  "result": "DENIED — Recursive file deletion blocked"
}
{
  "action": "shell_exec",
  "command": "find /tmp -type f -exec rm {} \\;",
  "agent": "maintenance-bot",
  "result": "DENIED — Recursive find-and-delete blocked"
}

What Still Works

These safe actions are ALLOWED:

{
  "action": "shell_exec",
  "command": "ls -la /home/user/project",
  "agent": "code-assistant",
  "result": "ALLOWED — Read-only and safe commands permitted"
}
{
  "action": "shell_exec",
  "command": "npm test",
  "agent": "code-assistant",
  "result": "ALLOWED — Read-only and safe commands permitted"
}

Your agent can still read files, run tests, check git status, and execute non-destructive commands. It just can't delete your filesystem.

Why Other Approaches Don't Work

Docker containers limit damage to the container, but agents often need access to mounted project directories. A rm -rf inside a mounted volume deletes your real files. Docker also adds setup overhead and doesn't give you an audit trail.

File permissions don't help because the agent runs as your user. Everything you can delete, it can delete. You'd need to create a separate restricted user, manage permissions per project, and deal with file ownership issues — and the agent could still delete anything in its own directories.

Prompt instructions like "never run rm" are not security controls. They're suggestions. A prompt injection attack — or even a confidently hallucinated command — bypasses them entirely. Prompt-level safety is not a substitute for action-level enforcement.

SafeClaw evaluates every shell_exec action against your policy in sub-millisecond time, before execution. It's deny-by-default: if a command isn't explicitly allowed, it doesn't run. The policy is enforced in code, not in the prompt. Every denied action is logged in a tamper-proof audit trail with SHA-256 hash chaining. Built with 446 tests, TypeScript strict mode, and zero third-party dependencies.
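The hash-chaining idea behind a tamper-evident audit trail can be shown with Node's built-in crypto module. This is a generic sketch of the technique, assuming a simplified entry format; it is not SafeClaw's actual log structure.

```typescript
import { createHash } from "node:crypto";

// Sketch: each entry's hash covers the previous entry's hash, so altering
// any entry breaks every later link in the chain.
type Entry = { event: string; prevHash: string; hash: string };

const log: Entry[] = [];

function append(event: string): void {
  const prevHash = log.length ? log[log.length - 1].hash : "0".repeat(64);
  const hash = createHash("sha256").update(prevHash + event).digest("hex");
  log.push({ event, prevHash, hash });
}

function verify(): boolean {
  // Recompute every hash from the genesis value and compare.
  let prev = "0".repeat(64);
  for (const e of log) {
    const expected = createHash("sha256").update(prev + e.event).digest("hex");
    if (e.prevHash !== prev || e.hash !== expected) return false;
    prev = e.hash;
  }
  return true;
}

append('DENIED shell_exec "rm -rf ."');
append('ALLOWED shell_exec "npm test"');
console.log(verify()); // true
log[0].event = "tampered";
console.log(verify()); // false
```

Because each hash folds in its predecessor, an attacker who edits one entry would have to recompute every subsequent hash to hide it, which a stored or externally anchored head hash makes detectable.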

Try SafeClaw

Action-level gating for AI agents. Set it up in your browser in 60 seconds.

$ npx @authensor/safeclaw