2026-02-02 · Authensor

How to Prevent AI Agents from Deleting Your Files

To prevent AI agents from deleting your files, install SafeClaw (npx @authensor/safeclaw) and create a policy that denies all shell_exec actions containing destructive commands (rm, rmdir, unlink) and restricts file_write to specific directories. SafeClaw intercepts every action before execution, so a command like rm -rf / never reaches your operating system. With deny-by-default, file deletion is blocked unless you've written an explicit rule allowing it.
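
The shape of that policy is small. Here is a minimal sketch using only the fields that appear in the full example later in this guide (version, default, rules, action, command, path, decision, reason); treat it as a starting point, not a complete policy:

version: "1.0"
default: deny            # anything not explicitly allowed is blocked

rules:
  - action: shell_exec
    command: "rm *"
    decision: deny
    reason: "Block all rm commands"

  - action: file_write
    path: "./output/**"
    decision: allow
    reason: "Writes are confined to the output directory"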

Why This Matters

File deletion is the most common fear users have about AI agents — and it's well-founded. An agent tasked with "cleaning up the project" might interpret that as deleting files it considers unnecessary. A prompt injection could instruct the agent to remove configuration files, wipe directories, or overwrite critical data with empty content. Destructive file operations are irreversible on systems without snapshots, and even with backups, the downtime and data loss from an uncontrolled rm -rf can be severe.

Step-by-Step Instructions

Step 1: Install SafeClaw

npx @authensor/safeclaw

SafeClaw runs as a standalone gating layer. It has zero third-party dependencies and evaluates each policy decision in sub-millisecond time.

Step 2: Get Your API Key

Sign up at safeclaw.onrender.com. The free tier includes a 7-day renewable key with no credit card required.

Step 3: Identify Every Deletion Vector

File deletion isn't just rm. Agents can delete files through:

- rm, rmdir, and unlink shell commands
- recursive deletes from scripts, such as Python's shutil.rmtree or os.remove
- git clean, which removes untracked files
- moving files to /dev/null or other discard paths with mv
- overwriting existing files with empty content via file_write

Your policy must cover all of these vectors.
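
To see why blocking rm alone falls short, the shell sketch below deletes or destroys files through several of these vectors without ever invoking rm. It is for illustration only; run it in a throwaway directory, never against real data.

# Demonstration only: each step removes or destroys data without rm.
tmp=$(mktemp -d) && cd "$tmp"
printf 'data\n' | tee a.txt b.txt c.txt d.txt > /dev/null
mkdir emptydir

rmdir emptydir                              # directory removal without rm
unlink a.txt                                # single-file removal without rm
python3 -c "import os; os.remove('b.txt')"  # deletion from inside a script
: > c.txt                                   # truncation: the file survives, its content doesn't
git init -q . && git clean -fdq             # wipes the remaining untracked files
ls -A                                       # only .git is left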

Step 4: Write Your Policy

Create safeclaw.yaml with explicit deny rules for every deletion vector, plus a default deny that catches anything you missed.

Step 5: Simulate, Then Enforce

# Test your policy without blocking
SAFECLAW_MODE=simulation npx @authensor/safeclaw

Once you've validated the policy, switch to enforcement mode:

SAFECLAW_MODE=enforce npx @authensor/safeclaw

Example Policy

version: "1.0"
default: deny

rules:
# Block all destructive shell commands
- action: shell_exec
command: "rm *"
decision: deny
reason: "Block all rm commands"

- action: shell_exec
command: "rmdir *"
decision: deny
reason: "Block directory removal"

- action: shell_exec
command: "unlink *"
decision: deny
reason: "Block unlink"

- action: shell_exec
command: "shutil.rmtree"
decision: deny
reason: "Block Python recursive delete"

- action: shell_exec
command: "git clean*"
decision: deny
reason: "Block git clean (removes untracked files)"

- action: shell_exec
command: "mv * /dev/null"
decision: deny
reason: "Block move-to-null deletion trick"

# Allow file writes only to safe directories
- action: file_write
path: "./output/**"
decision: allow
reason: "Agent can write to output directory"

- action: file_write
path: "./tmp/**"
decision: allow
reason: "Agent can write to temp directory"

# Allow file reads for project data
- action: file_read
path: "./src/**"
decision: allow
reason: "Agent can read source code"

- action: file_read
path: "./data/**"
decision: allow
reason: "Agent can read data files"

# Allow specific safe shell commands
- action: shell_exec
command: "npm test*"
decision: allow
reason: "Run test suite"

- action: shell_exec
command: "ls *"
decision: allow
reason: "List directory contents"

- action: shell_exec
command: "cat *"
decision: allow
reason: "View file contents"

What Happens When It Works

DENY — Agent attempts to delete a directory:

{
  "action": "shell_exec",
  "command": "rm -rf ./old-backups",
  "decision": "DENY",
  "rule": "Block all rm commands",
  "timestamp": "2026-02-13T09:15:01Z",
  "hash": "c4d5e6f7..."
}

DENY — Agent tries to wipe the config directory via a Python one-liner:

{
  "action": "shell_exec",
  "command": "python3 -c \"import shutil; shutil.rmtree('./config')\"",
  "decision": "DENY",
  "rule": "Block Python recursive delete",
  "timestamp": "2026-02-13T09:15:03Z",
  "hash": "g8h9i0j1..."
}

ALLOW — Agent writes a report to the permitted output directory:

{
  "action": "file_write",
  "path": "./output/analysis-report.md",
  "decision": "ALLOW",
  "rule": "Agent can write to output directory",
  "timestamp": "2026-02-13T09:15:05Z",
  "hash": "k2l3m4n5..."
}
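
Every decision is a small JSON record, so reviewing what was blocked is straightforward. Assuming you collect the records into a file (the name safeclaw-decisions.jsonl below is hypothetical; use wherever your setup writes them), a jq one-liner surfaces the denials:

# List each denied action with the rule that blocked it
# (safeclaw-decisions.jsonl is a hypothetical capture file)
jq -r 'select(.decision == "DENY") | "\(.timestamp)  \(.command // .path)  (\(.rule))"' safeclaw-decisions.jsonl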

Common Mistakes

  1. Only blocking rm and missing other deletion vectors. Agents are creative. They can delete files by moving them to discard paths with mv, overwriting them with empty content, running git clean, or calling os.remove() from a Python script. Your policy must address all of these, which is why deny-by-default is essential: anything you don't explicitly allow is blocked. For critical paths, you can also layer explicit deny rules on top, as sketched after this list.
  2. Using filesystem permissions instead of action gating. Unix file permissions protect against unauthorized users, not against processes running as your user. Your AI agent runs as you — it has every permission you have. SafeClaw operates at a layer above the filesystem, gating actions before they execute regardless of OS-level permissions.
  3. Assuming the agent will follow instructions not to delete files. Prompt-level instructions like "never delete files" provide zero enforcement. They are suggestions to a language model, not security controls. Prompt injection, hallucination, or ambiguous task interpretation can all cause an agent to ignore prompt instructions. Enforcement must happen at the action layer.
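
One way to address the overwrite vector from mistake 1 is to add explicit deny rules for critical paths alongside the default deny. The rules below are a sketch, assuming file_write deny rules take the same action/path/decision/reason fields as the allow rules in the example policy; adjust the paths to your project.

  # Defense in depth: critical paths stay write-protected even if an
  # allow rule elsewhere is later written too broadly
  - action: file_write
    path: "./config/**"
    decision: deny
    reason: "Never overwrite configuration files"

  - action: file_write
    path: "./.git/**"
    decision: deny
    reason: "Never touch version-control internals"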

Try SafeClaw

Action-level gating for AI agents. Set it up in your browser in 60 seconds.

$ npx @authensor/safeclaw