How to Prevent AI Agents from Deleting Files
SafeClaw by Authensor blocks all file deletion commands by default, including rm, rm -rf, unlink, shred, and programmatic file-removal APIs, preventing AI agents from destroying data on your system. Install SafeClaw with npx @authensor/safeclaw, and every deletion attempt is stopped before execution and recorded in a tamper-proof audit log.
Why File Deletion Is Dangerous When AI Agents Do It
File deletion is irreversible on most filesystems. An AI agent that runs rm -rf / or even rm -rf ./src can destroy your entire codebase, configuration files, or production data in milliseconds. Agents are particularly prone to path errors — an LLM might construct a relative path that resolves to an unintended directory, or concatenate variables that produce a glob matching far more files than intended. Recursive deletion (-rf) amplifies the damage because the agent deletes an entire directory tree without per-file confirmation. Even non-recursive deletion can remove critical single files like .env, docker-compose.yaml, or database files.
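A concrete illustration of the variable-concatenation failure mode described above (the cleanup step and the BUILD_DIR variable are hypothetical, not taken from SafeClaw or any real agent run):
# Agent-generated cleanup step; BUILD_DIR was never assigned in this session.
BUILD_DIR=""
rm -rf "$BUILD_DIR/"*   # expands to rm -rf /* and deletes from the filesystem root
A deny rule like deny-rm-rf below stops this class of mistake regardless of how the path was constructed.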
The Exact SafeClaw Policy to Block File Deletion
Add these rules to .safeclaw/policy.yaml:
rules:
  - id: deny-rm-rf
    action: shell.exec
    match:
      command: "rm -rf*"
    effect: deny
    audit: true
    message: "Recursive forced deletion is permanently denied."
  - id: deny-rm
    action: shell.exec
    match:
      command: "rm *"
    effect: deny
    audit: true
    message: "File deletion via rm is blocked for AI agents."
  - id: deny-unlink
    action: shell.exec
    match:
      command: "unlink *"
    effect: deny
    audit: true
    message: "File deletion via unlink is blocked."
  - id: deny-shred
    action: shell.exec
    match:
      command: "shred *"
    effect: deny
    audit: true
    message: "Secure file deletion via shred is blocked."
  - id: deny-file-delete-api
    action: file.delete
    match:
      path: "*"
    effect: deny
    audit: true
    message: "Programmatic file deletion is blocked."
The first four rules cover shell-based deletion commands. The fifth rule uses SafeClaw's file.delete action type to catch programmatic deletion through file system APIs — this is critical because agents using tool-calling frameworks may delete files through SDK methods rather than shell commands.
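To see the two layers separately, you can exercise both action types with SafeClaw's simulate command, as in the Verification section below (the file path is a placeholder):
npx @authensor/safeclaw simulate --action 'shell.exec' --command 'rm ./notes.txt'
Expected: deny, rule: deny-rm
npx @authensor/safeclaw simulate --action 'file.delete' --path './notes.txt'
Expected: deny, rule: deny-file-delete-api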
What Happens When the Agent Tries
When an agent attempts rm -rf ./build:
- SafeClaw intercepts the shell.exec action.
- The deny-rm-rf rule matches rm -rf*.
- The command is blocked. No files are touched.
- Hash-chained audit entry:
{
  "timestamp": "2026-02-13T08:33:41Z",
  "action": "shell.exec",
  "command": "rm -rf ./build",
  "effect": "deny",
  "rule": "deny-rm-rf",
  "agent": "build-agent-04",
  "hash": "a7c9d1...chain"
}
The same interception occurs when an agent uses a framework's file deletion tool — SafeClaw catches the file.delete action type regardless of the method.
How to Allow File Deletion with Approval
For build cleanup workflows where agents need to remove temporary files:
rules:
  - id: deny-rm-rf
    action: shell.exec
    match:
      command: "rm -rf*"
    effect: deny
    audit: true
    message: "Recursive forced deletion is permanently denied."
  - id: allow-rm-tmp
    action: shell.exec
    match:
      command: "rm /tmp/*"
    effect: allow
    audit: true
  - id: allow-rm-build-artifacts
    action: shell.exec
    match:
      command: "rm /build/*"
    effect: approval
    audit: true
    approvers:
      - role: developer
    timeout: 120
    message: "Build artifact deletion requires developer approval."
  - id: deny-rm-all
    action: shell.exec
    match:
      command: "rm *"
    effect: deny
    audit: true
    message: "File deletion is blocked outside /tmp."
This allows deletion in /tmp without approval, routes build-directory deletions through developer approval, hard-denies recursive forced deletion everywhere, and blocks every other rm command. The narrower allow and approval rules sit above the catch-all deny-rm-all rule so they can match first. SafeClaw's 446 tests validate this kind of layered policy ordering.
Verification
npx @authensor/safeclaw simulate --action 'shell.exec' --command 'rm -rf ./src'
Expected: deny, rule: deny-rm-rf
npx @authensor/safeclaw simulate --action 'file.delete' --path '/app/config.yaml'
Expected: deny, rule: deny-file-delete-api
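For the layered policy in the previous section, the same simulate command can confirm the allow and approval paths; the file names below are placeholders, and the expected output assumes simulate reports non-deny effects in the same format:
npx @authensor/safeclaw simulate --action 'shell.exec' --command 'rm /tmp/agent-scratch.log'
Expected: allow, rule: allow-rm-tmp
npx @authensor/safeclaw simulate --action 'shell.exec' --command 'rm /build/app.bundle.js'
Expected: approval, rule: allow-rm-build-artifacts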
Related Pages
- Prevent rm -rf from AI Agents
- How to Prevent AI Agents from Overwriting Files
- Scenario: Agent Deleted Production Files
- Filesystem Isolation for AI Agents
Try SafeClaw
Action-level gating for AI agents. Set it up in your browser in 60 seconds.
$ npx @authensor/safeclaw