2026-01-28 · Authensor

How to Control What Your AI Agent Can Do

To control what your AI agent can do, install SafeClaw (npx @authensor/safeclaw) and define a policy that specifies exactly which actions are allowed, denied, or require human approval. SafeClaw evaluates four action types — file_read, file_write, shell_exec, and network — against your policy rules using first-match-wins logic. Every action the agent proposes is checked before it executes, giving you precise control over your agent's capabilities.

Why This Matters

AI agents don't have a built-in concept of permissions. OS-level access controls are scoped to user accounts, not to individual agents, so an agent running under your account inherits all of your permissions. It can read every file you can read, run every command you can run, and reach every network endpoint you can reach. The Clawdbot incident, in which 1.5 million API keys were leaked, happened because the agent had no permission boundaries at all. Controlling agent permissions requires an external enforcement layer that evaluates actions before they execute.

Step-by-Step Instructions

Step 1: Install SafeClaw

npx @authensor/safeclaw

SafeClaw is a standalone action-level gating layer. It has zero third-party dependencies, runs in TypeScript strict mode, and is backed by 446 tests. The open-source client is MIT-licensed.

Step 2: Get Your API Key

Visit safeclaw.onrender.com to get a key. The free tier provides a 7-day renewable key and does not require a credit card. The dashboard includes a policy wizard that generates rules based on your agent's framework.

Step 3: Understand the Four Action Types

SafeClaw gates four categories of agent actions:

| Action Type | What It Covers | Example |
|---|---|---|
| file_read | Any file the agent tries to read | Reading ~/.ssh/id_rsa |
| file_write | Any file the agent tries to create or modify | Writing to ./output/report.md |
| shell_exec | Any shell command the agent tries to run | Running npm install |
| network | Any outbound network request | Calling api.openai.com |

Step 4: Design Your Permission Model

Start with deny-by-default. Then add explicit allow rules for each action your agent legitimately needs. Use require_approval for actions that are sometimes legitimate but should have human oversight.

Principle of least privilege: Give the agent the minimum permissions needed to do its job. If the agent only needs to read files in ./data/ and write to ./output/, don't allow filesystem-wide access.

Step 5: Write Your Policy

Create safeclaw.yaml in your project root. Rules are evaluated using first-match-wins: the first rule that matches the action determines the decision.
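To make first-match-wins concrete, here is a minimal TypeScript sketch of that evaluation order. It is not SafeClaw's implementation: the Rule and Action shapes, the evaluate function, and the simplified wildcard matching (every run of * matches any characters, including /) are assumptions made for illustration.

```typescript
// Hypothetical model of a policy; not SafeClaw's internal types.
type Decision = "allow" | "deny" | "require_approval";
type ActionType = "file_read" | "file_write" | "shell_exec" | "network";

interface Rule {
  action: ActionType;
  pattern: string; // stands in for a path, command, or domain pattern
  decision: Decision;
  reason: string;
}

interface Action {
  type: ActionType;
  target: string; // the path, command, or domain being requested
}

// Simplified wildcard match (assumption): every run of "*" matches any
// characters. Real glob semantics, such as "*" vs "**", are more nuanced.
function wildcardMatch(pattern: string, target: string): boolean {
  const escaped = pattern
    .split(/\*+/)
    .map((part) => part.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"))
    .join(".*");
  return new RegExp(`^${escaped}$`).test(target);
}

// First-match-wins: scan rules in order, return the first hit, and fall
// back to the default decision when no rule matches.
function evaluate(
  rules: Rule[],
  action: Action,
  fallback: Decision = "deny",
): { decision: Decision; reason: string } {
  for (const rule of rules) {
    if (rule.action === action.type && wildcardMatch(rule.pattern, action.target)) {
      return { decision: rule.decision, reason: rule.reason };
    }
  }
  return { decision: fallback, reason: "No rule matched (default)" };
}

const rules: Rule[] = [
  { action: "file_read", pattern: "./src/**", decision: "allow", reason: "Read source code" },
  { action: "file_read", pattern: "./src/secrets/**", decision: "deny", reason: "Shadowed by the rule above" },
  { action: "shell_exec", pattern: "npm publish*", decision: "deny", reason: "No publishing" },
];

const shadowed = evaluate(rules, { type: "file_read", target: "./src/secrets/key.pem" });
const unmatched = evaluate(rules, { type: "network", target: "example.com" });
```

In this sketch, shadowed.decision is "allow" because the broader ./src/** rule matches first and the later deny rule is never reached. unmatched.decision is "deny" because nothing matches and the default applies. The practical consequence is that specific deny rules belong above broad allow rules.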

Step 6: Test with Simulation Mode

SAFECLAW_MODE=simulation npx @authensor/safeclaw

Run your agent through all its workflows. Review the simulation log to verify that every legitimate action is allowed and every dangerous action is blocked.

Step 7: Enforce

SAFECLAW_MODE=enforce npx @authensor/safeclaw

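Policy evaluation is sub-millisecond, so the overhead on your agent is negligible. Every decision is logged to a tamper-evident audit trail built on SHA-256 hash chains: each entry's hash depends on the entry before it, so altering a record after the fact is detectable.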
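A hash-chained log can be illustrated with a short TypeScript sketch. SafeClaw's actual record layout and hashing input are not described here, so the ChainedEntry shape, the all-zero genesis value, and the choice to hash prevHash + payload are assumptions. Only createHash from Node's built-in crypto module is a real API.

```typescript
import { createHash } from "node:crypto";

// Hypothetical entry shape; field names are illustrative only.
interface ChainedEntry {
  payload: string;  // serialized decision record
  prevHash: string; // hash of the previous entry (or the genesis value)
  hash: string;     // SHA-256 over prevHash + payload
}

// Assumed starting value for the first entry's prevHash.
const GENESIS = "0".repeat(64);

function sha256Hex(input: string): string {
  return createHash("sha256").update(input).digest("hex");
}

// Append a record whose hash commits to everything before it.
function appendEntry(chain: ChainedEntry[], payload: string): ChainedEntry[] {
  const last = chain[chain.length - 1];
  const prevHash = last ? last.hash : GENESIS;
  return [...chain, { payload, prevHash, hash: sha256Hex(prevHash + payload) }];
}

// Recompute every link; any edited payload or broken link fails verification.
function verifyChain(chain: ChainedEntry[]): boolean {
  let expectedPrev = GENESIS;
  for (const entry of chain) {
    if (entry.prevHash !== expectedPrev) return false;
    if (entry.hash !== sha256Hex(entry.prevHash + entry.payload)) return false;
    expectedPrev = entry.hash;
  }
  return true;
}

let chain: ChainedEntry[] = [];
chain = appendEntry(chain, JSON.stringify({ action: "file_read", decision: "ALLOW" }));
chain = appendEntry(chain, JSON.stringify({ action: "shell_exec", decision: "DENY" }));

const intact = verifyChain(chain);

// Simulate tampering: rewrite the first record without recomputing hashes.
const tampered = chain.map((e, i) =>
  i === 0 ? { ...e, payload: e.payload.replace("ALLOW", "DENY") } : e,
);
const tamperedOk = verifyChain(tampered);
```

Here intact is true and tamperedOk is false. The chain does not prevent edits; it makes them detectable, because changing any record invalidates its hash and every link after it.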

Example Policy

version: "1.0"
default: deny

rules:
  # ---- FILE PERMISSIONS ----
  # Read: block sensitive paths first. With first-match-wins,
  # these must come before the broader allow rules below.
  - action: file_read
    path: "*/.env"
    decision: deny
    reason: "Never read credentials"
  - action: file_read
    path: "~/.ssh/**"
    decision: deny
    reason: "Never read SSH keys"

  # Read: project files only
  - action: file_read
    path: "./src/**"
    decision: allow
    reason: "Read source code"
  - action: file_read
    path: "./data/**"
    decision: allow
    reason: "Read project data"
  - action: file_read
    path: "./package.json"
    decision: allow
    reason: "Read package manifest"

  # Write: output directory only
  - action: file_write
    path: "./output/**"
    decision: allow
    reason: "Write results to output"
  - action: file_write
    path: "./src/**"
    decision: require_approval
    reason: "Source edits need human review"

  # ---- SHELL PERMISSIONS ----
  - action: shell_exec
    command: "npm test*"
    decision: allow
    reason: "Run tests"
  - action: shell_exec
    command: "npm run build*"
    decision: allow
    reason: "Build project"
  - action: shell_exec
    command: "git log*"
    decision: allow
    reason: "View git history"
  - action: shell_exec
    command: "git diff*"
    decision: allow
    reason: "View diffs"
  - action: shell_exec
    command: "rm *"
    decision: deny
    reason: "No file deletion"
  - action: shell_exec
    command: "npm publish*"
    decision: deny
    reason: "No publishing"

  # ---- NETWORK PERMISSIONS ----
  - action: network
    domain: "api.anthropic.com"
    decision: allow
    reason: "Claude API"
  - action: network
    domain: "registry.npmjs.org"
    decision: allow
    reason: "npm registry lookups"
  - action: network
    domain: "*"
    decision: deny
    reason: "Block all other domains"

What Happens When It Works

ALLOW — Agent reads a source file within its permitted scope:

{
  "action": "file_read",
  "path": "./src/utils/parser.ts",
  "decision": "ALLOW",
  "rule": "Read source code",
  "timestamp": "2026-02-13T13:45:01Z",
  "hash": "p6q7r8s9..."
}

DENY — Agent tries to publish your package to npm:

{
  "action": "shell_exec",
  "command": "npm publish --access public",
  "decision": "DENY",
  "rule": "No publishing",
  "timestamp": "2026-02-13T13:45:03Z",
  "hash": "t0u1v2w3..."
}

REQUIRE_APPROVAL — Agent wants to edit a source file:

{
  "action": "file_write",
  "path": "./src/index.ts",
  "decision": "REQUIRE_APPROVAL",
  "rule": "Source edits need human review",
  "timestamp": "2026-02-13T13:45:05Z",
  "hash": "x4y5z6a7..."
}
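When reviewing a run, it can help to summarize records like the three above. The TypeScript sketch below assumes the entries are already loaded into memory with the same fields shown in the samples. How SafeClaw stores or exposes its log is not covered here, and countByDecision and needsAttention are hypothetical helper names.

```typescript
// Entry shape mirrors the sample records above (an assumption about the
// general log format, based only on those three examples).
interface LogEntry {
  action: string;
  path?: string;
  command?: string;
  decision: "ALLOW" | "DENY" | "REQUIRE_APPROVAL";
  rule: string;
  timestamp: string;
  hash: string;
}

// Tally how many actions ended in each decision.
function countByDecision(entries: LogEntry[]): Record<LogEntry["decision"], number> {
  const counts: Record<LogEntry["decision"], number> = {
    ALLOW: 0,
    DENY: 0,
    REQUIRE_APPROVAL: 0,
  };
  for (const entry of entries) counts[entry.decision] += 1;
  return counts;
}

// Everything that was blocked or paused is worth a manual look.
function needsAttention(entries: LogEntry[]): LogEntry[] {
  return entries.filter((e) => e.decision !== "ALLOW");
}

const sample: LogEntry[] = [
  { action: "file_read", path: "./src/utils/parser.ts", decision: "ALLOW",
    rule: "Read source code", timestamp: "2026-02-13T13:45:01Z", hash: "p6q7r8s9..." },
  { action: "shell_exec", command: "npm publish --access public", decision: "DENY",
    rule: "No publishing", timestamp: "2026-02-13T13:45:03Z", hash: "t0u1v2w3..." },
  { action: "file_write", path: "./src/index.ts", decision: "REQUIRE_APPROVAL",
    rule: "Source edits need human review", timestamp: "2026-02-13T13:45:05Z", hash: "x4y5z6a7..." },
];

const counts = countByDecision(sample);
const flagged = needsAttention(sample).map((e) => e.rule);
```

For the three sample records, counts holds one entry per decision and flagged contains the two rules that did not resolve to ALLOW. A denied action that the agent legitimately needs shows up here as a candidate for a new allow or require_approval rule.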

Common Mistakes

Defining permissions that are too broad. A file_read rule with path: "**" and decision: allow lets the agent read every file on the system, including credentials, SSH keys, and browser cookies. Be specific: list exactly which directories and files the agent needs access to.
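One way to catch this mistake early is a small check that flags allow rules with catch-all patterns. The TypeScript sketch below is a heuristic under stated assumptions: PolicyRule is a hypothetical in-memory shape rather than SafeClaw's schema, and the list of patterns treated as too broad is illustrative, not exhaustive.

```typescript
// Hypothetical rule shape for illustration; not SafeClaw's schema loader.
interface PolicyRule {
  action: "file_read" | "file_write" | "shell_exec" | "network";
  pattern: string; // path, command, or domain pattern
  decision: "allow" | "deny" | "require_approval";
}

// Patterns that match everything, or nearly everything, when paired
// with "allow". This set is a heuristic and an assumption.
const OVERLY_BROAD = new Set(["*", "**", "/**", "~/**", "./**"]);

// Return every allow rule whose pattern is a catch-all.
function findBroadAllows(rules: PolicyRule[]): PolicyRule[] {
  return rules.filter(
    (r) => r.decision === "allow" && OVERLY_BROAD.has(r.pattern.trim()),
  );
}

const candidate: PolicyRule[] = [
  { action: "file_read", pattern: "**", decision: "allow" },
  { action: "file_read", pattern: "./data/**", decision: "allow" },
  { action: "network", pattern: "*", decision: "deny" },
];

const risky = findBroadAllows(candidate);
```

Only the first rule is flagged. The catch-all network rule is left alone because a broad deny narrows access rather than widening it.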

Confusing OS-level permissions with agent permissions. Your agent runs as your user. Standard file permissions on Linux and macOS protect you from other users, not from processes running under your own account. Agent permissions must therefore be enforced at the application layer, above the OS, which is exactly what SafeClaw does.

Not using require_approval for gray-area actions. Not every action is clearly safe or clearly dangerous. Operations like editing source code, installing packages, or making API calls to third-party services benefit from human-in-the-loop approval. Use require_approval to get oversight without fully blocking the agent.

Try SafeClaw

Action-level gating for AI agents. Set it up in your browser in 60 seconds.

$ npx @authensor/safeclaw