How to Create AI Agent Security Policies
An AI agent without a security policy is a liability. Clawdbot leaked 1.5 million API keys in under a month -- not because the AI was malicious, but because nobody defined what it was and was not allowed to do. A security policy is that definition.
SafeClaw provides action-level gating for AI agents through structured policies. This guide covers how to think about policies, how to write them, and how to test them before they go live.
The Mental Model: Deny by Default
Start from this principle: your agent can do nothing until you explicitly allow it.
SafeClaw enforces deny-by-default. Before you write a single rule, every file write, shell command, and network request is blocked. This is not a suggestion -- it is the default state. You build your policy by carving out specific exceptions.
This is the same model used by firewalls, OS permissions, and every serious access control system. The alternative -- allow-by-default with specific denies -- requires you to anticipate every possible bad action. That approach fails. You cannot enumerate what you have not imagined.
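Concretely, the simplest valid policy is one with no rules at all: every action is blocked until you add an allow rule. Here is a minimal sketch in TypeScript, using the same policy shape as the examples later in this guide:
// With deny-by-default, an empty rule list means the agent can do nothing.
const lockedDown = {
  name: "locked-down",
  rules: [], // add allow rules one at a time as you identify real needs
};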
Policy Structure
A SafeClaw policy is a named collection of rules. Each rule specifies:
- action: The type of action (file_write, shell_exec, or network)
- effect: allow or deny
- pattern: The specific target (path pattern, command, or destination)
- agentId (optional): Which agent identity the rule applies to
Rules are evaluated top to bottom, and the first matching rule decides the outcome. This means rule order matters: place more specific rules above more general ones.
{
"name": "policy-name",
"rules": [
{
"action": "file_write",
"effect": "allow",
"pathPattern": "/project/src/**"
},
{
"action": "file_write",
"effect": "deny",
"pathPattern": "/project/.env"
}
]
}
In this example, the two patterns do not overlap: writes under /project/src/** hit the allow rule, a write to /project/.env falls through to the deny rule, and everything else is denied by default. Order starts to matter once patterns overlap -- if the allow pattern were /project/** instead, a write to /project/.env would match it first and the deny rule below would never trigger. Think carefully about ordering.
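The evaluation semantics are easy to hold in your head as code. The sketch below illustrates the behavior described above (first match wins, deny by default); it is not SafeClaw's actual implementation, and the Rule and ActionRequest shapes are simplified for readability.
type Effect = "allow" | "deny";

interface Rule {
  action: "file_write" | "shell_exec" | "network";
  effect: Effect;
  pattern: string; // pathPattern, command, or destination
}

interface ActionRequest {
  action: Rule["action"];
  target: string; // the path, command line, or destination being attempted
}

function evaluate(
  rules: Rule[],
  request: ActionRequest,
  matches: (pattern: string, target: string) => boolean
): Effect {
  for (const rule of rules) {
    // Rules are checked top to bottom; the first rule whose action type
    // and pattern both match decides the outcome.
    if (rule.action === request.action && matches(rule.pattern, request.target)) {
      return rule.effect;
    }
  }
  // No rule matched: deny by default.
  return "deny";
}
With this model in mind, ordering questions become mechanical: whichever matching rule appears first wins.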
Step 1: Inventory What Your Agent Needs
Before writing rules, answer these questions:
- Which files does the agent need to write? List the directories and file types.
- Which shell commands does the agent need to run? Think: build tools, test runners, linters.
- Which network destinations does the agent need to reach? Package registries, APIs, documentation sites.
Be specific in your answers. "Somewhere in the project" is not something you can turn into rules; ".ts and .tsx files under /project/src/" is.
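A hypothetical inventory for a TypeScript web project might look like this (the paths and commands are examples drawn from the policies later in this guide, not defaults):
// Rough inventory, written down before any rules exist.
const inventory = {
  fileWrites: [
    "/home/user/project/src/**/*.ts", // source files
    "/home/user/project/tests/**",    // test files
  ],
  shellCommands: ["npm install", "npm test", "npm run build"],
  networkDestinations: ["registry.npmjs.org"],
};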
Step 2: Write Allow Rules for Each Need
Convert your inventory into rules. Here are patterns for the three action types:
File Write Rules
{
"action": "file_write",
"effect": "allow",
"pathPattern": "/home/user/project/src/*/.ts"
}
Path patterns use glob syntax. ** matches any number of directories. * matches a single filename segment.
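If you want to sanity-check a pattern before putting it in a policy, a few lines of TypeScript are enough. The helper below is a deliberately rough approximation of glob matching for illustration only; SafeClaw's real matcher may differ in edge cases (for instance, most glob engines also let ** match zero directories).
import { strict as assert } from "node:assert";

// Rough glob-to-regex conversion: "**" spans any number of directories,
// "*" stays within a single path segment.
function globMatch(pattern: string, path: string): boolean {
  const source = pattern
    .replace(/[.+^${}()|[\]\\]/g, "\\$&") // escape regex metacharacters
    .replace(/\*\*/g, "\u0000")           // protect "**" before handling "*"
    .replace(/\*/g, "[^/]*")              // "*" -> one path segment
    .replace(/\u0000/g, ".*");            // "**" -> any depth
  return new RegExp(`^${source}$`).test(path);
}

assert.ok(globMatch("/home/user/project/src/**/*.ts", "/home/user/project/src/app/index.ts"));
assert.ok(!globMatch("/home/user/project/src/**/*.ts", "/home/user/project/src/app/index.js"));
assert.ok(!globMatch("/home/user/project/src/**/*.ts", "/home/user/project/README.md"));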
Shell Exec Rules
{
"action": "shell_exec",
"effect": "allow",
"command": "npm test"
}
{
"action": "shell_exec",
"effect": "allow",
"command": "npm run build"
}
Be explicit about which commands you allow. Avoid wildcards on shell commands unless absolutely necessary -- a pattern like npm * would also cover npm exec, which can launch arbitrary programs.
Network Rules
{
"action": "network",
"effect": "allow",
"destination": "registry.npmjs.org"
}
Specify exact destinations. If your agent needs to fetch packages, allow the registry. If it calls an API, allow that API domain.
Step 3: Add Explicit Denies for Sensitive Targets
Even with deny-by-default, add explicit deny rules for critical targets. This provides defense-in-depth and makes your security intent clear to anyone reading the policy.
{
"action": "file_write",
"effect": "deny",
"pathPattern": "**/.env"
},
{
"action": "file_write",
"effect": "deny",
"pathPattern": "/.ssh/"
},
{
"action": "shell_exec",
"effect": "deny",
"command": "rm -rf *"
},
{
"action": "network",
"effect": "deny",
"destination": "*.pastebin.com"
}
Place these deny rules above your allow rules so they match first.
Example: Web Development Agent Policy
A full policy for a typical web development agent:
{
"name": "web-dev-agent",
"rules": [
{
"action": "file_write",
"effect": "deny",
"pathPattern": "*/.env"
},
{
"action": "file_write",
"effect": "deny",
"pathPattern": "/.ssh/"
},
{
"action": "file_write",
"effect": "deny",
"pathPattern": "/.aws/"
},
{
"action": "file_write",
"effect": "allow",
"pathPattern": "/home/user/project/src/**"
},
{
"action": "file_write",
"effect": "allow",
"pathPattern": "/home/user/project/tests/**"
},
{
"action": "file_write",
"effect": "allow",
"pathPattern": "/home/user/project/package.json"
},
{
"action": "shell_exec",
"effect": "allow",
"command": "npm install"
},
{
"action": "shell_exec",
"effect": "allow",
"command": "npm test"
},
{
"action": "shell_exec",
"effect": "allow",
"command": "npm run build"
},
{
"action": "shell_exec",
"effect": "allow",
"command": "npx tsc --noEmit"
},
{
"action": "network",
"effect": "allow",
"destination": "registry.npmjs.org"
}
]
}
This agent can write source and test files, run standard npm commands, and pull packages from npm. It cannot touch credentials, access SSH keys, or reach arbitrary network destinations.
Example: Data Analysis Agent Policy
{
"name": "data-analysis-agent",
"rules": [
{
"action": "file_write",
"effect": "deny",
"pathPattern": "*/.env"
},
{
"action": "file_write",
"effect": "allow",
"pathPattern": "/home/user/analysis/output/**"
},
{
"action": "file_write",
"effect": "allow",
"pathPattern": "/home/user/analysis/notebooks/*/.ipynb"
},
{
"action": "shell_exec",
"effect": "allow",
"command": "python *.py"
},
{
"action": "shell_exec",
"effect": "allow",
"command": "jupyter nbconvert **"
},
{
"action": "network",
"effect": "deny",
"destination": "*"
}
]
}
This agent writes output files and notebooks, runs Python scripts, and has no network access at all. For data analysis on local datasets, that is exactly right.
Step 4: Test with Simulation Mode
Before enforcing any policy, enable simulation mode in the SafeClaw dashboard. In simulation mode, SafeClaw evaluates every action against your rules but does not block anything. It logs "would allow" and "would deny" for each action.
Run your agent through a realistic workflow, then review the simulation log (a quick way to sift an exported log in code is sketched after the checklist below):
- Are legitimate actions showing as "would deny"? Your allow rules are too narrow.
- Are dangerous actions showing as "would allow"? Your rules are too permissive.
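If the log can be exported, a few lines of code make the review faster. The record shape below is a guess for illustration only; adapt the field names to whatever SafeClaw's simulation log actually contains.
// Split a simulation log into the two piles worth reading.
interface SimulationEntry {
  action: "file_write" | "shell_exec" | "network";
  target: string; // path, command, or destination
  verdict: "would allow" | "would deny";
}

function review(log: SimulationEntry[]): void {
  const blocked = log.filter((e) => e.verdict === "would deny");
  const permitted = log.filter((e) => e.verdict === "would allow");

  // Legitimate actions in this pile mean an allow rule is too narrow.
  console.log(`${blocked.length} actions would have been denied:`);
  for (const e of blocked) console.log(`  ${e.action}: ${e.target}`);

  // Skim this pile for anything that should not be possible at all.
  console.log(`${permitted.length} actions would have been allowed.`);
}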
Step 5: Iterate and Refine
Policies are not write-once. As your agent's tasks evolve, your policies should evolve with them. SafeClaw's tamper-proof audit trail (SHA-256 hash chain) gives you a complete record of every decision. Review it periodically (a sketch of how such a chain can be verified follows the list below). Look for:
- Denied actions that should have been allowed (friction in your workflow)
- Allowed actions that make you uncomfortable (tighten the rules)
- Patterns you did not anticipate (new rules needed)
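As a sketch of why a hash chain is tamper-evident: each record carries the hash of the one before it, so editing, deleting, or reordering any entry breaks every hash that follows. The record format below is hypothetical; SafeClaw's actual audit schema may differ.
import { createHash } from "node:crypto";

// Hypothetical audit record: the logged decision plus chain bookkeeping.
interface AuditRecord {
  payload: string;  // serialized decision (action, target, effect, timestamp, ...)
  prevHash: string; // hash of the previous record
  hash: string;     // SHA-256 over prevHash + payload
}

function verifyChain(records: AuditRecord[]): boolean {
  let prev = "0".repeat(64); // genesis value for the first record
  for (const record of records) {
    const expected = createHash("sha256")
      .update(record.prevHash + record.payload)
      .digest("hex");
    // Any modified, removed, or reordered entry fails here or on a later record.
    if (record.prevHash !== prev || record.hash !== expected) return false;
    prev = record.hash;
  }
  return true;
}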
Common Mistakes
Too-broad path patterns. Allowing /** defeats the purpose. Be specific about directories and file types.
Forgetting rule order. First match wins. If a broad allow rule sits above a specific deny rule, the deny will never fire.
No explicit denies for secrets. Deny-by-default handles this, but explicit deny rules for .env, .ssh, and .aws make your intent obvious and guard against accidental over-permissive allow rules.
Skipping simulation mode. Going straight to enforcement means learning your policy is wrong when your agent gets blocked mid-task. Test first.
Getting Started
Install SafeClaw:
npx @authensor/safeclaw
The browser dashboard opens with a setup wizard. Free tier, no credit card, 7-day renewable keys. SafeClaw is built on the Authensor framework -- 446 tests, TypeScript strict mode, zero dependencies. Works with Claude, OpenAI, and LangChain.
Start deny-by-default. Add only what your agent actually needs. Test in simulation. Enforce with confidence. That is how you create AI agent security policies that work.
Try SafeClaw
Action-level gating for AI agents. Set it up in your browser in 60 seconds.
$ npx @authensor/safeclaw