2025-12-11 · Authensor

Deny-by-Default vs Allow-by-Default: The Only Sane Security Model for AI Agents

Every AI agent you install today starts with the same assumption: it can do everything.

Write any file. Execute any shell command. Make any network request. The default is full access. If you want restrictions, you add them after the fact. Maybe. If you know what to restrict. If you even realize restrictions are possible.

This is allow-by-default. It is the dominant security model for AI agents in 2026. And it is fundamentally broken.

What Allow-by-Default Actually Means

Allow-by-default means the agent has permission to perform any action unless something explicitly blocks it. You start at full access and try to carve out restrictions.

In practice, this means:

The problem is obvious. You cannot anticipate every dangerous action in advance. You cannot write a complete blacklist. You will always miss something.

Clawdbot leaked 1.5 million API keys in under a month. The users who installed it gave it allow-by-default access to their environment. They did not know they needed to restrict it. They did not know what to restrict. The tool operated with full permissions and it failed catastrophically.

What Deny-by-Default Actually Means

Deny-by-default inverts the model. The agent starts with zero permissions. Every action is blocked. You explicitly grant access to specific actions, specific paths, and specific destinations.

If you did not write a rule allowing it, it does not happen. Period.

This is how SafeClaw works. When you install SafeClaw, the initial policy is empty. The agent cannot write any files. It cannot execute any shell commands. It cannot make any network requests. You build up from nothing.

# SafeClaw default state: everything denied
file_write  → DENY (no rules defined)
shell_exec  → DENY (no rules defined)
network     → DENY (no rules defined)

You then add rules for exactly what the agent needs:

# Allow file writes only to your project directory
file_write to ~/projects/myapp/** → ALLOW

Allow shell execution of npm commands only

shell_exec matching "npm *" → ALLOW

Allow network requests to your API only

network to api.myservice.com → ALLOW

Everything else remains denied

The agent can now write files in your project directory, run npm commands, and talk to your API. It cannot write to /etc/passwd. It cannot run sudo rm -rf /. It cannot exfiltrate data to an unknown server. Not because you thought to block those things, but because you never allowed them.

Why Blacklists Always Fail

Allow-by-default security relies on blacklists. You list the things you do not want to happen and allow everything else. This approach has been tried and has failed across every domain in computing.

Email spam filters started with blacklists. Block known spam domains. Spammers registered new domains. The blacklist was always behind.

Web application firewalls started with blacklists. Block known SQL injection patterns. Attackers found new patterns. The blacklist was always incomplete.

Antivirus software started with blacklists. Detect known malware signatures. New malware appeared daily. The blacklist was always stale.

The pattern is the same every time. Blacklists are reactive. They protect against known threats. They fail against novel ones. And AI agents, by their nature, generate novel behavior. That is the entire point. An agent that only did predictable things would not be useful.

You cannot blacklist unpredictable behavior. You can only whitelist expected behavior.

The Firewall Precedent

Network firewalls solved this problem decades ago. A properly configured firewall denies all inbound traffic by default. You open specific ports for specific services. Port 443 for HTTPS. Port 22 for SSH. Everything else is dropped.

Nobody configures a production firewall by allowing all traffic and then trying to block bad packets. That would be insane. Yet this is exactly how most AI agents operate.

SafeClaw applies the same principle. Deny all actions by default. Allow specific actions through specific rules. The security model is not new. Applying it to AI agents is.

What Happens When the Control Plane Is Unreachable

This is where most deny-by-default implementations fail. They evaluate policies on a remote server. If that server goes down, the tool has to decide: do we block everything or allow everything?

Many tools fail open. If the policy server is unreachable, they allow actions to proceed. This converts deny-by-default into allow-by-default the moment there is a network issue.

SafeClaw evaluates policies locally. Sub-millisecond. No network round trips in the critical path. If the Authensor control plane is unreachable, SafeClaw does not fail open. Everything stays blocked. Deny-by-default is maintained regardless of network state.

# Policy evaluation is local

No network dependency for gating decisions

Unreachable control plane = continued denial

Building Up From Zero

The deny-by-default approach does require more upfront work. You need to define what the agent can do before it can do anything. This is intentional.

SafeClaw makes this straightforward with its browser dashboard and setup wizard:

npx @authensor/safeclaw

The setup wizard walks you through creating your first policy. It asks what your agent needs to do and generates rules accordingly. You start with a minimal permission set and expand as needed.

Simulation mode lets you test policies without enforcing them. Every action gets logged as "would allow" or "would deny." You can see exactly what your policy permits before it goes live.

This is the workflow:

  1. Start with deny-all
  2. Run your agent in simulation mode
  3. Review the log of would-be actions
  4. Add ALLOW rules for legitimate actions
  5. Switch to enforcement mode
  6. Monitor the tamper-proof audit trail
You build up permissions based on observed behavior rather than guessing what to block.

The Comparison

| Aspect | Allow-by-Default | Deny-by-Default (SafeClaw) |
|---|---|---|
| Starting state | Full access | Zero access |
| Rule approach | Blacklist (block bad) | Whitelist (allow good) |
| Unknown actions | Allowed | Denied |
| Novel agent behavior | Unchecked | Blocked until reviewed |
| Failure mode | Agent does unexpected things | Agent does nothing until approved |
| Setup effort | Low (and dangerous) | Moderate (and safe) |

Why This Is the Only Sane Model

An AI agent that can do anything will eventually do something you did not want. Not out of malice. Out of misinterpretation, hallucination, or unexpected tool use. The question is not whether it will happen. The question is whether it will be blocked when it does.

Allow-by-default answers that question with "probably not."

Deny-by-default answers it with "definitely."

SafeClaw implements deny-by-default at the action level. Every file_write, shell_exec, and network request is evaluated against your policy before execution. 446 tests in TypeScript strict mode. Zero dependencies. Sub-millisecond evaluation. Tamper-proof audit trail.

The free tier is available with 7-day renewable keys. No credit card required.

npx @authensor/safeclaw

Stop starting from full access and trying to restrict. Start from zero and build up.


SafeClaw is built on Authensor. Try it at safeclaw.onrender.com.

Try SafeClaw

Action-level gating for AI agents. Set it up in your browser in 60 seconds.

$ npx @authensor/safeclaw