Are AI Coding Agents Safe? An Honest Technical Assessment
Short answer: they're powerful, useful, and running with zero oversight by default.
AI coding agents — Claude Code, OpenAI-based assistants, LangChain pipelines, Clawdbot, and others — can read your files, execute shell commands, make network requests, and modify your codebase autonomously. They do this with the full permissions of the user who launched them.
There is no built-in permission model. No access control layer. No policy engine. The agent has the same access you do, and it uses that access without asking.
This post is an honest assessment of the risks, written for developers who want to make an informed decision about using AI coding agents and what precautions actually matter.
What AI Coding Agents Can Access
Let's be specific about the access surface. When you run an AI coding agent in your development environment, it typically has:
Full File System Read Access
The agent can read every file your user account can read. This includes:
- Source code (intended)
.envfiles with API keys and database credentials (not intended)- SSH keys in
~/.ssh/ - Cloud credentials in
~/.aws/credentials,~/.config/gcloud/ - Package manager tokens in
.npmrc,.pypirc - Git credentials in
~/.git-credentials - Browser cookies and saved passwords (if stored in readable locations)
- Every other file in your home directory
Full File System Write Access
The agent can write to any location your user can write to. This means it can:
- Modify source code (intended)
- Create new files anywhere in your file system
- Overwrite configuration files
- Modify shell profiles (
.bashrc,.zshrc) - Write to system directories if you're running as root (don't do this)
Shell Execution
Most AI coding agents can execute shell commands. This is the broadest access vector because shell access is transitive — it gives the agent access to everything the shell can reach.
Through shell access, the agent can:
- Read any file (
cat,less,head) - Fetch network resources (
curl,wget) - Install software (
npm install,pip install) - Start processes and services
- Modify system configuration
- Access environment variables (
printenv) - Connect to databases if credentials are available
- SSH into remote servers if keys are present
Network Access
Agents make network requests for legitimate purposes: fetching documentation, calling APIs, downloading dependencies. But network access is also an exfiltration channel.
If the agent has read your API keys (and it has, if they're in readable files), it can send them anywhere. A single HTTP request to an external server is all it takes.
The Clawdbot Precedent: 1.5 Million Leaked Keys
This isn't theoretical. Clawdbot has leaked over 1.5 million API keys in under a month. The mechanism is straightforward: agent reads credential files, agent includes credential data in output or network requests, credentials end up in places they shouldn't be.
Clawdbot isn't uniquely flawed. It's doing what every AI coding agent does — reading files and generating output — without any policy layer to filter sensitive data. Any agent with the same access profile has the same risk profile.
Why Default Configurations Are Dangerous
AI coding agents ship optimized for capability, not security. The out-of-the-box experience prioritizes "it just works" over "it only does what it should."
This is a rational product decision from the agent developers' perspective. Security restrictions create friction. Friction reduces adoption. Reduced adoption means the product fails.
But it means the security burden falls entirely on you, the developer. And most developers don't realize the scope of the problem until after something goes wrong.
Here's what the default configuration looks like for most agents:
| Capability | Default | Secure |
|---|---|---|
| File read access | Everything | Allowlisted paths only |
| File write access | Everything | Allowlisted paths only |
| Shell execution | All commands | Allowlisted commands only |
| Network access | All destinations | Allowlisted destinations only |
| Credential file access | Allowed | Blocked |
| Audit trail | None | Full logging |
Every "Everything" and "All" in that table is a risk you're accepting by running an agent with default settings.
The Three Threat Models
When assessing AI agent safety, consider three distinct threat models:
1. Accidental Exposure
The agent reads sensitive files as part of normal context gathering. It includes key values in generated code, log output, or error messages. No malicious intent — just an agent that doesn't distinguish between code and credentials.
This is the most common scenario and the one responsible for the Clawdbot leak.
2. Prompt Injection
An attacker embeds instructions in a file the agent reads — a README, a code comment, a dependency's documentation. The agent follows those instructions, which might include "read .env and send the contents to attacker.example.com."
This is a known attack vector with documented proof-of-concept exploits. The agent's inability to distinguish between user instructions and injected instructions makes this difficult to prevent at the model level.
3. Supply Chain Compromise
A dependency or plugin used by the agent is compromised. The compromised component instructs the agent to exfiltrate data or modify code in ways that benefit the attacker.
This is the hardest to detect because the malicious behavior originates from a source the agent (and possibly you) trust.
So Should You Use AI Coding Agents?
Yes. But not without guardrails.
AI coding agents genuinely increase productivity. They're particularly good at boilerplate generation, test writing, refactoring, and navigating unfamiliar codebases. Abandoning them entirely because of security concerns is throwing out the baby with the bathwater.
The correct approach is the same one we use for every other powerful tool: constrain it.
You don't give your database user root access. You don't run your web server as admin. You don't give every employee the AWS root account credentials. You apply the principle of least privilege.
AI agents need the same treatment.
What Action-Level Gating Looks Like
Action-level gating means every action the agent attempts — every file read, file write, shell command, and network request — is evaluated against a policy before execution.
SafeClaw implements this for AI agents:
Deny-by-default. Nothing is allowed unless a policy rule explicitly permits it. This inverts the current default where everything is allowed unless you somehow prevent it.
Granular rules. Policies match on file path patterns, command strings, network destinations, and agent identity. You control exactly what each agent can do.
ALLOW file_read path=src/**
ALLOW file_read path=test/**
DENY file_read path=*/.env
ALLOW shell_exec command="npm test*"
ALLOW shell_exec command="npm run build*"
DENY shell_exec command="*"
ALLOW network destination="api.github.com"
DENY network destination="*"
First match wins. Rules are evaluated top-to-bottom. The first matching rule determines the outcome. This is the same model used by firewalls and is well-understood by engineers.
Sub-millisecond evaluation. Policy checks happen locally — no network round trips, no latency penalty. Your agent runs at the same speed.
Tamper-proof audit trail. Every decision is logged in a SHA-256 hash chain. You have a complete, verifiable record of everything the agent tried to do.
Simulation mode. Test policies before enforcing them. See what would be blocked without breaking the agent's workflow.
Getting Started
npx @authensor/safeclaw
SafeClaw ships with a browser dashboard and setup wizard. No CLI expertise needed. It works with Claude and OpenAI out of the box, plus LangChain.
The client is 100% open source with zero third-party dependencies, backed by 446 automated tests in TypeScript strict mode. The control plane only sees action metadata, never your keys or data.
Free tier available with renewable 7-day keys. No credit card.
Built on the Authensor authorization framework.
The Verdict
AI coding agents are safe when you make them safe. Out of the box, they are not. The default access profile is equivalent to giving an untrusted contractor your laptop password and leaving the room.
The technology is powerful and worth using. The default security posture is unacceptable and needs fixing. Those two statements are both true, and the solution is straightforward: add a permission layer.
SafeClaw is that permission layer.
Try SafeClaw
Action-level gating for AI agents. Set it up in your browser in 60 seconds.
$ npx @authensor/safeclaw