AI Agent Incident Response: What to Do When Your Agent Goes Wrong
If your AI agent just deleted files, leaked credentials, made unauthorized API calls, or caused other damage, follow these five steps: stop the agent immediately, assess the damage, recover what you can, investigate what happened, and install prevention controls so it cannot happen again. This guide provides the specific commands and procedures for each step.
Step 1: STOP the Agent Immediately
Kill the agent process. Do not ask it to stop. Do not try to fix the problem with the agent still running — it may cause further damage while you investigate.
Kill by process name:
pkill -f "node.*agent"
pkill -f "python.*agent"
Kill by port (if the agent runs a server; replace 3000 with its port):
lsof -ti:3000 | xargs kill -9
Kill all Node processes (a last resort if you cannot identify the agent process; this also stops unrelated Node apps):
killall node
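Before moving on, confirm nothing survived. A quick check, assuming the agent's command line contains the word "agent" (adjust the pattern to match yours):
# List any surviving processes whose command line matches the pattern
pgrep -fl agent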
Revoke API keys the agent had access to. If the agent could read .env files or credential stores, assume those credentials are compromised. Rotate them now, before continuing to the next step. Go to each service's dashboard (OpenAI, AWS, Stripe, database providers) and regenerate keys.
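Some providers also support rotation from the command line. As a sketch for AWS, assuming an IAM user named agent-user and a placeholder key ID:
# Deactivate the compromised key first, then issue a replacement
aws iam update-access-key --access-key-id AKIAEXAMPLEKEY --status Inactive --user-name agent-user
aws iam create-access-key --user-name agent-user
# Delete the old key once nothing legitimate depends on it
aws iam delete-access-key --access-key-id AKIAEXAMPLEKEY --user-name agent-user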
Step 2: Assess the Damage
Determine what the agent did. Check these sources in order:
Check git status and recent changes:
git status
git log --oneline -20
git diff HEAD~5
If the agent made commits, git log shows them. If it modified tracked files without committing, git diff reveals the unstaged changes (add --staged to see staged ones).
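To inspect a suspicious commit in detail (the hash here is a placeholder):
git show --stat abc1234   # files touched and change sizes
git show abc1234          # full diff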
Check recently modified files:
find . -type f -mmin -30 -not -path './.git/*'
This lists every file modified in the last 30 minutes. Narrow the time window based on when you noticed the problem.
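If you need a wider window sorted newest-first, one variant (works with both GNU and BSD find):
find . -type f -mmin -120 -not -path './.git/*' -exec ls -lt {} +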
Check for deleted files:
git ls-files --deleted
If files were deleted from a git repository, this lists them.
Check SafeClaw audit trail (if installed):
If SafeClaw was running, visit safeclaw.onrender.com and review the audit log. Every action the agent attempted — allowed, denied, or escalated — is recorded with timestamps, action types, paths, and policy decisions. The SHA-256 hash chain ensures these records have not been tampered with.
Check shell history:
history | tail -50
If the agent executed commands through your shell, they may appear in history.
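Note that shells usually write history to disk only when a session exits, so a still-open session may not show the agent's commands yet. Reading the history files directly covers other sessions (paths assume bash or zsh defaults):
tail -50 ~/.bash_history ~/.zsh_history 2>/dev/null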
Check network connections (if data exfiltration is suspected):
netstat -an | grep ESTABLISHED
lsof -i -P
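lsof prints the owning PID next to each connection; look that process up before drawing conclusions (the PID here is a placeholder):
ps -p 52844 -o pid,ppid,lstart,command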
Step 3: Recover
Restore deleted or modified files from git (the two commands below are equivalent; git restore is the modern form):
git checkout -- path/to/file
git restore path/to/file
Restore to a specific commit:
git checkout abc1234 -- path/to/file
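If you are unsure which commit predates the incident, the file's history shows the candidates:
git log --oneline -- path/to/file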
Restore from Time Machine (macOS):
Navigate to the directory in Finder, enter Time Machine, and restore the files from before the incident.
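Time Machine also has a command-line interface if Finder is inconvenient; a sketch with placeholder paths:
tmutil listbackups
sudo tmutil restore -v "/Volumes/Backup/2026-01-01-120000/Macintosh HD/Users/you/project/file" ./file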
Restore from backups:
If you use automated backups (rsync, Backblaze, AWS S3), restore files from the most recent pre-incident snapshot.
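For rsync-style backups, restoring is a copy in the other direction. A sketch with hypothetical paths; dry-run first, then drop the -n:
rsync -avn /backups/project/2026-01-01/ ./project/
rsync -av /backups/project/2026-01-01/ ./project/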
If credentials were exposed:
- Rotate every key the agent could have accessed
- Check service logs for unauthorized usage of those keys (see the CloudTrail sketch after this list)
- If API keys were used by an attacker, contact the service provider to report unauthorized access
- Review billing dashboards for unexpected charges
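For AWS specifically, CloudTrail can list every API call made with a given access key. A sketch; the key ID and start time are placeholders:
aws cloudtrail lookup-events \
  --lookup-attributes AttributeKey=AccessKeyId,AttributeValue=AKIAEXAMPLEKEY \
  --start-time 2026-01-01T00:00:00Z \
  --max-results 50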
The Clawdbot incident exposed 1.5 million API keys because credential files were readable and network access was unrestricted. Many of those keys were used for unauthorized API calls before they were rotated. Speed matters — rotate immediately.
Step 4: Investigate
Once the immediate damage is contained, determine the root cause.
What action caused the damage? Was it a file deletion, an unauthorized write, a destructive shell command, or a network exfiltration? Knowing the action type tells you which control was missing.
What triggered the action? Did the agent misinterpret an instruction? Was it prompt-injected by malicious content in a file it read? Did it hallucinate a step in its plan? Review the agent's conversation log or execution trace.
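If the agent keeps a transcript or execution log, searching it for the damaging action often pinpoints the trigger. A sketch, assuming a log at ./agent.log (the path and patterns are examples):
grep -nE "rm -rf|curl |wget |\.env|\.ssh" agent.log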
What permissions did the agent have? Could the agent access credential files? Did it have unrestricted shell access? Could it make arbitrary network requests? Map the agent's actual capabilities against what it needed.
Was there any safety layer in place? If no action-level gating was installed, the agent had the same permissions as the user account it ran under — full access to everything.
Step 5: Prevent Recurrence
Install action-level gating so the agent cannot repeat the incident.
Install SafeClaw:
npx @authensor/safeclaw
Get your free API key at safeclaw.onrender.com (7-day renewable, no credit card).
Create a deny-by-default policy that explicitly allows only the actions your agent needs:
version: "1.0"
default: deny
rules:
  # Allow reading source code only
  - action: file_read
    path: "./src/**"
    decision: allow
    reason: "Agent can read source files"
  # Allow writing to output directory only
  - action: file_write
    path: "./output/**"
    decision: allow
    reason: "Agent can write to output only"
  # Block credential files explicitly
  - action: file_read
    path: "**/.env"
    decision: deny
    reason: "Block credential file access"
  - action: file_read
    path: "**/.ssh/**"
    decision: deny
    reason: "Block SSH key access"
  # Allow only specific safe commands
  - action: shell_exec
    command: "npm test"
    decision: allow
    reason: "Agent can run tests"
  - action: shell_exec
    command: "npm run build"
    decision: allow
    reason: "Agent can run builds"
  # Require human approval for deployments
  - action: shell_exec
    command: "git push*"
    decision: require_approval
    reason: "Deployments need human review"
  # Block all network except known APIs
  - action: network
    domain: "api.openai.com"
    decision: allow
    reason: "Agent can call OpenAI"
  - action: network
    domain: "*"
    decision: deny
    reason: "Block all other outbound traffic"
Run simulation mode first to verify the policy handles your agent's workflows:
SAFECLAW_MODE=simulation npx @authensor/safeclaw
Then enforce:
SAFECLAW_MODE=enforce npx @authensor/safeclaw
SafeClaw evaluates every action in sub-millisecond time. The policy engine has 446 tests, runs in TypeScript strict mode, and has zero third-party dependencies. Every action decision is recorded in a tamper-proof audit trail (SHA-256 hash chain), giving you the investigation data you lacked during this incident.
After the Incident
Document what happened, what the impact was, and what controls you put in place. If the incident affected users or customers, follow your organization's disclosure procedures. If credentials were leaked, monitor for unauthorized usage for at least 30 days after rotation.
The goal is not zero incidents — it is ensuring that the same class of incident cannot happen twice. Deny-by-default gating ensures that any action not explicitly permitted is blocked, covering both the specific failure mode you experienced and the ones you have not yet imagined.
Cross-References
- API Key Exfiltration Threat
- How to Prevent AI Agents from Deleting Files
- How to Prevent AI Agents from Reading .env Files
- Tamper-Proof Audit Trail Specification
- How to Make Your AI Agent Safe
Try SafeClaw
Action-level gating for AI agents. Set it up in your browser in 60 seconds.
$ npx @authensor/safeclaw