2026-01-22 · Authensor

How to Set Up Human Approval for AI Agent Actions

To require human approval before your AI agent executes high-risk actions, use SafeClaw's REQUIRE_APPROVAL policy effect. When an agent attempts an action matching a require_approval rule, SafeClaw suspends execution and waits for a human to approve or reject the action. The agent cannot proceed until the human responds. This is human-in-the-loop (HITL) control at the action layer. Install with npx @authensor/safeclaw.
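
The contract is simple: every proposed action resolves to one of three outcomes, and only a require_approval outcome holds the agent until a reviewer responds. The short TypeScript sketch below models that contract in standalone code; the type and function names are hypothetical illustrations, not SafeClaw's actual API.

type Decision = "allow" | "deny" | "require_approval";

interface GateResult {
  decision: Decision;
  reason: string;
}

// Illustrative only: an action either proceeds, is blocked outright, or is
// suspended until a human approves or rejects it in the dashboard.
async function gate(result: GateResult, askHuman: () => Promise<boolean>): Promise<boolean> {
  if (result.decision === "allow") return true;   // execute immediately
  if (result.decision === "deny") return false;   // never executes
  return askHuman();                              // agent waits here for a human decision
}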

When to Require Approval

Not every action needs human review. Approval workflows introduce latency, because the agent pauses until a human responds. Apply require_approval selectively to actions where the cost of a mistake is high (a short sketch after this list shows how these categories might be flagged in code):

Destructive operations. File deletions, directory removals, database drops, and git push --force cannot be undone. A human review step prevents irreversible damage.

Production deployments. Deploying code, pushing to production branches, or modifying infrastructure should require explicit human sign-off.

External API calls with side effects. Sending emails, processing payments, creating accounts, or posting to public APIs have real-world consequences that warrant review.

Credential file access. While credential reads should generally be denied outright, some workflows require agents to access specific secrets under supervision.

Network requests to new domains. Outbound connections to domains not in the standard allow list may indicate data exfiltration or unintended behavior.
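
As a rough illustration of the judgment these categories encode, here is a small TypeScript predicate that flags a proposed command or file path as high-risk. The patterns are examples drawn from the list above, not SafeClaw's classification logic, and the helper itself is hypothetical.

// Hypothetical helper: flags actions that are reasonable candidates for
// require_approval, based on the categories above. Illustrative only.
const HIGH_RISK_COMMAND_PATTERNS: RegExp[] = [
  /^rm\s+-rf\b/,              // destructive deletion
  /^git\s+push\s+--force\b/,  // irreversible history rewrite
  /^git\s+push\b/,            // deployments often ride on pushes
  /^(npm|pip)\s+install\b/,   // pulls third-party code into the project
];

const SENSITIVE_PATH_PATTERNS: RegExp[] = [
  /\.env(\..+)?$/,            // credential files and their variants
  /\.ssh\//,                  // SSH keys
  /\.aws\//,                  // cloud credentials
];

function isHighRisk(action: { command?: string; path?: string }): boolean {
  const command = action.command ?? "";
  const path = action.path ?? "";
  return (
    HIGH_RISK_COMMAND_PATTERNS.some((p) => p.test(command)) ||
    SENSITIVE_PATH_PATTERNS.some((p) => p.test(path))
  );
}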

Step 1: Install SafeClaw

npx @authensor/safeclaw

The client has zero third-party dependencies, is 100% open source under the MIT license, and ships with 446 tests in TypeScript strict mode.

Step 2: Get Your Free API Key

Visit safeclaw.onrender.com. The free tier includes a 7-day renewable key, with no credit card required. The browser dashboard is where you will review and approve pending actions.

Step 3: Define Your Policy with Approval Rules

A realistic policy uses all three decision types: allow for safe routine actions, deny for actions that should never happen, and require_approval for actions that are sometimes legitimate but need human judgment.

version: "1.0"
default: deny

rules:
  # ── ALWAYS ALLOWED (routine, low-risk) ─────────────────────

  # Read source code and documentation
  - action: file_read
    path: "./src/**"
    decision: allow
    reason: "Reading source code is routine"

  - action: file_read
    path: "./docs/**"
    decision: allow
    reason: "Reading documentation is routine"

  - action: file_read
    path: "./tests/**"
    decision: allow
    reason: "Reading tests is routine"

  - action: file_read
    path: "./*.json"
    decision: allow
    reason: "Reading config files is routine"

  # Write to output and build directories
  - action: file_write
    path: "./output/**"
    decision: allow
    reason: "Writing output is routine"

  - action: file_write
    path: "./dist/**"
    decision: allow
    reason: "Build artifacts are routine"

  # Safe shell commands
  - action: shell_exec
    command: "npm test"
    decision: allow
    reason: "Running tests is routine"

  - action: shell_exec
    command: "npm run lint"
    decision: allow
    reason: "Linting is routine"

  - action: shell_exec
    command: "npm run build"
    decision: allow
    reason: "Building is routine"

  - action: shell_exec
    command: "git status"
    decision: allow
    reason: "Git status is read-only"

  - action: shell_exec
    command: "git diff*"
    decision: allow
    reason: "Git diff is read-only"

  # Known safe API endpoints
  - action: network
    domain: "api.openai.com"
    decision: allow
    reason: "LLM API calls are expected"

  # ── ALWAYS DENIED (never permitted) ────────────────────────

  # Credential files
  - action: file_read
    path: "**/.env"
    decision: deny
    reason: "Credential files are never accessible"

  - action: file_read
    path: "**/.env.*"
    decision: deny
    reason: "Environment variants are never accessible"

  - action: file_read
    path: "~/.ssh/**"
    decision: deny
    reason: "SSH keys are never accessible"

  - action: file_read
    path: "~/.aws/**"
    decision: deny
    reason: "Cloud credentials are never accessible"

  # Destructive commands
  - action: shell_exec
    command: "rm -rf*"
    decision: deny
    reason: "Recursive deletion is never allowed"

  - action: shell_exec
    command: "sudo*"
    decision: deny
    reason: "Privilege escalation is never allowed"

  - action: shell_exec
    command: "chmod 777*"
    decision: deny
    reason: "Permissive chmod is never allowed"

  # ── REQUIRE APPROVAL (legitimate but risky) ────────────────

  # Source code modifications
  - action: file_write
    path: "./src/**"
    decision: require_approval
    reason: "Source changes need human review"

  # Test modifications
  - action: file_write
    path: "./tests/**"
    decision: require_approval
    reason: "Test changes need human review"

  # Config file modifications
  - action: file_write
    path: "./*.json"
    decision: require_approval
    reason: "Config changes need human review"

  - action: file_write
    path: "./*.yaml"
    decision: require_approval
    reason: "Config changes need human review"

  # Git operations that modify the repository
  - action: shell_exec
    command: "git commit*"
    decision: require_approval
    reason: "Commits need human review"

  - action: shell_exec
    command: "git push*"
    decision: require_approval
    reason: "Pushes need human review"

  - action: shell_exec
    command: "git merge*"
    decision: require_approval
    reason: "Merges need human review"

  # Package installation
  - action: shell_exec
    command: "npm install*"
    decision: require_approval
    reason: "Package installs need human review"

  - action: shell_exec
    command: "pip install*"
    decision: require_approval
    reason: "Package installs need human review"

  # Network requests to unknown domains
  - action: network
    domain: "*"
    decision: require_approval
    reason: "Outbound requests to unknown domains need review"

Step 4: Understand the Approval Workflow

When the agent triggers a require_approval rule, the following sequence occurs:

  1. Agent attempts an action — e.g., file_write to ./src/utils.ts
  2. SafeClaw matches the rule — The require_approval rule for ./src/** writes fires
  3. Action is suspended — The agent receives a pending status and cannot proceed
  4. Notification appears in the dashboard — The pending action is visible at safeclaw.onrender.com with full details: action type, path, agent name, timestamp, and the rule that triggered the hold
  5. Human reviews and decides — The reviewer clicks Approve or Deny
  6. Decision is returned — If approved, the action executes. If denied, the agent receives a denial
  7. Decision is logged — The audit trail records the action, the decision, the reviewer, and the timestamp in the SHA-256 hash chain

The entire approval decision is part of the tamper-proof audit trail. This provides compliance evidence that a human reviewed and authorized the action.
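
From the agent's point of view, this sequence amounts to: submit the action, receive a pending status, and wait until a reviewer decides. The standalone TypeScript sketch below shows that shape. The endpoint URL, payload fields, and polling interval are hypothetical placeholders, not SafeClaw's real API; they exist only to make the suspend-and-wait behavior concrete.

// Hypothetical shape of the suspend-and-wait flow described above.
// The URL and response fields are illustrative placeholders.
type ReviewStatus = "pending" | "approved" | "denied";

interface PendingAction {
  id: string;
  status: ReviewStatus;
}

const REVIEW_API = "https://example.invalid/api/actions"; // placeholder endpoint

async function submitForReview(action: { type: string; target: string }): Promise<PendingAction> {
  const res = await fetch(REVIEW_API, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(action),
  });
  return (await res.json()) as PendingAction;
}

async function waitForDecision(id: string, pollMs = 5_000): Promise<ReviewStatus> {
  // The agent is effectively suspended in this loop until a human responds.
  for (;;) {
    const res = await fetch(`${REVIEW_API}/${id}`);
    const { status } = (await res.json()) as PendingAction;
    if (status !== "pending") return status;
    await new Promise((resolve) => setTimeout(resolve, pollMs));
  }
}

// Usage: a file_write to ./src/utils.ts is held until a reviewer decides.
async function reviewExample(): Promise<void> {
  const pending = await submitForReview({ type: "file_write", target: "./src/utils.ts" });
  const decision = await waitForDecision(pending.id);
  console.log(decision === "approved" ? "executing action" : "action denied; agent continues without it");
}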

Step 5: Test with Simulation Mode

SAFECLAW_MODE=simulation npx @authensor/safeclaw

In simulation mode, require_approval rules are logged as "WOULD REQUIRE APPROVAL" without actually suspending the agent. This lets you verify which actions trigger approval requests before you deploy the workflow. Review the logs to confirm that only the intended high-risk actions would trigger approval requests and that routine actions are not unnecessarily held.
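
One way to picture the difference between the two modes: in simulation a matched require_approval rule is only logged and the agent keeps going, while in enforce mode the action is held. The sketch below is a standalone illustration of that branching based on this step's description; it is not SafeClaw's code.

// Illustrative only: how a matched require_approval rule behaves per mode.
type Mode = "simulation" | "enforce";

async function handleApprovalRule(
  mode: Mode,
  actionLabel: string,
  waitForHuman: () => Promise<boolean>,
): Promise<boolean> {
  if (mode === "simulation") {
    // Logged, not suspended: the agent proceeds so you can audit the policy first.
    console.log(`WOULD REQUIRE APPROVAL: ${actionLabel}`);
    return true;
  }
  // Enforce mode: the action is held until a reviewer approves or denies it.
  return waitForHuman();
}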

Step 6: Switch to Enforce Mode

SAFECLAW_MODE=enforce npx @authensor/safeclaw

Now require_approval rules actually suspend agent execution and wait for human response. Policy evaluation remains sub-millisecond — the only latency introduced is the time the human takes to respond.

Balancing Autonomy and Safety

Excessive approval requirements defeat the purpose of an AI agent. If every action requires approval, you have a manual tool, not an autonomous assistant. The goal is to apply approval only at decision points where human judgment adds value.

Start with more approval requirements than you think you need. As you build confidence in the agent's behavior through audit log review, promote stable action patterns from require_approval to allow.
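
A lightweight way to act on this is to review the audit trail periodically and find the require_approval rules that reviewers approve nearly every time; those are candidates for promotion to allow. The sketch below shows the idea on a deliberately simplified, hypothetical record shape; the field names are not SafeClaw's audit log format.

// Hypothetical, simplified audit record for illustration only.
interface AuditRecord {
  rule: string;                       // e.g. "file_write ./tests/**"
  decision: "approved" | "denied";
}

// Rules reviewed often enough and approved at or above the threshold are
// candidates to promote from require_approval to allow.
function promotionCandidates(records: AuditRecord[], minReviews = 20, threshold = 0.95): string[] {
  const tally = new Map<string, { approved: number; total: number }>();
  for (const r of records) {
    const entry = tally.get(r.rule) ?? { approved: 0, total: 0 };
    entry.total += 1;
    if (r.decision === "approved") entry.approved += 1;
    tally.set(r.rule, entry);
  }
  return [...tally.entries()]
    .filter(([, t]) => t.total >= minReviews && t.approved / t.total >= threshold)
    .map(([rule]) => rule);
}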

Try SafeClaw

Action-level gating for AI agents. Set it up in your browser in 60 seconds.

$ npx @authensor/safeclaw