2026-01-26 · Authensor

How to Safely Run AutoGen Agents

To safely run AutoGen agents, add SafeClaw action-level gating. Install it with npx @authensor/safeclaw and define a deny-by-default policy that controls which code blocks get executed, which files agents can access, and which network requests are permitted. AutoGen, Microsoft's multi-agent framework, enables conversations in which agents generate and execute code, call functions, and interact with external systems — all driven by conversational back-and-forth that can run entirely without human input.

What AutoGen Agents Can Do (And Why That's Risky)

AutoGen's architecture is built around conversational agents that generate and execute code. The risks are specific:

- Generated code can read any file the host process can read, including SSH keys and .env files.
- Shell commands run with the host process's permissions, destructive ones included.
- Generated code can open arbitrary network connections and exfiltrate whatever data the agent has collected.
- Agents can install packages and call registered functions with no per-call review.

The core problem: AutoGen's UserProxyAgent auto-executes whatever code the AssistantAgent generates. There is no per-action policy layer between code generation and code execution.
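For concreteness, this is the unguarded default that the steps below replace (written in the same TypeScript style as the examples in this guide; parameter names follow AutoGen's conventions):

// The risky default: whatever the AssistantAgent emits runs immediately.
const executor = new LocalCommandLineCodeExecutor({
  work_dir: "./workspace",
});

const userProxy = new UserProxyAgent({
  name: "executor",
  human_input_mode: "NEVER", // no human review between turns
  code_execution_config: {
    executor, // nothing inspects the code before it runs
  },
});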

Step-by-Step Setup

Step 1: Install SafeClaw

npx @authensor/safeclaw

Select SDK Wrapper as the integration type.

Step 2: Get Your API Key

Visit safeclaw.onrender.com. Free-tier keys renew every 7 days; no credit card required. Use the dashboard wizard for initial policy generation.
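The wrapper code in the next step reads the key from an environment variable, so export it first (replace the placeholder with your actual key):

export SAFECLAW_API_KEY="your-key-here"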

Step 3: Wrap the Code Executor

SafeClaw intercepts code execution before it reaches the OS:

import { SafeClaw } from "@authensor/safeclaw";
// LocalCommandLineCodeExecutor and UserProxyAgent come from your AutoGen
// installation; import paths vary by AutoGen version.

const safeclaw = new SafeClaw({
  apiKey: process.env.SAFECLAW_API_KEY,
  policy: "./safeclaw.policy.yaml",
});

// Wrap the code executor so SafeClaw evaluates each code block
const originalExecutor = new LocalCommandLineCodeExecutor({
  work_dir: "./workspace",
});

const guardedExecutor = safeclaw.wrapExecutor(originalExecutor, {
  // SafeClaw analyzes each code block for:
  // - file_read / file_write operations
  // - shell_exec commands (subprocess, os.system)
  // - network requests (requests, urllib, socket)
  analyzeCode: true,
});

const userProxy = new UserProxyAgent({
  name: "executor",
  human_input_mode: "NEVER",
  code_execution_config: {
    executor: guardedExecutor,
  },
});
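With the guarded executor wired in, the conversation itself runs exactly as before. A sketch of driving it (the AssistantAgent configuration is illustrative; adapt it to your AutoGen version and model setup):

// Illustrative: a standard two-agent loop. Every code block the assistant
// emits is evaluated against the policy before guardedExecutor runs it.
const assistant = new AssistantAgent({
  name: "coder",
  llm_config: { model: "gpt-4o" },
});

await userProxy.initiate_chat(assistant, {
  message: "Load data/sales.csv and write a summary to workspace/summary.txt",
});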

Step 4: Wrap Registered Functions

// Instead of registering the raw function, wrap it with safeclaw.guard().
// The first argument labels the action for policy checks and the audit log.
const guardedFunction = safeclaw.guard("query_database", async (params) => {
  return await runDatabaseQuery(params.query); // your existing query helper
});

assistant.register_function({
  name: "query_database",
  func: guardedFunction,
});
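How a denial surfaces to the caller depends on the wrapper's behavior; assuming a denied invocation rejects with an error (an assumption, not a documented guarantee), handling looks like this:

// Assumption: a DENY verdict causes the guarded function to reject
// rather than return silently. The denial is still written to the audit log.
try {
  const rows = await guardedFunction({ query: "SELECT count(*) FROM orders" });
  console.log(rows);
} catch (err) {
  console.error("Call blocked by policy:", err);
}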

Step 5: Define Your Policy

version: 1
default: deny

rules:
  - action: file_read
    path: "${PROJECT_DIR}/workspace/**"
    effect: allow

  - action: file_read
    path: "${PROJECT_DIR}/data/**"
    effect: allow

  - action: file_read
    path: "**/.env"        # environment files anywhere
    effect: deny

  - action: file_read
    path: "**/.*/**"       # anything under a hidden directory, e.g. ~/.ssh
    effect: deny

  - action: file_write
    path: "${PROJECT_DIR}/workspace/**"
    effect: allow

  - action: file_write
    path: "${PROJECT_DIR}/output/**"
    effect: allow

  - action: shell_exec
    command: "python*"
    effect: allow

  - action: shell_exec
    command: "pip*"
    effect: deny

  - action: shell_exec
    command: "rm*"
    effect: deny

  - action: shell_exec
    command: "curl*"
    effect: deny

  - action: shell_exec
    command: "wget*"
    effect: deny

  - action: network
    host: "api.openai.com"
    effect: allow

  - action: network
    host: "localhost"
    effect: allow

  - action: network
    host: "*"
    effect: deny
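Because default: deny is set, anything not matched by an allow rule is blocked anyway. The explicit deny entries for .env files, hidden paths, pip, rm, curl, and wget serve as documentation and as a safeguard in case a broader allow rule is added later.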

Step 6: Simulate Before Enforcing

npx @authensor/safeclaw simulate --policy safeclaw.policy.yaml

Run a typical AutoGen conversation. Review every verdict in the log. Tighten rules where needed.
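In simulate mode nothing is blocked: each action is logged with the verdict it would have received, so you can see exactly which rule every code block hits before switching enforcement on.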

What Gets Blocked, What Gets Through

ALLOWED — Agent writes analysis code to workspace:

{ "action": "file_write", "path": "/project/workspace/analyze.py", "verdict": "ALLOW" }

DENIED — Generated code reads SSH keys:

{ "action": "file_read", "path": "/home/user/.ssh/id_rsa", "verdict": "DENY", "reason": "path matches */.  deny rule" }

ALLOWED — Agent runs generated Python script:

{ "action": "shell_exec", "command": "python workspace/analyze.py", "verdict": "ALLOW" }

DENIED — Generated code installs a package:

{ "action": "shell_exec", "command": "pip install cryptography", "verdict": "DENY", "reason": "pip* matches deny rule" }

DENIED — Generated code POSTs to external server:

{ "action": "network", "host": "webhook.site", "verdict": "DENY", "reason": "host not in allowlist, default deny" }

Without SafeClaw vs With SafeClaw

| Scenario | Without SafeClaw | With SafeClaw |
|---|---|---|
| AssistantAgent generates code that reads /etc/passwd | Code executes, file contents returned to agent | Blocked — path outside allowed workspace and data directories |
| Generated code runs os.system("rm -rf /") | Command executes with process permissions | Blocked — rm* matches deny rule |
| Agent code imports requests and POSTs to unknown URL | HTTP request sent with whatever data the agent collected | Blocked — host not in network allowlist |
| Agent writes output file to workspace/results.csv | File written normally | Allowed — workspace/** is in write allowlist |
| Multi-turn conversation runs 30 code blocks | All 30 execute without review | Each of the 30 blocks is individually evaluated against policy |

Every evaluation is logged to SafeClaw's tamper-evident audit trail (SHA-256 hash chain). The control plane sees only action metadata — never your code, data, or API keys. SafeClaw runs with zero third-party dependencies, evaluates in sub-millisecond time, and is validated by 446 tests under TypeScript strict mode. The client is 100% open source, MIT licensed.
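The hash chain is what makes the trail tamper-evident: each entry commits to the one before it, so any edited or deleted record breaks every subsequent link. A minimal sketch of the verification idea, assuming each record stores the previous record's hash (the real SafeClaw log format may differ):

import { createHash } from "node:crypto";

interface AuditEntry {
  action: string;
  verdict: string;
  prevHash: string; // hash of the previous entry
  hash: string;     // SHA-256 over prevHash + this entry's payload
}

// Recompute every link; a single altered or missing entry fails verification.
function verifyChain(entries: AuditEntry[]): boolean {
  let prev = "0".repeat(64); // genesis value for the first entry
  for (const e of entries) {
    if (e.prevHash !== prev) return false;
    const expected = createHash("sha256")
      .update(e.prevHash + JSON.stringify({ action: e.action, verdict: e.verdict }))
      .digest("hex");
    if (e.hash !== expected) return false;
    prev = e.hash;
  }
  return true;
}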


Try SafeClaw

Action-level gating for AI agents. Set it up in your browser in 60 seconds.

$ npx @authensor/safeclaw