How to Safely Run AutoGen Agents
To safely run AutoGen agents, add SafeClaw action-level gating. Install with npx @authensor/safeclaw and define a deny-by-default policy that controls which code blocks get executed, which files agents can access, and which network requests are permitted. AutoGen by Microsoft enables multi-agent conversations where agents generate and execute code, call functions, and interact with external systems — all driven by conversational back-and-forth between agents that runs without human input.
What AutoGen Agents Can Do (And Why That's Risky)
AutoGen's architecture is built around conversational agents that generate and execute code. The risks are specific:
- Code execution via `CodeExecutor` — AutoGen extracts code blocks from agent messages and runs them. `LocalCommandLineCodeExecutor` runs code directly on your machine with full OS access. `DockerCommandLineCodeExecutor` runs in a container but still has network access and mounted volumes.
- Multi-agent conversations without human checkpoints — `ConversableAgent` instances talk to each other in loops. An `AssistantAgent` generates code, and a `UserProxyAgent` (with `human_input_mode="NEVER"`) executes it automatically. Conversations can run for dozens of turns.
- Function registration — you register Python functions that agents can call. These functions run in your process with your credentials and filesystem access.
- Nested conversations — AutoGen supports nested chat patterns where agents spawn sub-conversations with other agents, creating execution chains that are hard to trace.
- File I/O through generated code — agents write Python code that reads and writes files. The code itself is generated at runtime, so you cannot predict at design time which file paths it will access.
- Network requests through generated code — agents frequently generate code that imports `requests` or `urllib` and makes HTTP calls to arbitrary URLs. These requests happen inside code execution, not through a dedicated network tool.
The `UserProxyAgent` auto-executes whatever code the `AssistantAgent` generates. There is no per-action policy layer between code generation and code execution; the sketch below shows this unguarded default.
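To see the gap SafeClaw closes, here is a minimal sketch of that unguarded default, in the same TypeScript style as the setup code below. `AssistantAgent`, `llm_config`, and `initiate_chat` mirror AutoGen's Python API and stand in for whatever your binding provides; none of this is SafeClaw code.

```typescript
// A minimal sketch of the UNGUARDED default described above.
// AssistantAgent, llm_config, and initiate_chat mirror AutoGen's Python
// API; treat the exact names as assumptions about your binding.
const assistant = new AssistantAgent({
  name: "coder",
  llm_config: { model: "gpt-4" },
});

const userProxy = new UserProxyAgent({
  name: "executor",
  human_input_mode: "NEVER", // never pauses for a human
  code_execution_config: {
    // extracted code blocks run directly on this machine
    executor: new LocalCommandLineCodeExecutor({ work_dir: "./workspace" }),
  },
});

// Every code block the assistant emits is executed as-is: file reads,
// shell commands, and network calls all pass through unchecked.
await userProxy.initiate_chat(assistant, {
  message: "Clean up old files in the data directory",
});
```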
Step-by-Step Setup
Step 1: Install SafeClaw
```bash
npx @authensor/safeclaw
```
Select SDK Wrapper as the integration type.
Step 2: Get Your API Key
Visit safeclaw.onrender.com. Free-tier keys renew every 7 days; no credit card required. Use the dashboard wizard to generate an initial policy.
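The Step 3 code reads the key from `process.env.SAFECLAW_API_KEY`, so export it before running your agents:

```bash
# Store the key from the dashboard in the variable the SDK expects
export SAFECLAW_API_KEY="<your-key>"
```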
Step 3: Wrap the Code Executor
SafeClaw intercepts code execution before it reaches the OS:
```typescript
import { SafeClaw } from "@authensor/safeclaw";
// LocalCommandLineCodeExecutor and UserProxyAgent come from your AutoGen
// binding; the snake_case options mirror AutoGen's Python API.

const safeclaw = new SafeClaw({
  apiKey: process.env.SAFECLAW_API_KEY,
  policy: "./safeclaw.policy.yaml",
});

// Wrap the code executor so SafeClaw evaluates each code block
const originalExecutor = new LocalCommandLineCodeExecutor({
  work_dir: "./workspace",
});

const guardedExecutor = safeclaw.wrapExecutor(originalExecutor, {
  // SafeClaw analyzes each code block for:
  // - file_read / file_write operations
  // - shell_exec commands (subprocess, os.system)
  // - network requests (requests, urllib, socket)
  analyzeCode: true,
});

const userProxy = new UserProxyAgent({
  name: "executor",
  human_input_mode: "NEVER",
  code_execution_config: {
    executor: guardedExecutor,
  },
});
```
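If you run the Docker executor mentioned earlier instead, the same wrapper applies. A sketch, assuming `DockerCommandLineCodeExecutor` accepts the same options in your binding:

```typescript
// Containers limit filesystem blast radius, but network access and mounted
// volumes still need policy gating, so wrap the Docker executor the same way.
const dockerExecutor = safeclaw.wrapExecutor(
  new DockerCommandLineCodeExecutor({ work_dir: "./workspace" }),
  { analyzeCode: true },
);
```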
Step 4: Wrap Registered Functions
```typescript
// Instead of registering the raw function, wrap it with safeclaw.guard().
// The first argument is the action type the call is evaluated as; here
// "network", since the database query leaves the process over the wire.
const guardedFunction = safeclaw.guard("network", async (params) => {
  return await runDatabaseQuery(params.query);
});

assistant.register_function({
  name: "query_database",
  func: guardedFunction,
});
```
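The string passed to `guard` names the action type the call is evaluated as, so keep it aligned with an action defined in your policy; here the database query is gated as `network` because it leaves the process over the wire. Register one guarded wrapper per function you expose to agents.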
Step 5: Define Your Policy
```yaml
version: 1
default: deny
rules:
  - action: file_read
    path: "${PROJECT_DIR}/workspace/**"
    effect: allow
  - action: file_read
    path: "${PROJECT_DIR}/data/**"
    effect: allow
  - action: file_read
    path: "**/.env"
    effect: deny
  - action: file_read
    path: "**/.*/**"
    effect: deny
  - action: file_write
    path: "${PROJECT_DIR}/workspace/**"
    effect: allow
  - action: file_write
    path: "${PROJECT_DIR}/output/**"
    effect: allow
  - action: shell_exec
    command: "python*"
    effect: allow
  - action: shell_exec
    command: "pip*"
    effect: deny
  - action: shell_exec
    command: "rm*"
    effect: deny
  - action: shell_exec
    command: "curl*"
    effect: deny
  - action: shell_exec
    command: "wget*"
    effect: deny
  - action: network
    host: "api.openai.com"
    effect: allow
  - action: network
    host: "localhost"
    effect: allow
  - action: network
    host: "*"
    effect: deny
```
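Anything that matches no rule falls through to `default: deny`. For example, `python workspace/analyze.py` is allowed by the `python*` rule, while `pip install cryptography` is denied by `pip*`; both verdicts appear in the log examples below.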
Step 6: Simulate Before Enforcing
```bash
npx @authensor/safeclaw simulate --policy safeclaw.policy.yaml
```
Run a typical AutoGen conversation. Review every verdict in the log. Tighten rules where needed.
What Gets Blocked, What Gets Through
ALLOWED — Agent writes analysis code to workspace:

```json
{ "action": "file_write", "path": "/project/workspace/analyze.py", "verdict": "ALLOW" }
```

DENIED — Generated code reads SSH keys:

```json
{ "action": "file_read", "path": "/home/user/.ssh/id_rsa", "verdict": "DENY", "reason": "path matches **/.*/** deny rule" }
```

ALLOWED — Agent runs generated Python script:

```json
{ "action": "shell_exec", "command": "python workspace/analyze.py", "verdict": "ALLOW" }
```

DENIED — Generated code installs a package:

```json
{ "action": "shell_exec", "command": "pip install cryptography", "verdict": "DENY", "reason": "pip* matches deny rule" }
```

DENIED — Generated code POSTs to external server:

```json
{ "action": "network", "host": "webhook.site", "verdict": "DENY", "reason": "host not in allowlist, default deny" }
```
Without SafeClaw vs With SafeClaw
| Scenario | Without SafeClaw | With SafeClaw |
|---|---|---|
| AssistantAgent generates code that reads `/etc/passwd` | Code executes, file contents returned to agent | Blocked — path outside allowed workspace and data directories |
| Generated code runs `os.system("rm -rf /")` | Command executes with process permissions | Blocked — `rm*` matches deny rule |
| Agent code imports `requests` and POSTs to unknown URL | HTTP request sent with whatever data the agent collected | Blocked — host not in network allowlist |
| Agent writes output file to `workspace/results.csv` | File written normally | Allowed — `workspace/**` is in write allowlist |
| Multi-turn conversation runs 30 code blocks | All 30 execute without review | Each of the 30 blocks is individually evaluated against policy |
Every evaluation is logged to SafeClaw's tamper-evident audit trail (a SHA-256 hash chain). The control plane sees only action metadata — never your code, data, or API keys. SafeClaw runs with zero third-party dependencies, evaluates in sub-millisecond time, and is validated by 446 tests under TypeScript strict mode. The client is 100% open source, MIT licensed.
Cross-References
- What is SafeClaw? — Deny-by-default action gating architecture
- How to Safely Run CrewAI Agents — Another multi-agent framework with per-agent policies
- How to Safely Run LangChain Agents — AutoGen can use LangChain tools
- How to Safely Use OpenAI Agents — OpenAI powers many AutoGen configurations
- How to Safely Run Autonomous Coding Agents — General patterns for autonomous code execution safety
Try SafeClaw
Action-level gating for AI agents. Set it up in your browser in 60 seconds.
```bash
$ npx @authensor/safeclaw
```