2025-12-23 · Authensor

How to Add Safety Controls to AutoGen Agents

SafeClaw by Authensor enforces deny-by-default policies on every tool call and code execution in Microsoft AutoGen's multi-agent conversations, intercepting actions before they reach your system. AutoGen agents communicate and delegate autonomously — SafeClaw ensures that every executable action passes through your YAML policy, regardless of which agent initiates it.

How AutoGen Tool Execution Works

AutoGen uses a conversational agent architecture in which agents exchange messages and can call functions or execute code. The key execution points are: an AssistantAgent proposing function calls, a UserProxyAgent executing code blocks, and registered functions being invoked via register_for_execution. Code execution is the riskiest of these: the UserProxyAgent runs model-generated code either locally, with full access to your system, or inside a Docker container.

AutoGen Agent → Function Call / Code Block → [SafeClaw Policy Check] → Execute or Deny
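Without SafeClaw, nothing sits between the proposal and the execution. A minimal sketch using pyautogen's v0.2-style decorator API (run_shell is an illustrative tool, not anything built into AutoGen): the assistant can propose the call and the user proxy runs it directly.

import subprocess

from autogen import AssistantAgent, UserProxyAgent

assistant = AssistantAgent(
    name="assistant",
    llm_config={"model": "gpt-4o", "api_key": "..."},
)
user_proxy = UserProxyAgent(name="user_proxy", human_input_mode="NEVER")

# Registered on both agents: the assistant may propose this call and
# the user proxy will run it, with no policy check in between.
@user_proxy.register_for_execution()
@assistant.register_for_llm(description="Run a shell command")
def run_shell(command: str) -> str:
    return subprocess.run(command, shell=True, capture_output=True, text=True).stdout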

Quick Start

npx @authensor/safeclaw

This creates a safeclaw.yaml in your project. SafeClaw hooks into both of AutoGen's execution interfaces: function registration and code execution.

Step 1: Define AutoGen-Specific Policies

# safeclaw.yaml
version: 1
default: deny

policies:
  - name: "autogen-functions"
    description: "Control registered function calls"
    actions:
      - tool: "search_web"
        effect: allow
      - tool: "query_database"
        effect: allow
        constraints:
          operation: "SELECT"
      - tool: "send_email"
        effect: deny
      - tool: "write_file"
        effect: allow
        constraints:
          path_pattern: "output/**"

  - name: "autogen-code-execution"
    description: "Control code execution safety"
    actions:
      - tool: "code_execution"
        effect: allow
        constraints:
          language: "python"
          blocked_imports: ["os", "subprocess", "shutil", "socket", "ctypes"]
          timeout_ms: 30000
          sandbox: true
      - tool: "code_execution"
        effect: deny
        constraints:
          language: "bash|shell"

  - name: "autogen-file-access"
    description: "Restrict file operations in code"
    actions:
      - tool: "file_write"
        effect: allow
        constraints:
          path_pattern: "workspace/output/**"
      - tool: "file_read"
        effect: allow
        constraints:
          path_pattern: "workspace/data/**"
      - tool: "file_delete"
        effect: deny
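To sanity-check these rules before wiring them into agents, you can call the policy engine directly. A minimal sketch, assuming constraints are matched against the call's keyword arguments (check SafeClaw's policy docs for the exact matching semantics):

from safeclaw import SafeClaw

safeclaw = SafeClaw("./safeclaw.yaml")

print(safeclaw.evaluate("query_database", {"operation": "SELECT"}).allowed)  # True
print(safeclaw.evaluate("query_database", {"operation": "DROP"}).allowed)    # False: constraint fails
print(safeclaw.evaluate("send_email", {"to": "x@example.com"}).allowed)      # False: explicit deny
print(safeclaw.evaluate("unknown_tool", {}).allowed)                         # False: default is deny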

Step 2: Integrate with AutoGen Function Calling

Wrap AutoGen's function registration with SafeClaw:

from autogen import AssistantAgent, UserProxyAgent
from safeclaw import SafeClaw

safeclaw = SafeClaw("./safeclaw.yaml")

assistant = AssistantAgent(
    name="assistant",
    llm_config={"model": "gpt-4o", "api_key": "..."},
)

user_proxy = UserProxyAgent(
    name="user_proxy",
    human_input_mode="NEVER",
    code_execution_config={"use_docker": True},
)

# Wrap function execution with SafeClaw
def safe_register(func, name, description):
    def gated_func(**kwargs):
        # Evaluate the call against policy before it reaches the real function
        decision = safeclaw.evaluate(name, kwargs)
        if not decision.allowed:
            return f"Action denied: {decision.reason}"
        return func(**kwargs)

    # Expose the gated wrapper to both the proposing and the executing agent
    assistant.register_for_llm(name=name, description=description)(gated_func)
    user_proxy.register_for_execution(name=name)(gated_func)
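For completeness, hypothetical stand-ins for the two tools registered below; substitute your real search and database code:

def search_web(query: str) -> str:
    # Placeholder: call your search API here.
    return f"Results for {query!r}"

def query_db(operation: str, table: str) -> str:
    # Placeholder: run the statement against your database. The
    # "operation" argument is what the query_database constraint checks.
    return f"{operation} on {table}: 0 rows"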

# Register tools through the safe wrapper
safe_register(search_web, "search_web", "Search the web for information")
safe_register(query_db, "query_database", "Query the SQL database")

Step 3: Gate Code Execution

AutoGen's UserProxyAgent can execute code blocks generated by the assistant. This is the highest-risk surface. SafeClaw can intercept it by overriding execute_code_blocks:

from safeclaw import SafeClaw

safeclaw = SafeClaw("./safeclaw.yaml")

class SafeUserProxy(UserProxyAgent):
    def execute_code_blocks(self, code_blocks):
        """Override to add SafeClaw gating."""
        safe_blocks = []
        for lang, code in code_blocks:
            decision = safeclaw.evaluate("code_execution", {
                "language": lang,
                "code": code,
            })
            if decision.allowed:
                safe_blocks.append((lang, code))
            else:
                print(f"Code block denied: {decision.reason}")
                safe_blocks.append((lang, f"print('Blocked: {decision.reason}')"))

        return super().execute_code_blocks(safe_blocks)

user_proxy = SafeUserProxy(
    name="user_proxy",
    human_input_mode="NEVER",
    code_execution_config={"use_docker": True},
)
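Usage is unchanged from stock AutoGen. With the policies above, a generated block that imports subprocess should be denied by the blocked_imports constraint and replaced with the print stub, so the conversation continues instead of executing the block:

user_proxy.initiate_chat(
    assistant,
    message="Read workspace/data/q4.csv and write a summary to workspace/output/summary.txt",
)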

Step 4: Multi-Agent Conversation Safety

In AutoGen group chats with multiple agents, SafeClaw evaluates each agent's actions independently:

from autogen import GroupChat, GroupChatManager

researcher = AssistantAgent(name="researcher", ...)
coder = AssistantAgent(name="coder", ...)
executor = SafeUserProxy(name="executor", ...)

group_chat = GroupChat(
    agents=[researcher, coder, executor],
    messages=[],
    max_round=20,
)

manager = GroupChatManager(groupchat=group_chat, llm_config=llm_config)

# SafeClaw evaluates every tool call and code execution,
# regardless of which agent initiates it
executor.initiate_chat(manager, message="Analyze the Q4 data and create a report")
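The group chat needs no extra wiring: function calls route through the gated wrappers from Step 2, and code blocks route through the SafeUserProxy override from Step 3. If you want different rules per agent, one option is to pass the acting agent's name as extra context in the evaluation payload, assuming SafeClaw constraints can match arbitrary keys (verify this against the policy schema):

# Hypothetical "agent" context key; check SafeClaw's policy schema
# before relying on it.
decision = safeclaw.evaluate("code_execution", {
    "language": "python",
    "code": "print('hello')",
    "agent": "coder",
})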

Step 5: Review Multi-Agent Audit Trail

npx @authensor/safeclaw audit --last 100

The hash-chained log shows the full conversation flow — which agent proposed each action, which function was called, what code was generated, and whether it was allowed or denied.


Try SafeClaw

Action-level gating for AI agents. Set it up in your browser in 60 seconds.

$ npx @authensor/safeclaw