How to Add Safety Controls to AutoGen Agents
SafeClaw by Authensor enforces deny-by-default policies on every tool call and code execution in Microsoft AutoGen's multi-agent conversations, intercepting actions before they reach your system. AutoGen agents communicate and delegate autonomously — SafeClaw ensures that every executable action passes through your YAML policy, regardless of which agent initiates it.
How AutoGen Tool Execution Works
AutoGen uses a conversational agent architecture where agents exchange messages and can call functions or execute code. The key execution points are: AssistantAgent proposing function calls, UserProxyAgent executing code blocks, and registered functions being invoked via register_for_execution. AutoGen's code execution is particularly risky because it runs generated Python code in a local or Docker environment with full system access.
AutoGen Agent → Function Call / Code Block → [SafeClaw Policy Check] → Execute or Deny
Quick Start
npx @authensor/safeclaw
Creates a safeclaw.yaml in your project. SafeClaw maps to AutoGen's function registration and code execution interfaces.
Step 1: Define AutoGen-Specific Policies
# safeclaw.yaml
version: 1
default: deny
policies:
- name: "autogen-functions"
description: "Control registered function calls"
actions:
- tool: "search_web"
effect: allow
- tool: "query_database"
effect: allow
constraints:
operation: "SELECT"
- tool: "send_email"
effect: deny
- tool: "write_file"
effect: allow
constraints:
path_pattern: "output/**"
- name: "autogen-code-execution"
description: "Control code execution safety"
actions:
- tool: "code_execution"
effect: allow
constraints:
language: "python"
blocked_imports: ["os", "subprocess", "shutil", "socket", "ctypes"]
timeout_ms: 30000
sandbox: true
- tool: "code_execution"
effect: deny
constraints:
language: "bash|shell"
- name: "autogen-file-access"
description: "Restrict file operations in code"
actions:
- tool: "file_write"
effect: allow
constraints:
path_pattern: "workspace/output/**"
- tool: "file_read"
effect: allow
constraints:
path_pattern: "workspace/data/**"
- tool: "file_delete"
effect: deny
Step 2: Integrate with AutoGen Function Calling
Wrap AutoGen's function registration with SafeClaw:
from autogen import AssistantAgent, UserProxyAgent
from safeclaw import SafeClaw
safeclaw = SafeClaw("./safeclaw.yaml")
assistant = AssistantAgent(
name="assistant",
llm_config={"model": "gpt-4o", "api_key": "..."},
)
user_proxy = UserProxyAgent(
name="user_proxy",
human_input_mode="NEVER",
code_execution_config={"use_docker": True},
)
Wrap function execution with SafeClaw
def safe_register(func, name, description):
def gated_func(**kwargs):
decision = safeclaw.evaluate(name, kwargs)
if not decision.allowed:
return f"Action denied: {decision.reason}"
return func(**kwargs)
assistant.register_for_llm(name=name, description=description)(gated_func)
user_proxy.register_for_execution(name=name)(gated_func)
Register tools through the safe wrapper
safe_register(search_web, "search_web", "Search the web for information")
safe_register(query_db, "query_database", "Query the SQL database")
Step 3: Gate Code Execution
AutoGen's UserProxyAgent can execute code blocks generated by the assistant. This is the highest-risk surface. SafeClaw can intercept code execution:
from safeclaw import SafeClaw
safeclaw = SafeClaw("./safeclaw.yaml")
class SafeUserProxy(UserProxyAgent):
def execute_code_blocks(self, code_blocks):
"""Override to add SafeClaw gating."""
safe_blocks = []
for lang, code in code_blocks:
decision = safeclaw.evaluate("code_execution", {
"language": lang,
"code": code,
})
if decision.allowed:
safe_blocks.append((lang, code))
else:
print(f"Code block denied: {decision.reason}")
safe_blocks.append((lang, f"print('Blocked: {decision.reason}')"))
return super().execute_code_blocks(safe_blocks)
user_proxy = SafeUserProxy(
name="user_proxy",
human_input_mode="NEVER",
code_execution_config={"use_docker": True},
)
Step 4: Multi-Agent Conversation Safety
In AutoGen group chats with multiple agents, SafeClaw evaluates each agent's actions independently:
from autogen import GroupChat, GroupChatManager
researcher = AssistantAgent(name="researcher", ...)
coder = AssistantAgent(name="coder", ...)
executor = SafeUserProxy(name="executor", ...)
group_chat = GroupChat(
agents=[researcher, coder, executor],
messages=[],
max_round=20,
)
manager = GroupChatManager(groupchat=group_chat, llm_config=llm_config)
SafeClaw evaluates every tool call and code execution
regardless of which agent initiates it
executor.initiate_chat(manager, message="Analyze the Q4 data and create a report")
Step 5: Review Multi-Agent Audit Trail
npx @authensor/safeclaw audit --last 100
The hash-chained log shows the full conversation flow — which agent proposed each action, which function was called, what code was generated, and whether it was allowed or denied.
Why SafeClaw
- 446 tests covering policy evaluation, edge cases, and audit integrity
- Deny-by-default — critical for AutoGen's autonomous code execution
- Sub-millisecond evaluation — no delay in multi-agent conversation flow
- Hash-chained audit log — tamper-evident trace of every agent action
- Works with Claude AND OpenAI — supports any LLM backend AutoGen connects to
Related Pages
- How to Secure CrewAI Multi-Agent Systems
- How to Add Safety Gating to LangChain Agents
- How to Secure Microsoft Semantic Kernel Agents
- How to Secure Your OpenAI GPT Agent
Try SafeClaw
Action-level gating for AI agents. Set it up in your browser in 60 seconds.
$ npx @authensor/safeclaw