2025-12-23 · Authensor

How to Add Safety Controls to AutoGen Agents

SafeClaw by Authensor enforces deny-by-default policies on every tool call and code execution in Microsoft AutoGen's multi-agent conversations, intercepting actions before they reach your system. AutoGen agents communicate and delegate autonomously — SafeClaw ensures that every executable action passes through your YAML policy, regardless of which agent initiates it.

How AutoGen Tool Execution Works

AutoGen uses a conversational agent architecture in which agents exchange messages and can call functions or execute code. The key execution points are: an AssistantAgent proposing function calls, a UserProxyAgent executing code blocks, and registered functions being invoked via register_for_execution. Code execution is the riskiest of these: the UserProxyAgent runs model-generated code either locally, with full access to your system, or inside a Docker container.

AutoGen Agent → Function Call / Code Block → [SafeClaw Policy Check] → Execute or Deny
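Without SafeClaw, nothing sits between the proposal and the execution. A minimal sketch using pyautogen's v0.2-style decorator API (run_shell is an illustrative tool, not anything built into AutoGen): the assistant can propose the call and the user proxy runs it directly.

import subprocess

from autogen import AssistantAgent, UserProxyAgent

assistant = AssistantAgent(
    name="assistant",
    llm_config={"model": "gpt-4o", "api_key": "..."},
)
user_proxy = UserProxyAgent(name="user_proxy", human_input_mode="NEVER")

# Registered on both agents: the assistant may propose this call and
# the user proxy will run it, with no policy check in between.
@user_proxy.register_for_execution()
@assistant.register_for_llm(description="Run a shell command")
def run_shell(command: str) -> str:
    return subprocess.run(command, shell=True, capture_output=True, text=True).stdout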

Quick Start

npx @authensor/safeclaw

This creates a safeclaw.yaml in your project. SafeClaw hooks into both of AutoGen's execution interfaces: function registration and code execution.

Step 1: Define AutoGen-Specific Policies

# safeclaw.yaml
version: 1
default: deny

policies:
  - name: "autogen-functions"
    description: "Control registered function calls"
    actions:
      - tool: "search_web"
        effect: allow
      - tool: "query_database"
        effect: allow
        constraints:
          operation: "SELECT"
      - tool: "send_email"
        effect: deny
      - tool: "write_file"
        effect: allow
        constraints:
          path_pattern: "output/**"

  - name: "autogen-code-execution"
    description: "Control code execution safety"
    actions:
      - tool: "code_execution"
        effect: allow
        constraints:
          language: "python"
          blocked_imports: ["os", "subprocess", "shutil", "socket", "ctypes"]
          timeout_ms: 30000
          sandbox: true
      - tool: "code_execution"
        effect: deny
        constraints:
          language: "bash|shell"

  - name: "autogen-file-access"
    description: "Restrict file operations in code"
    actions:
      - tool: "file_write"
        effect: allow
        constraints:
          path_pattern: "workspace/output/**"
      - tool: "file_read"
        effect: allow
        constraints:
          path_pattern: "workspace/data/**"
      - tool: "file_delete"
        effect: deny
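To sanity-check these rules before wiring them into agents, you can call the policy engine directly. A minimal sketch, assuming constraints are matched against the call's keyword arguments (check SafeClaw's policy docs for the exact matching semantics):

from safeclaw import SafeClaw

safeclaw = SafeClaw("./safeclaw.yaml")

print(safeclaw.evaluate("query_database", {"operation": "SELECT"}).allowed)  # True
print(safeclaw.evaluate("query_database", {"operation": "DROP"}).allowed)    # False: constraint fails
print(safeclaw.evaluate("send_email", {"to": "x@example.com"}).allowed)      # False: explicit deny
print(safeclaw.evaluate("unknown_tool", {}).allowed)                         # False: default is deny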

Step 2: Integrate with AutoGen Function Calling

Wrap AutoGen's function registration with SafeClaw:

from autogen import AssistantAgent, UserProxyAgent
from safeclaw import SafeClaw

safeclaw = SafeClaw("./safeclaw.yaml")

assistant = AssistantAgent(
    name="assistant",
    llm_config={"model": "gpt-4o", "api_key": "..."},
)

user_proxy = UserProxyAgent(
    name="user_proxy",
    human_input_mode="NEVER",
    code_execution_config={"use_docker": True},
)

# Wrap function execution with SafeClaw
def safe_register(func, name, description):
    def gated_func(**kwargs):
        # Evaluate the call against policy before it reaches the real function
        decision = safeclaw.evaluate(name, kwargs)
        if not decision.allowed:
            return f"Action denied: {decision.reason}"
        return func(**kwargs)

    # Expose the gated wrapper to both the proposing and the executing agent
    assistant.register_for_llm(name=name, description=description)(gated_func)
    user_proxy.register_for_execution(name=name)(gated_func)
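For completeness, hypothetical stand-ins for the two tools registered below; substitute your real search and database code:

def search_web(query: str) -> str:
    # Placeholder: call your search API here.
    return f"Results for {query!r}"

def query_db(operation: str, table: str) -> str:
    # Placeholder: run the statement against your database. The
    # "operation" argument is what the query_database constraint checks.
    return f"{operation} on {table}: 0 rows"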

# Register tools through the safe wrapper
safe_register(search_web, "search_web", "Search the web for information")
safe_register(query_db, "query_database", "Query the SQL database")

Step 3: Gate Code Execution

AutoGen's UserProxyAgent can execute code blocks generated by the assistant. This is the highest-risk surface. SafeClaw can intercept it by overriding execute_code_blocks:

from safeclaw import SafeClaw

safeclaw = SafeClaw("./safeclaw.yaml")

class SafeUserProxy(UserProxyAgent):
    def execute_code_blocks(self, code_blocks):
        """Override to add SafeClaw gating."""
        safe_blocks = []
        for lang, code in code_blocks:
            decision = safeclaw.evaluate("code_execution", {
                "language": lang,
                "code": code,
            })
            if decision.allowed:
                safe_blocks.append((lang, code))
            else:
                print(f"Code block denied: {decision.reason}")
                safe_blocks.append((lang, f"print('Blocked: {decision.reason}')"))

        return super().execute_code_blocks(safe_blocks)

user_proxy = SafeUserProxy(
    name="user_proxy",
    human_input_mode="NEVER",
    code_execution_config={"use_docker": True},
)
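Usage is unchanged from stock AutoGen. With the policies above, a generated block that imports subprocess should be denied by the blocked_imports constraint and replaced with the print stub, so the conversation continues instead of executing the block:

user_proxy.initiate_chat(
    assistant,
    message="Read workspace/data/q4.csv and write a summary to workspace/output/summary.txt",
)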

Step 4: Multi-Agent Conversation Safety

In AutoGen group chats with multiple agents, SafeClaw evaluates each agent's actions independently:

from autogen import GroupChat, GroupChatManager

researcher = AssistantAgent(name="researcher", ...)
coder = AssistantAgent(name="coder", ...)
executor = SafeUserProxy(name="executor", ...)

group_chat = GroupChat(
    agents=[researcher, coder, executor],
    messages=[],
    max_round=20,
)

manager = GroupChatManager(groupchat=group_chat, llm_config=llm_config)

# SafeClaw evaluates every tool call and code execution,
# regardless of which agent initiates it
executor.initiate_chat(manager, message="Analyze the Q4 data and create a report")
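The group chat needs no extra wiring: function calls route through the gated wrappers from Step 2, and code blocks route through the SafeUserProxy override from Step 3. If you want different rules per agent, one option is to pass the acting agent's name as extra context in the evaluation payload, assuming SafeClaw constraints can match arbitrary keys (verify this against the policy schema):

# Hypothetical "agent" context key; check SafeClaw's policy schema
# before relying on it.
decision = safeclaw.evaluate("code_execution", {
    "language": "python",
    "code": "print('hello')",
    "agent": "coder",
})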

Step 5: Review Multi-Agent Audit Trail

npx @authensor/safeclaw audit --last 100

The hash-chained log shows the full conversation flow — which agent proposed each action, which function was called, what code was generated, and whether it was allowed or denied.


Try SafeClaw

Action-level gating for AI agents. Set it up in your browser in 60 seconds.

$ npx @authensor/safeclaw