How to Use SafeClaw with OpenAI
OpenAI's agent ecosystem -- GPT-based agents, the Assistants API, custom GPTs with code execution -- gives AI the ability to act on your system. File writes, shell commands, API calls. These agents are productive, but productivity without constraints is how 1.5 million API keys leaked through Clawdbot in under a month.
SafeClaw provides action-level gating for AI agents, and it integrates directly with OpenAI-based agent frameworks. This guide covers setup, interception mechanics, and concrete policies for GPT agent workflows.
Why OpenAI Agents Need Gating
OpenAI agents operate with function calling and code execution capabilities. When you give an agent tools -- file system access, shell execution, HTTP requests -- it will use them. The agent optimizes for task completion. It does not optimize for security.
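To make that concrete, this is roughly what handing a model a shell tool looks like with the openai Node SDK (the tool schema, model name, and prompt here are illustrative, not part of SafeClaw):

import OpenAI from "openai";

const client = new OpenAI();

// A shell-execution tool definition. Once a tool like this exists,
// the model will call it whenever it seems useful for the task.
const tools = [
  {
    type: "function" as const,
    function: {
      name: "shell_exec",
      description: "Run a shell command in the project directory",
      parameters: {
        type: "object",
        properties: {
          command: { type: "string", description: "The command to run" },
        },
        required: ["command"],
      },
    },
  },
];

const response = await client.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "Set up a new Node.js project" }],
  tools,
});

Nothing in that definition distinguishes npm init from rm -rf. The model decides; the gate has to live elsewhere.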
Consider a GPT-based coding agent with shell access. You ask it to set up a project. It might:
- Run npm init (fine)
- Install dependencies (fine)
- Create a .env file with placeholder API keys (not fine)
- Run curl to test an endpoint (maybe fine, maybe not)
- Execute a cleanup script that deletes files outside the project (definitely not fine)

The agent sees no difference between these five actions -- each is just a tool call that advances the task. Gating is what draws the line.
Setting Up SafeClaw for OpenAI
Step 1: Install
npx @authensor/safeclaw
The browser dashboard opens. Create a free account -- no credit card, 7-day renewable keys. Select OpenAI as your agent framework in the setup wizard.
Step 2: Interception Architecture
SafeClaw sits between your OpenAI agent and the system. The flow:
- Your agent (GPT-4, GPT-4o, Assistants API, or custom agent) decides to execute an action.
- The action request is captured: what type (file_write, shell_exec, network), what target (path, command, destination).
- SafeClaw's local engine evaluates the request against your policy. Top-to-bottom rule evaluation, first match wins.
- Allow: the action executes. Deny: the action is blocked. No match: denied (deny-by-default).
- The decision is logged in the tamper-evident audit trail with SHA-256 hash chaining.
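The decision logic itself is simple enough to sketch. The following is not SafeClaw's engine -- it is a minimal TypeScript illustration of top-to-bottom, first-match-wins evaluation with deny-by-default, with the Rule and ActionRequest shapes inferred from the policy JSON later in this guide:

// Minimal sketch of SafeClaw-style rule evaluation. The Rule and
// ActionRequest shapes are inferred from this guide's policy examples,
// not taken from SafeClaw's source.
type Effect = "allow" | "deny";

interface Rule {
  action: string;        // "file_write" | "shell_exec" | "network"
  effect: Effect;
  pathPattern?: string;  // glob, for file_write rules
  command?: string;      // glob, for shell_exec rules
  destination?: string;  // host pattern, for network rules
}

interface ActionRequest {
  action: string;
  target: string;        // path, command, or destination
}

function matchesGlob(pattern: string, value: string): boolean {
  // Tiny glob: "**" matches anything, "*" matches anything except "/".
  const escaped = pattern
    .replace(/[.+^${}()|[\]\\]/g, "\\$&")
    .replace(/\*\*/g, "\u0000")
    .replace(/\*/g, "[^/]*")
    .replace(/\u0000/g, ".*");
  return new RegExp(`^${escaped}$`).test(value);
}

function evaluate(rules: Rule[], req: ActionRequest): Effect {
  for (const rule of rules) {
    if (rule.action !== req.action) continue;
    const pattern = rule.pathPattern ?? rule.command ?? rule.destination ?? "";
    if (matchesGlob(pattern, req.target)) return rule.effect; // first match wins
  }
  return "deny"; // no matching rule: deny-by-default
}

Deny-by-default means an action category you never anticipated fails closed rather than open.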
Step 3: Agent Identity Configuration
Set up an agent identity for your OpenAI agent:
{
"agentId": "openai-dev-agent",
"framework": "openai"
}
If you run multiple OpenAI agents (one for coding, one for data analysis, one for DevOps), give each a distinct agentId and create separate policies. SafeClaw routes actions to the correct policy based on agent identity.
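The three example policies below follow exactly that split -- one identity per agent, each paired with its own policy:

{ "agentId": "openai-dev-agent", "framework": "openai" }
{ "agentId": "openai-assistant", "framework": "openai" }
{ "agentId": "openai-devops", "framework": "openai" }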
Example Policies for OpenAI Agents
Policy 1: GPT Coding Agent
A standard policy for a GPT-based agent writing code in a JavaScript/TypeScript project:
{
"name": "gpt-coding-agent",
"agentId": "openai-dev-agent",
"rules": [
{
"action": "file_write",
"effect": "deny",
"pathPattern": "*/.env"
},
{
"action": "file_write",
"effect": "deny",
"pathPattern": "/.ssh/"
},
{
"action": "file_write",
"effect": "deny",
"pathPattern": "/.aws/"
},
{
"action": "file_write",
"effect": "deny",
"pathPattern": "**/.npmrc"
},
{
"action": "file_write",
"effect": "deny",
"pathPattern": "/node_modules/"
},
{
"action": "file_write",
"effect": "allow",
"pathPattern": "/home/user/project/src/**"
},
{
"action": "file_write",
"effect": "allow",
"pathPattern": "/home/user/project/tests/**"
},
{
"action": "file_write",
"effect": "allow",
"pathPattern": "/home/user/project/package.json"
},
{
"action": "shell_exec",
"effect": "deny",
"command": "rm -rf *"
},
{
"action": "shell_exec",
"effect": "deny",
"command": "curl *"
},
{
"action": "shell_exec",
"effect": "allow",
"command": "npm install"
},
{
"action": "shell_exec",
"effect": "allow",
"command": "npm test"
},
{
"action": "shell_exec",
"effect": "allow",
"command": "npm run build"
},
{
"action": "shell_exec",
"effect": "allow",
"command": "node *.js"
},
{
"action": "network",
"effect": "allow",
"destination": "registry.npmjs.org"
},
{
"action": "network",
"effect": "allow",
"destination": "api.openai.com"
}
]
}
Key decisions in this policy:
- Credentials files denied explicitly at the top, before any allow rules.
- node_modules denied. The agent should not be writing directly into node_modules; that is what npm install is for.
- curl denied. GPT agents sometimes use curl for testing. Block it and use explicit network destination rules instead.
- Network limited to npm registry and OpenAI API. No arbitrary outbound connections.
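Walking Policy 1 through the first-match-wins logic shows why rule order matters. A write to /home/user/project/.env sits inside the project directory, but the **/.env deny at the top catches it before any allow rule is consulted. Illustrative calls against the evaluate() sketch from Step 2, with rules holding Policy 1's rules array:

evaluate(rules, { action: "file_write", target: "/home/user/project/.env" });
// => "deny"  (matches **/.env before any project allow rule)

evaluate(rules, { action: "file_write", target: "/home/user/project/src/app.ts" });
// => "allow" (matches /home/user/project/src/**)

evaluate(rules, { action: "shell_exec", target: "git push" });
// => "deny"  (no rule matches; deny-by-default)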
Policy 2: Assistants API with Code Interpreter
OpenAI's Assistants API with Code Interpreter runs Python code in a sandboxed environment. When you extend it with custom tools that access your local system, SafeClaw gates those tools:
{
"name": "assistant-code-interpreter",
"agentId": "openai-assistant",
"rules": [
{
"action": "file_write",
"effect": "allow",
"pathPattern": "/home/user/workspace/output/**"
},
{
"action": "file_write",
"effect": "allow",
"pathPattern": "/home/user/workspace/data/*/.csv"
},
{
"action": "shell_exec",
"effect": "allow",
"command": "python *.py"
},
{
"action": "shell_exec",
"effect": "allow",
"command": "pip install *"
},
{
"action": "network",
"effect": "allow",
"destination": "pypi.org"
},
{
"action": "network",
"effect": "allow",
"destination": "files.pythonhosted.org"
},
{
"action": "network",
"effect": "deny",
"destination": "*"
}
]
}
This assistant can write output files and CSVs, run Python, install packages from PyPI, and nothing else. The explicit deny-all network rule at the bottom reinforces that only PyPI is allowed.
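On the integration side, a custom tool handler is the natural place for the gate. The sketch below assumes a hypothetical checkAction() client for SafeClaw's local engine -- the real integration API may differ -- and shows a save_csv tool whose writes are gated before they touch the filesystem:

import * as fs from "node:fs/promises";

// Hypothetical SafeClaw client call; the real integration API may differ.
declare function checkAction(req: {
  agentId: string;
  action: string;
  target: string;
}): Promise<"allow" | "deny">;

// Handler for a custom "save_csv" tool exposed to the assistant.
async function saveCsv(path: string, contents: string): Promise<string> {
  const decision = await checkAction({
    agentId: "openai-assistant",
    action: "file_write",
    target: path,
  });
  if (decision === "deny") {
    // Report the denial back as the tool output so the model can
    // surface it instead of silently retrying.
    return `file_write to ${path} denied by policy`;
  }
  await fs.writeFile(path, contents, "utf8");
  return `wrote ${contents.length} bytes to ${path}`;
}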
Policy 3: GPT DevOps Agent
A more permissive but still bounded policy for a GPT agent handling DevOps tasks:
{
"name": "gpt-devops-agent",
"agentId": "openai-devops",
"rules": [
{
"action": "file_write",
"effect": "deny",
"pathPattern": "/.ssh/"
},
{
"action": "file_write",
"effect": "deny",
"pathPattern": "**/.aws/credentials"
},
{
"action": "file_write",
"effect": "deny",
"pathPattern": "*/id_rsa"
},
{
"action": "file_write",
"effect": "allow",
"pathPattern": "/home/user/infra/**"
},
{
"action": "file_write",
"effect": "allow",
"pathPattern": "/home/user/project/docker-compose*.yml"
},
{
"action": "file_write",
"effect": "allow",
"pathPattern": "/home/user/project/Dockerfile"
},
{
"action": "shell_exec",
"effect": "allow",
"command": "docker build *"
},
{
"action": "shell_exec",
"effect": "allow",
"command": "docker compose up *"
},
{
"action": "shell_exec",
"effect": "allow",
"command": "docker compose down"
},
{
"action": "shell_exec",
"effect": "allow",
"command": "terraform plan"
},
{
"action": "shell_exec",
"effect": "allow",
"command": "terraform validate"
},
{
"action": "network",
"effect": "allow",
"destination": "registry.hub.docker.com"
},
{
"action": "network",
"effect": "allow",
"destination": "registry.terraform.io"
}
]
}
Notice: terraform apply is intentionally absent. The agent can plan and validate, but applying infrastructure changes requires human approval outside of SafeClaw. This is a deliberate policy choice.
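If you prefer the denial to be explicit rather than implied by deny-by-default, one rule at the top of the list documents the intent (pattern style assumed to match the command globs above):

{
"action": "shell_exec",
"effect": "deny",
"command": "terraform apply*"
}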
OpenAI-Specific Patterns to Watch
Function calling expansion. GPT-4 and later models aggressively use any tools you provide. If you define a tool that writes files, the model will find reasons to use it. SafeClaw ensures those writes stay within policy.
Multi-step tool chains. OpenAI agents often chain tool calls: write a file, then run it, then read the output. Each step in that chain is independently evaluated by SafeClaw. A denied step stops the chain.
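A minimal sketch of that short-circuit, reusing the hypothetical checkAction() gate from the tool-handler example above:

declare function checkAction(req: {
  agentId: string;
  action: string;
  target: string;
}): Promise<"allow" | "deny">;

interface Step {
  action: string;           // "file_write", "shell_exec", ...
  target: string;           // path or command
  run: () => Promise<void>; // the actual tool execution
}

// Each step is evaluated independently; the first denial stops the chain.
async function runChain(agentId: string, steps: Step[]): Promise<void> {
  for (const step of steps) {
    const decision = await checkAction({
      agentId,
      action: step.action,
      target: step.target,
    });
    if (decision === "deny") {
      console.log(`chain stopped: ${step.action} on ${step.target} denied`);
      return;
    }
    await step.run();
  }
}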
Retry behavior. When an action is denied, some OpenAI agent frameworks will retry with variations. SafeClaw denies each attempt independently. The audit trail records every attempt, giving you visibility into the agent's behavior when blocked.
Token-based output to files. GPT agents sometimes write large outputs to files instead of returning them in the response. If your file_write rules are too narrow, this pattern gets blocked. Check your audit logs for unexpected file_write denials.
Testing with Simulation Mode
Enable simulation mode in the SafeClaw dashboard before enforcing your OpenAI policy. Run your agent through a real task. Review the simulation log:
- "Would deny" entries on legitimate operations: widen your allow rules.
- "Would allow" entries on risky operations: tighten your rules or add explicit denies.
Getting Started
npx @authensor/safeclaw
Free tier. No credit card. 7-day renewable keys. SafeClaw is built on the Authensor framework -- 446 tests, TypeScript strict mode, zero dependencies. 100% open source client. Works with Claude, OpenAI, and LangChain.
Your GPT agents are powerful. Make them accountable. Visit safeclaw.onrender.com or authensor.com for full documentation.
Try SafeClaw
Action-level gating for AI agents. Set it up in your browser in 60 seconds.
$ npx @authensor/safeclaw