2025-12-15 · Authensor

How to Secure Your OpenAI GPT Agent

SafeClaw by Authensor gates every OpenAI tool_calls response before your application executes it, applying deny-by-default YAML policies that block unauthorized actions. Whether you're using GPT-4o, GPT-4.1, or o3, SafeClaw evaluates each function call against your policy file in sub-millisecond time and logs every decision to a hash-chained audit trail.

How OpenAI Tool Calling Works

OpenAI's Chat Completions API uses a tools parameter where you define available functions. When the model decides to call a function, it returns a tool_calls array in the assistant message, each containing a function name and JSON arguments. Your application then executes those functions and sends results back. The vulnerability is clear: between the model's decision and your execution, nothing validates whether that call should be allowed.

GPT Response → tool_calls[] → [SafeClaw Policy Check] → Execute or Deny
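
For reference, here is the shape of an assistant message that contains tool calls. The structure follows OpenAI's Chat Completions response format; the id and SQL below are illustrative:

// Illustrative tool-calling response (id and arguments are made up)
const assistantMessage = {
  role: "assistant",
  content: null,
  tool_calls: [
    {
      id: "call_abc123",
      type: "function",
      function: {
        name: "query_database",
        // Note: arguments is a JSON string, not a parsed object
        arguments: '{"sql":"SELECT * FROM users WHERE created_at > current_date - 7"}',
      },
    },
  ],
};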

Quick Start

npx @authensor/safeclaw

This generates a safeclaw.yaml in your project root. SafeClaw maps directly onto OpenAI's function call structure: the tool field in a policy matches the function name, and constraints apply to the parsed arguments object.

Step 1: Define Policies for OpenAI Functions

# safeclaw.yaml
version: 1
default: deny

policies:
  - name: "openai-database-access"
    description: "Control database query functions"
    actions:
      - tool: "query_database"
        effect: allow
        constraints:
          operation: "SELECT"
      - tool: "query_database"
        effect: deny
        constraints:
          operation: "DROP|DELETE|TRUNCATE"

  - name: "openai-file-policy"
    description: "Restrict file operations"
    actions:
      - tool: "write_file"
        effect: allow
        constraints:
          path_pattern: "output/**"
      - tool: "read_file"
        effect: allow
        constraints:
          path_pattern: "data/**"

  - name: "openai-api-calls"
    description: "Control external API access"
    actions:
      - tool: "call_api"
        effect: allow
        constraints:
          url_pattern: "https://api.internal.company.com/**"
      - tool: "call_api"
        effect: deny
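
With this file in place, you can sanity-check decisions directly. A minimal sketch using the evaluate call introduced in Step 2; the SQL strings are made up, and we assume SafeClaw derives the operation constraint from the sql argument, as the policy above implies:

import { SafeClaw } from "@authensor/safeclaw";

const safeclaw = new SafeClaw("./safeclaw.yaml");

// Matches the allow rule: operation is SELECT
safeclaw.evaluate("query_database", { sql: "SELECT * FROM users" });

// Matches the deny rule: DROP is explicitly blocked
safeclaw.evaluate("query_database", { sql: "DROP TABLE users" });

// No policy mentions send_email, so default: deny applies
safeclaw.evaluate("send_email", { to: "user@example.com" });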

Step 2: Integrate with OpenAI SDK

import OpenAI from "openai";
import { SafeClaw } from "@authensor/safeclaw";

const openai = new OpenAI();
const safeclaw = new SafeClaw("./safeclaw.yaml");

const response = await openai.chat.completions.create({
  model: "gpt-4o",
  tools: [
    {
      type: "function",
      function: {
        name: "query_database",
        description: "Run a SQL query",
        parameters: { type: "object", properties: { sql: { type: "string" } } },
      },
    },
  ],
  messages: [{ role: "user", content: "Show me all users who signed up this week" }],
});

const message = response.choices[0].message;

if (message.tool_calls) {
  for (const toolCall of message.tool_calls) {
    // Arguments arrive as a JSON string; parse before evaluating
    const args = JSON.parse(toolCall.function.arguments);
    const decision = safeclaw.evaluate(toolCall.function.name, args);

    if (decision.allowed) {
      const result = await executeTool(toolCall.function.name, args);
      // Push the tool result back onto messages before the next completion
    } else {
      console.log(`Blocked: ${toolCall.function.name} - ${decision.reason}`);
    }
  }
}
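
The snippet above assumes an executeTool dispatcher, which is application code rather than part of SafeClaw. A minimal sketch, with db standing in for whatever client you already use:

// Hypothetical dispatcher: route an allowed tool call to real code.
// "./db" and db.query are placeholders for your own database client.
import { db } from "./db";

async function executeTool(name: string, args: Record<string, unknown>) {
  switch (name) {
    case "query_database":
      return db.query(String(args.sql));
    default:
      // Fail loudly on unknown tools rather than silently skipping them
      throw new Error(`No executor registered for tool: ${name}`);
  }
}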

Step 3: Handle Parallel Tool Calls

OpenAI frequently returns multiple tool_calls in a single response. SafeClaw evaluates each independently:

const toolResults = await Promise.all(
  message.tool_calls.map(async (toolCall) => {
    const args = JSON.parse(toolCall.function.arguments);
    const decision = safeclaw.evaluate(toolCall.function.name, args);

    return {
      tool_call_id: toolCall.id,
      role: "tool" as const,
      content: decision.allowed
        ? JSON.stringify(await executeTool(toolCall.function.name, args))
        : JSON.stringify({ error: `Denied: ${decision.reason}` }),
    };
  })
);
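
Denied calls return an error payload the model can read, so it can adjust course instead of stalling. To continue the conversation, append the assistant message and the tool results, then request the next completion. This sketch assumes the messages array and tools definition from Step 2 are kept in variables:

messages.push(message, ...toolResults);

const followUp = await openai.chat.completions.create({
  model: "gpt-4o",
  tools, // same tool definitions as the first request
  messages,
});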

Step 4: Structured Output Safety

Even when you use OpenAI's structured outputs (response_format or strict function schemas), SafeClaw can independently validate function arguments against expected fields and types, catching malformed or injection-style arguments before execution.

policies:
  - name: "argument-validation"
    actions:
      - tool: "update_user"
        effect: allow
        constraints:
          required_fields: ["user_id", "field", "value"]
          field_whitelist: ["name", "email", "preferences"]
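
Under this policy, decisions would fall out roughly as follows. A sketch using the evaluate call from Step 2, assuming field_whitelist rejects any field value outside the list and required_fields rejects calls with missing keys:

// Allowed: all required fields present, "email" is whitelisted
safeclaw.evaluate("update_user", { user_id: "42", field: "email", value: "a@b.com" });

// Denied: "role" is not in field_whitelist
safeclaw.evaluate("update_user", { user_id: "42", field: "role", value: "admin" });

// Denied: required field "value" is missing
safeclaw.evaluate("update_user", { user_id: "42", field: "name" });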

Try SafeClaw

Action-level gating for AI agents. Set it up in your terminal in 60 seconds.

$ npx @authensor/safeclaw