How to Secure Your OpenAI GPT Agent
SafeClaw by Authensor gates every OpenAI tool_calls response before your application executes it, applying deny-by-default YAML policies that block unauthorized actions. Whether you're using GPT-4o, GPT-4.1, or o3, SafeClaw evaluates each function call against your policy file in sub-millisecond time and logs every decision to a hash-chained audit trail.
How OpenAI Tool Calling Works
OpenAI's Chat Completions API uses a tools parameter where you define available functions. When the model decides to call a function, it returns a tool_calls array in the assistant message, each containing a function name and JSON arguments. Your application then executes those functions and sends results back. The vulnerability is clear: between the model's decision and your execution, nothing validates whether that call should be allowed.
GPT Response → tool_calls[] → [SafeClaw Policy Check] → Execute or Deny
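For reference, the assistant message your application receives looks like this; the id and the SQL argument are illustrative, but the shape is what the Chat Completions API returns:

{
  "role": "assistant",
  "content": null,
  "tool_calls": [
    {
      "id": "call_abc123",
      "type": "function",
      "function": {
        "name": "query_database",
        "arguments": "{\"sql\": \"SELECT * FROM users WHERE created_at > NOW() - INTERVAL '7 day'\"}"
      }
    }
  ]
}

Nothing in this structure says whether running that SQL is acceptable; that decision is the gap SafeClaw fills.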
Quick Start
npx @authensor/safeclaw
This generates a safeclaw.yaml in your project root. SafeClaw's policies map directly onto OpenAI's function call structure: each policy's tool name matches the function name, and constraints apply to the parsed arguments object.
Step 1: Define Policies for OpenAI Functions
# safeclaw.yaml
version: 1
default: deny
policies:
  - name: "openai-database-access"
    description: "Control database query functions"
    actions:
      - tool: "query_database"
        effect: allow
        constraints:
          operation: "SELECT"
      - tool: "query_database"
        effect: deny
        constraints:
          operation: "DROP|DELETE|TRUNCATE"
  - name: "openai-file-policy"
    description: "Restrict file operations"
    actions:
      - tool: "write_file"
        effect: allow
        constraints:
          path_pattern: "output/**"
      - tool: "read_file"
        effect: allow
        constraints:
          path_pattern: "data/**"
  - name: "openai-api-calls"
    description: "Control external API access"
    actions:
      - tool: "call_api"
        effect: allow
        constraints:
          url_pattern: "https://api.internal.company.com/**"
      - tool: "call_api"
        effect: deny
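Because the default is deny, any function not explicitly allowed is blocked, including functions the model hallucinates. A quick sanity check against this file, using the evaluate API shown in Step 2 and assuming constraint keys such as operation are matched against fields of the parsed arguments object:

import { SafeClaw } from "@authensor/safeclaw";

const safeclaw = new SafeClaw("./safeclaw.yaml");

// Matches the allow rule in "openai-database-access"
// (assumes the function exposes an `operation` argument for the constraint to check).
console.log(safeclaw.evaluate("query_database", { operation: "SELECT", sql: "SELECT * FROM users" }));

// "send_email" appears in no policy, so deny-by-default blocks it.
console.log(safeclaw.evaluate("send_email", { to: "someone@example.com" }));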
Step 2: Integrate with the OpenAI SDK
import OpenAI from "openai";
import { SafeClaw } from "@authensor/safeclaw";

const openai = new OpenAI();
const safeclaw = new SafeClaw("./safeclaw.yaml");

const response = await openai.chat.completions.create({
  model: "gpt-4o",
  tools: [
    {
      type: "function",
      function: {
        name: "query_database",
        description: "Run a SQL query",
        parameters: { type: "object", properties: { sql: { type: "string" } } },
      },
    },
  ],
  messages: [{ role: "user", content: "Show me all users who signed up this week" }],
});

const message = response.choices[0].message;

if (message.tool_calls) {
  for (const toolCall of message.tool_calls) {
    const args = JSON.parse(toolCall.function.arguments);
    // Gate the call before anything executes.
    const decision = safeclaw.evaluate(toolCall.function.name, args);
    if (decision.allowed) {
      // executeTool is your own dispatcher; a sketch follows below.
      const result = await executeTool(toolCall.function.name, args);
      // Push tool result back to messages
    } else {
      console.log(`Blocked: ${toolCall.function.name} — ${decision.reason}`);
    }
  }
}
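executeTool is not part of SafeClaw or the OpenAI SDK; it is your own dispatcher, and SafeClaw only decides whether it may run. A minimal illustrative sketch, with the handler body stubbed out:

// Hypothetical dispatcher: routes allowed function calls to your own implementations.
async function executeTool(name: string, args: Record<string, unknown>) {
  switch (name) {
    case "query_database":
      // Hand the SQL to your real database client here; stubbed for illustration.
      return { rows: [], query: String(args.sql) };
    default:
      throw new Error(`No handler registered for tool "${name}"`);
  }
}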
Step 3: Handle Parallel Tool Calls
OpenAI frequently returns multiple tool_calls in a single response. SafeClaw evaluates each independently:
const toolResults = await Promise.all(
  message.tool_calls.map(async (toolCall) => {
    const args = JSON.parse(toolCall.function.arguments);
    const decision = safeclaw.evaluate(toolCall.function.name, args);
    return {
      tool_call_id: toolCall.id,
      role: "tool" as const,
      content: decision.allowed
        ? JSON.stringify(await executeTool(toolCall.function.name, args))
        : JSON.stringify({ error: `Denied: ${decision.reason}` }),
    };
  })
);
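Every tool_call_id must get a matching role: "tool" message, which is why a denied call still returns a JSON error instead of being skipped; the model can then explain the refusal or try another approach. A sketch of the follow-up request, assuming the messages and tools arrays from the original request are still in scope:

// Return the assistant message plus every tool result (allowed or denied) to the model.
const followUp = await openai.chat.completions.create({
  model: "gpt-4o",
  messages: [...messages, message, ...toolResults],
  tools,
});
console.log(followUp.choices[0].message.content);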
Step 4: Structured Output Safety
OpenAI's structured outputs (strict function schemas, or response_format for message content) constrain the shape of what the model emits, but they do not decide whether a call should run. SafeClaw adds that check: it can validate function arguments against required fields and allowed values, catching malformed or injection-style arguments before execution.
policies:
  - name: "argument-validation"
    actions:
      - tool: "update_user"
        effect: allow
        constraints:
          required_fields: ["user_id", "field", "value"]
          field_whitelist: ["name", "email", "preferences"]
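Under this policy, a call that targets a field outside the whitelist is rejected before it reaches your data. For example, again assuming the evaluate API from Step 2 and that field_whitelist is matched against the field argument:

// Allowed: every required field is present and "email" is whitelisted.
safeclaw.evaluate("update_user", { user_id: 42, field: "email", value: "new@example.com" });

// Denied: "role" is not in field_whitelist, so this privilege-escalation attempt is blocked.
safeclaw.evaluate("update_user", { user_id: 42, field: "role", value: "admin" });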
Why SafeClaw
- 446 tests covering policy evaluation, edge cases, and audit integrity
- Deny-by-default — if a function isn't explicitly allowed, it's blocked
- Sub-millisecond evaluation — no perceptible latency in your OpenAI tool loop
- Hash-chained audit log — tamper-evident record of every function call evaluated (the sketch after this list shows the general idea)
- Works with Claude AND OpenAI — same policy file, swap LLM providers freely
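Hash chaining is what makes the log tamper-evident: each entry commits to a hash of the previous entry, so rewriting any past decision invalidates every hash after it. A generic illustration of the idea, not SafeClaw's actual log format:

import { createHash } from "node:crypto";

// Generic hash chain: each entry stores the previous entry's hash,
// so editing any record breaks the chain from that point on.
interface AuditEntry {
  tool: string;
  allowed: boolean;
  prevHash: string;
  hash: string;
}

function appendEntry(log: AuditEntry[], tool: string, allowed: boolean): AuditEntry[] {
  const prevHash = log.length > 0 ? log[log.length - 1].hash : "genesis";
  const hash = createHash("sha256").update(prevHash + tool + String(allowed)).digest("hex");
  return [...log, { tool, allowed, prevHash, hash }];
}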
Related Pages
- How to Add Safety Gating to OpenAI Assistants API
- How to Secure Your Claude Agent with SafeClaw
- How to Secure Vercel AI SDK Tool Calls
- How to Add Safety Gating to LangChain Agents
Try SafeClaw
Action-level gating for AI agents. Set it up in your browser in 60 seconds.
$ npx @authensor/safeclaw