How to Add Safety Gating to OpenAI Assistants API
SafeClaw by Authensor enforces deny-by-default policies on every function call and tool action in the OpenAI Assistants API, intercepting runs in the requires_action state before your code executes the requested function calls. The Assistants API manages conversation state, threads, and run execution server-side, but your application still executes function calls, and SafeClaw gates that critical handoff.
How OpenAI Assistants Tool Execution Works
The Assistants API uses a run-based model: you create an assistant with tools (function definitions, code interpreter, file search), add messages to a thread, and create a run. When the assistant needs to call a function, the run enters a requires_action status with a list of tool_calls. Your application executes those function calls and submits the outputs back. Code interpreter and file search run server-side, but function calls are client-side — and that's where SafeClaw intercepts.
Assistant Run → requires_action → tool_calls[] → [SafeClaw Policy Check] → Submit outputs or Deny
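To see where that interception point arises, here is a minimal sketch of the run lifecycle using the openai Node SDK (v4 beta namespace). The get_customer function definition, the model, and the instructions are illustrative placeholders, not part of SafeClaw:

// Sketch of the Assistants run lifecycle that SafeClaw hooks into.
// The function definition below is a placeholder; use your own schema.
import OpenAI from "openai";

const openai = new OpenAI();

const assistant = await openai.beta.assistants.create({
  model: "gpt-4o",
  instructions: "You are a support assistant.",
  tools: [
    {
      type: "function",
      function: {
        name: "get_customer",
        description: "Look up a customer record by ID",
        parameters: {
          type: "object",
          properties: { customer_id: { type: "string" } },
          required: ["customer_id"],
        },
      },
    },
  ],
});

const thread = await openai.beta.threads.create();
await openai.beta.threads.messages.create(thread.id, {
  role: "user",
  content: "Show me customer 42's details.",
});

// createAndPoll resolves once the run reaches a terminal or actionable status;
// if the assistant decided to call get_customer, status is "requires_action".
const run = await openai.beta.threads.runs.createAndPoll(thread.id, {
  assistant_id: assistant.id,
});
// run.required_action.submit_tool_outputs.tool_calls is what SafeClaw evaluates (Step 2).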
Quick Start
npx @authensor/safeclaw
Creates a safeclaw.yaml in your project. SafeClaw maps the Assistants API's function names directly to policy rules.
Step 1: Define Policies for Assistant Functions
# safeclaw.yaml
version: 1
default: deny
policies:
  - name: "assistant-data-functions"
    description: "Control data retrieval functions"
    actions:
      - tool: "get_customer"
        effect: allow
      - tool: "search_orders"
        effect: allow
      - tool: "get_analytics"
        effect: allow
        constraints:
          date_range_max_days: 90
  - name: "assistant-action-functions"
    description: "Control state-changing functions"
    actions:
      - tool: "create_ticket"
        effect: allow
      - tool: "update_order_status"
        effect: allow
        constraints:
          allowed_statuses: "processing|shipped|delivered"
      - tool: "refund_order"
        effect: allow
        constraints:
          max_amount: 500
      - tool: "delete_customer"
        effect: deny
      - tool: "modify_pricing"
        effect: deny
  - name: "assistant-builtin-tools"
    description: "Control built-in assistant tools"
    actions:
      - tool: "code_interpreter"
        effect: allow
      - tool: "file_search"
        effect: allow
Step 2: Gate the requires_action Handler
import OpenAI from "openai";
import { SafeClaw } from "@authensor/safeclaw";
const openai = new OpenAI();
const safeclaw = new SafeClaw("./safeclaw.yaml");
async function handleRun(threadId: string, runId: string) {
  let run = await openai.beta.threads.runs.retrieve(threadId, runId);

  while (run.status === "requires_action") {
    const toolCalls = run.required_action!.submit_tool_outputs.tool_calls;

    const toolOutputs = await Promise.all(
      toolCalls.map(async (toolCall) => {
        const args = JSON.parse(toolCall.function.arguments);
        // Check each function call against safeclaw.yaml before executing it.
        const decision = safeclaw.evaluate(toolCall.function.name, args);

        let output: string;
        if (decision.allowed) {
          output = JSON.stringify(await executeTool(toolCall.function.name, args));
        } else {
          // Return the denial as the tool output so the assistant can recover gracefully.
          output = JSON.stringify({
            error: `Action denied by SafeClaw: ${decision.reason}`,
          });
        }
        return { tool_call_id: toolCall.id, output };
      })
    );

    run = await openai.beta.threads.runs.submitToolOutputsAndPoll(
      threadId,
      runId,
      { tool_outputs: toolOutputs }
    );
  }

  return run;
}
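The handler above assumes an executeTool helper that dispatches allowed calls to your own backend. A minimal sketch follows; the stub return values are placeholders, so wire these cases to your real data layer, ticketing system, and payments API:

// Hypothetical dispatcher for allowed function calls. The stub return values
// stand in for calls into your own services; names mirror the Step 1 policies.
async function executeTool(name: string, args: Record<string, unknown>): Promise<unknown> {
  switch (name) {
    case "get_customer":
      return { id: args.customer_id, name: "Ada Lovelace" };        // replace with your data layer
    case "create_ticket":
      return { ticket_id: "tkt_123", status: "open" };              // replace with your ticketing system
    case "refund_order":
      return { order_id: args.order_id, refunded: args.amount };    // replace with your payments API
    default:
      // Unreachable in practice: SafeClaw's deny-by-default policy rejects
      // unlisted function names before execution reaches this dispatcher.
      throw new Error(`No executor registered for ${name}`);
  }
}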
Step 3: Handle Streaming Runs
For streaming assistant runs, apply the same policy check inside the stream's requires_action event handler:
const stream = openai.beta.threads.runs.stream(threadId, {
  assistant_id: assistantId,
});

stream.on("event", async (event) => {
  if (
    event.event === "thread.run.requires_action" &&
    event.data.required_action?.type === "submit_tool_outputs"
  ) {
    const toolCalls = event.data.required_action.submit_tool_outputs.tool_calls;
    const outputs = [];

    for (const call of toolCalls) {
      const args = JSON.parse(call.function.arguments);
      const decision = safeclaw.evaluate(call.function.name, args);
      outputs.push({
        tool_call_id: call.id,
        output: decision.allowed
          ? JSON.stringify(await executeTool(call.function.name, args))
          : JSON.stringify({ error: `Denied: ${decision.reason}` }),
      });
    }

    await openai.beta.threads.runs.submitToolOutputs(
      event.data.thread_id,
      event.data.id,
      { tool_outputs: outputs }
    );
  }
});
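submitToolOutputs resumes the run but does not continue the original stream. If your openai SDK version provides submitToolOutputsStream (available in recent v4 releases), you can submit the gated outputs and keep streaming the assistant's reply inside the same handler:

// Inside the requires_action branch above, as an alternative to submitToolOutputs.
// submitToolOutputsStream is assumed to exist in your SDK version; verify before relying on it.
const continuation = openai.beta.threads.runs.submitToolOutputsStream(
  event.data.thread_id,
  event.data.id,
  { tool_outputs: outputs }
);
continuation.on("textDelta", (delta) => process.stdout.write(delta.value ?? ""));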
Step 4: Per-Assistant Policies
Different assistants serve different purposes. Use the assistant_id field to scope a policy to a single assistant; a sketch of passing the assistant ID at evaluation time follows the policy file below.
policies:
  - name: "support-assistant"
    assistant_id: "asst_abc123"
    actions:
      - tool: "get_customer"
        effect: allow
      - tool: "create_ticket"
        effect: allow
      - tool: "refund_order"
        effect: allow
        constraints:
          max_amount: 100
  - name: "admin-assistant"
    assistant_id: "asst_def456"
    actions:
      - tool: "get_customer"
        effect: allow
      - tool: "refund_order"
        effect: allow
        constraints:
          max_amount: 5000
      - tool: "update_order_status"
        effect: allow
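How the evaluator learns which assistant is running depends on your SafeClaw version. As a hypothetical sketch, assuming evaluate accepts an optional context argument (not shown in the Step 2 example), the call inside the requires_action handler would look like this:

// Hypothetical: the third argument is an assumption, not a documented SafeClaw
// signature; check your SafeClaw version's API before using it.
// run.assistant_id comes from the Run object retrieved in the Step 2 handler.
const decision = safeclaw.evaluate(toolCall.function.name, args, {
  assistant_id: run.assistant_id,
});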
Step 5: Audit Assistant Runs
npx @authensor/safeclaw audit --last 100 --filter assistant=asst_abc123
Every function call across all assistant runs is logged with the thread ID, run ID, function name, arguments, and decision — providing full traceability for customer-facing AI assistants.
Why SafeClaw
- 446 tests covering policy evaluation, edge cases, and audit integrity
- Deny-by-default — unlisted functions return denial to the assistant
- Sub-millisecond evaluation — no delay in the run polling loop
- Hash-chained audit log — tamper-evident records across all assistant threads
- Works with Claude AND OpenAI — same policy file, applicable across providers
Related Pages
- How to Secure Your OpenAI GPT Agent
- How to Secure Microsoft Semantic Kernel Agents
- How to Secure Vercel AI SDK Tool Calls
- How to Add Safety Gating to LangChain Agents
Try SafeClaw
Action-level gating for AI agents. Set it up in your browser in 60 seconds.
$ npx @authensor/safeclaw