2025-12-29 · Authensor

How to Add Safety Gating to OpenAI Assistants API

SafeClaw by Authensor enforces deny-by-default policies on every function call and tool action in the OpenAI Assistants API, intercepting requires_action run steps before your code executes them. The Assistants API manages conversation state, threads, and run execution server-side — but your application still handles function call execution, and SafeClaw gates that critical handoff.

How OpenAI Assistants Tool Execution Works

The Assistants API uses a run-based model: you create an assistant with tools (function definitions, code interpreter, file search), add messages to a thread, and create a run. When the assistant needs to call a function, the run enters a requires_action status with a list of tool_calls. Your application executes those function calls and submits the outputs back. Code interpreter and file search run server-side, but function calls are client-side — and that's where SafeClaw intercepts.

Assistant Run → requires_action → tool_calls[] → [SafeClaw Policy Check] → Submit outputs or Deny
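The policy-check step in this flow can be sketched as a pure function. The `POLICY` map and `gate` helper below are hypothetical simplifications for illustration; real SafeClaw policies are loaded from safeclaw.yaml with `default: deny`:

```typescript
// Hypothetical simplified policy map: tool name → allow/deny.
// SafeClaw itself loads this from safeclaw.yaml.
const POLICY: Record<string, "allow" | "deny"> = {
  get_customer: "allow",
  refund_order: "allow",
};

interface ToolCall {
  id: string;
  function: { name: string; arguments: string };
}

// Deny-by-default gate: any tool not explicitly allowed is denied.
function gate(toolCalls: ToolCall[]) {
  const allowed: ToolCall[] = [];
  const denied: ToolCall[] = [];
  for (const call of toolCalls) {
    (POLICY[call.function.name] === "allow" ? allowed : denied).push(call);
  }
  return { allowed, denied };
}
```

Denied calls still get a tool output submitted back to the run (an error payload), so the run can complete instead of hanging in requires_action.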

Quick Start

npx @authensor/safeclaw

Creates a safeclaw.yaml in your project. SafeClaw maps the Assistants API's function names directly to policy rules.

Step 1: Define Policies for Assistant Functions

# safeclaw.yaml
version: 1
default: deny

policies:
  - name: "assistant-data-functions"
    description: "Control data retrieval functions"
    actions:
      - tool: "get_customer"
        effect: allow
      - tool: "search_orders"
        effect: allow
      - tool: "get_analytics"
        effect: allow
        constraints:
          date_range_max_days: 90

  - name: "assistant-action-functions"
    description: "Control state-changing functions"
    actions:
      - tool: "create_ticket"
        effect: allow
      - tool: "update_order_status"
        effect: allow
        constraints:
          allowed_statuses: "processing|shipped|delivered"
      - tool: "refund_order"
        effect: allow
        constraints:
          max_amount: 500
      - tool: "delete_customer"
        effect: deny
      - tool: "modify_pricing"
        effect: deny

  - name: "assistant-builtin-tools"
    description: "Control built-in assistant tools"
    actions:
      - tool: "code_interpreter"
        effect: allow
      - tool: "file_search"
        effect: allow
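Constraints narrow an allow rule by inspecting the call's arguments. A minimal sketch of how a max_amount constraint might be evaluated — the `checkMaxAmount` helper is an illustration, not part of the SafeClaw API:

```typescript
// Hypothetical constraint check: an allowed tool call is still denied
// when its "amount" argument exceeds the policy's max_amount.
function checkMaxAmount(
  args: Record<string, unknown>,
  maxAmount: number
): { allowed: boolean; reason?: string } {
  const amount = Number(args.amount);
  if (Number.isNaN(amount)) {
    // Fail closed when the argument is missing or malformed.
    return { allowed: false, reason: "amount missing or not a number" };
  }
  if (amount > maxAmount) {
    return { allowed: false, reason: `amount ${amount} exceeds max_amount ${maxAmount}` };
  }
  return { allowed: true };
}
```

Under the policy above, a refund_order call with an amount of 750 would be denied, since max_amount is 500.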

Step 2: Gate the requires_action Handler

import OpenAI from "openai";
import { SafeClaw } from "@authensor/safeclaw";

const openai = new OpenAI();
const safeclaw = new SafeClaw("./safeclaw.yaml");

async function handleRun(threadId: string, runId: string) {
  let run = await openai.beta.threads.runs.retrieve(threadId, runId);

  while (run.status === "requires_action") {
    const toolCalls = run.required_action!.submit_tool_outputs.tool_calls;

    const toolOutputs = await Promise.all(
      toolCalls.map(async (toolCall) => {
        const args = JSON.parse(toolCall.function.arguments);
        const decision = safeclaw.evaluate(toolCall.function.name, args);

        let output: string;
        if (decision.allowed) {
          output = JSON.stringify(await executeTool(toolCall.function.name, args));
        } else {
          output = JSON.stringify({
            error: `Action denied by SafeClaw: ${decision.reason}`,
          });
        }

        return { tool_call_id: toolCall.id, output };
      })
    );

    run = await openai.beta.threads.runs.submitToolOutputsAndPoll(
      threadId,
      runId,
      { tool_outputs: toolOutputs }
    );
  }

  return run;
}
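The handler above assumes an executeTool helper that dispatches an allowed call to your business logic. One way to sketch it — the handler implementations here are illustrative stubs:

```typescript
// Hypothetical dispatch table mapping function names to implementations.
// Only calls that passed the SafeClaw policy check reach this point.
const handlers: Record<string, (args: any) => Promise<unknown>> = {
  get_customer: async (args) => ({ id: args.customer_id, name: "Ada" }),
  create_ticket: async (args) => ({ ticket_id: "T-1", subject: args.subject }),
};

async function executeTool(name: string, args: any): Promise<unknown> {
  const handler = handlers[name];
  if (!handler) {
    // Unknown tools should never reach here under deny-by-default,
    // but fail closed just in case.
    throw new Error(`No handler registered for tool: ${name}`);
  }
  return handler(args);
}
```

Keeping the dispatch table separate from the policy file means the policy controls *whether* a call runs, and the table controls *how* it runs.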

Step 3: Handle Streaming Runs

For streaming assistant runs, SafeClaw integrates with the event handler:

const stream = openai.beta.threads.runs.stream(threadId, {
  assistant_id: assistantId,
});

stream.on("event", async (event) => {
  if (
    event.event === "thread.run.requires_action" &&
    event.data.required_action?.type === "submit_tool_outputs"
  ) {
    const toolCalls = event.data.required_action.submit_tool_outputs.tool_calls;
    const outputs = [];

    for (const call of toolCalls) {
      const args = JSON.parse(call.function.arguments);
      const decision = safeclaw.evaluate(call.function.name, args);

      outputs.push({
        tool_call_id: call.id,
        output: decision.allowed
          ? JSON.stringify(await executeTool(call.function.name, args))
          : JSON.stringify({ error: `Denied: ${decision.reason}` }),
      });
    }

    await openai.beta.threads.runs.submitToolOutputs(
      event.data.thread_id,
      event.data.id,
      { tool_outputs: outputs }
    );
  }
});

Step 4: Per-Assistant Policies

Different assistants serve different purposes. Define policies per assistant:

policies:
  - name: "support-assistant"
    assistant_id: "asst_abc123"
    actions:
      - tool: "get_customer"
        effect: allow
      - tool: "create_ticket"
        effect: allow
      - tool: "refund_order"
        effect: allow
        constraints:
          max_amount: 100

  - name: "admin-assistant"
    assistant_id: "asst_def456"
    actions:
      - tool: "get_customer"
        effect: allow
      - tool: "refund_order"
        effect: allow
        constraints:
          max_amount: 5000
      - tool: "update_order_status"
        effect: allow
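When policies carry an assistant_id, evaluation needs the assistant context as well as the tool name. A simplified sketch of that lookup — `isAllowed` and the in-memory `policies` array are hypothetical mirrors of the YAML above, not the SafeClaw API:

```typescript
interface AssistantPolicy {
  name: string;
  assistant_id: string;
  // Tool name → optional constraints for that tool.
  allowed: Record<string, { max_amount?: number }>;
}

// Hypothetical in-memory mirror of the per-assistant policies above.
const policies: AssistantPolicy[] = [
  {
    name: "support-assistant",
    assistant_id: "asst_abc123",
    allowed: { get_customer: {}, create_ticket: {}, refund_order: { max_amount: 100 } },
  },
  {
    name: "admin-assistant",
    assistant_id: "asst_def456",
    allowed: { get_customer: {}, refund_order: { max_amount: 5000 }, update_order_status: {} },
  },
];

// Deny-by-default: no matching policy, or a tool absent from it, means deny.
function isAllowed(assistantId: string, tool: string, amount = 0): boolean {
  const policy = policies.find((p) => p.assistant_id === assistantId);
  const rule = policy?.allowed[tool];
  if (!rule) return false;
  return rule.max_amount === undefined || amount <= rule.max_amount;
}
```

The same refund_order call is thus allowed for the admin assistant but denied for the support assistant once the amount exceeds 100.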

Step 5: Audit Assistant Runs

npx @authensor/safeclaw audit --last 100 --filter assistant=asst_abc123

Every function call across all assistant runs is logged with the thread ID, run ID, function name, arguments, and decision — providing full traceability for customer-facing AI assistants.

Try SafeClaw

Action-level gating for AI agents. Set it up in your browser in 60 seconds.

$ npx @authensor/safeclaw