How to Add Safety Gating to OpenAI Assistants API
SafeClaw by Authensor enforces deny-by-default policies on every function call and tool action in the OpenAI Assistants API, intercepting runs in the requires_action state before your code executes the requested function calls. The Assistants API manages conversation state, threads, and run execution server-side, but your application still executes function calls, and SafeClaw gates that critical handoff.
How OpenAI Assistants Tool Execution Works
The Assistants API uses a run-based model: you create an assistant with tools (function definitions, code interpreter, file search), add messages to a thread, and create a run. When the assistant needs to call a function, the run enters a requires_action status with a list of tool_calls. Your application executes those function calls and submits the outputs back. Code interpreter and file search run server-side, but function calls are client-side — and that's where SafeClaw intercepts.
Assistant Run → requires_action → tool_calls[] → [SafeClaw Policy Check] → Submit outputs or Deny
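To see where that interception point arises, here is a minimal sketch of the run lifecycle using the openai Node SDK (v4 beta namespace). The get_customer function definition, the model, and the instructions are illustrative placeholders, not part of SafeClaw:

// Sketch of the Assistants run lifecycle that SafeClaw hooks into.
// The function definition below is a placeholder; use your own schema.
import OpenAI from "openai";

const openai = new OpenAI();

const assistant = await openai.beta.assistants.create({
  model: "gpt-4o",
  instructions: "You are a support assistant.",
  tools: [
    {
      type: "function",
      function: {
        name: "get_customer",
        description: "Look up a customer record by ID",
        parameters: {
          type: "object",
          properties: { customer_id: { type: "string" } },
          required: ["customer_id"],
        },
      },
    },
  ],
});

const thread = await openai.beta.threads.create();
await openai.beta.threads.messages.create(thread.id, {
  role: "user",
  content: "Show me customer 42's details.",
});

// createAndPoll resolves once the run reaches a terminal or actionable status;
// if the assistant decided to call get_customer, status is "requires_action".
const run = await openai.beta.threads.runs.createAndPoll(thread.id, {
  assistant_id: assistant.id,
});
// run.required_action.submit_tool_outputs.tool_calls is what SafeClaw evaluates (Step 2).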
Quick Start
npx @authensor/safeclaw
Creates a safeclaw.yaml in your project. SafeClaw maps the Assistants API's function names directly to policy rules.
Step 1: Define Policies for Assistant Functions
# safeclaw.yaml
version: 1
default: deny
policies:
  - name: "assistant-data-functions"
    description: "Control data retrieval functions"
    actions:
      - tool: "get_customer"
        effect: allow
      - tool: "search_orders"
        effect: allow
      - tool: "get_analytics"
        effect: allow
        constraints:
          date_range_max_days: 90
  - name: "assistant-action-functions"
    description: "Control state-changing functions"
    actions:
      - tool: "create_ticket"
        effect: allow
      - tool: "update_order_status"
        effect: allow
        constraints:
          allowed_statuses: "processing|shipped|delivered"
      - tool: "refund_order"
        effect: allow
        constraints:
          max_amount: 500
      - tool: "delete_customer"
        effect: deny
      - tool: "modify_pricing"
        effect: deny
  - name: "assistant-builtin-tools"
    description: "Control built-in assistant tools"
    actions:
      - tool: "code_interpreter"
        effect: allow
      - tool: "file_search"
        effect: allow
Step 2: Gate the requires_action Handler
import OpenAI from "openai";
import { SafeClaw } from "@authensor/safeclaw";
const openai = new OpenAI();
const safeclaw = new SafeClaw("./safeclaw.yaml");
async function handleRun(threadId: string, runId: string) {
  let run = await openai.beta.threads.runs.retrieve(threadId, runId);

  while (run.status === "requires_action") {
    const toolCalls = run.required_action!.submit_tool_outputs.tool_calls;

    const toolOutputs = await Promise.all(
      toolCalls.map(async (toolCall) => {
        const args = JSON.parse(toolCall.function.arguments);
        // Check each function call against safeclaw.yaml before executing it.
        const decision = safeclaw.evaluate(toolCall.function.name, args);

        let output: string;
        if (decision.allowed) {
          output = JSON.stringify(await executeTool(toolCall.function.name, args));
        } else {
          // Return the denial as the tool output so the assistant can recover gracefully.
          output = JSON.stringify({
            error: `Action denied by SafeClaw: ${decision.reason}`,
          });
        }
        return { tool_call_id: toolCall.id, output };
      })
    );

    run = await openai.beta.threads.runs.submitToolOutputsAndPoll(
      threadId,
      runId,
      { tool_outputs: toolOutputs }
    );
  }

  return run;
}
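The handler above assumes an executeTool helper that dispatches allowed calls to your own backend. A minimal sketch follows; the stub return values are placeholders, so wire these cases to your real data layer, ticketing system, and payments API:

// Hypothetical dispatcher for allowed function calls. The stub return values
// stand in for calls into your own services; names mirror the Step 1 policies.
async function executeTool(name: string, args: Record<string, unknown>): Promise<unknown> {
  switch (name) {
    case "get_customer":
      return { id: args.customer_id, name: "Ada Lovelace" };        // replace with your data layer
    case "create_ticket":
      return { ticket_id: "tkt_123", status: "open" };              // replace with your ticketing system
    case "refund_order":
      return { order_id: args.order_id, refunded: args.amount };    // replace with your payments API
    default:
      // Unreachable in practice: SafeClaw's deny-by-default policy rejects
      // unlisted function names before execution reaches this dispatcher.
      throw new Error(`No executor registered for ${name}`);
  }
}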
Step 3: Handle Streaming Runs
For streaming assistant runs, apply the same policy check inside the stream's requires_action event handler:
const stream = openai.beta.threads.runs.stream(threadId, {
  assistant_id: assistantId,
});

stream.on("event", async (event) => {
  if (
    event.event === "thread.run.requires_action" &&
    event.data.required_action?.type === "submit_tool_outputs"
  ) {
    const toolCalls = event.data.required_action.submit_tool_outputs.tool_calls;
    const outputs = [];

    for (const call of toolCalls) {
      const args = JSON.parse(call.function.arguments);
      const decision = safeclaw.evaluate(call.function.name, args);
      outputs.push({
        tool_call_id: call.id,
        output: decision.allowed
          ? JSON.stringify(await executeTool(call.function.name, args))
          : JSON.stringify({ error: `Denied: ${decision.reason}` }),
      });
    }

    await openai.beta.threads.runs.submitToolOutputs(
      event.data.thread_id,
      event.data.id,
      { tool_outputs: outputs }
    );
  }
});
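submitToolOutputs resumes the run but does not continue the original stream. If your openai SDK version provides submitToolOutputsStream (available in recent v4 releases), you can submit the gated outputs and keep streaming the assistant's reply inside the same handler:

// Inside the requires_action branch above, as an alternative to submitToolOutputs.
// submitToolOutputsStream is assumed to exist in your SDK version; verify before relying on it.
const continuation = openai.beta.threads.runs.submitToolOutputsStream(
  event.data.thread_id,
  event.data.id,
  { tool_outputs: outputs }
);
continuation.on("textDelta", (delta) => process.stdout.write(delta.value ?? ""));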
Step 4: Per-Assistant Policies
Different assistants serve different purposes. Use the assistant_id field to scope a policy to a single assistant; a sketch of passing the assistant ID at evaluation time follows the policy file below.
policies:
  - name: "support-assistant"
    assistant_id: "asst_abc123"
    actions:
      - tool: "get_customer"
        effect: allow
      - tool: "create_ticket"
        effect: allow
      - tool: "refund_order"
        effect: allow
        constraints:
          max_amount: 100
  - name: "admin-assistant"
    assistant_id: "asst_def456"
    actions:
      - tool: "get_customer"
        effect: allow
      - tool: "refund_order"
        effect: allow
        constraints:
          max_amount: 5000
      - tool: "update_order_status"
        effect: allow
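How the evaluator learns which assistant is running depends on your SafeClaw version. As a hypothetical sketch, assuming evaluate accepts an optional context argument (not shown in the Step 2 example), the call inside the requires_action handler would look like this:

// Hypothetical: the third argument is an assumption, not a documented SafeClaw
// signature; check your SafeClaw version's API before using it.
// run.assistant_id comes from the Run object retrieved in the Step 2 handler.
const decision = safeclaw.evaluate(toolCall.function.name, args, {
  assistant_id: run.assistant_id,
});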
Step 5: Audit Assistant Runs
npx @authensor/safeclaw audit --last 100 --filter assistant=asst_abc123
Every function call across all assistant runs is logged with the thread ID, run ID, function name, arguments, and decision — providing full traceability for customer-facing AI assistants.
Why SafeClaw
- 446 tests covering policy evaluation, edge cases, and audit integrity
- Deny-by-default — unlisted functions return denial to the assistant
- Sub-millisecond evaluation — no delay in the run polling loop
- Hash-chained audit log — tamper-evident records across all assistant threads
- Works with Claude AND OpenAI — same policy file, applicable across providers
Related Pages
- How to Secure Your OpenAI GPT Agent
- How to Secure Microsoft Semantic Kernel Agents
- How to Secure Vercel AI SDK Tool Calls
- How to Add Safety Gating to LangChain Agents
Try SafeClaw
Action-level gating for AI agents. Set it up in your browser in 60 seconds.
$ npx @authensor/safeclaw