How to Add Safety Rails to Mistral AI Agents
SafeClaw by Authensor gates every Mistral AI tool call through deny-by-default policies before your application executes it. Mistral's models — including Mistral Large, Codestral, and Mistral Small — support native function calling, and SafeClaw intercepts each tool_calls response to enforce your YAML-defined safety rules with sub-millisecond latency.
How Mistral Tool Calling Works
Mistral's chat completion API accepts a tools parameter with function definitions. When the model decides to call a function, it returns a response with tool_calls containing the function name and arguments as a JSON string. Mistral also supports parallel tool calls, where multiple functions are invoked in a single response. The gap between the model's tool call decision and your execution is where SafeClaw enforces policy.
Mistral Response → tool_calls[] → [SafeClaw Policy Check] → Execute or Deny
Mistral's tool calling format is OpenAI-compatible, which means SafeClaw's evaluation works identically — no adapter or translation layer needed.
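Concretely, a tool-calling response carries roughly this shape (an illustrative sketch: the id and argument values are made up, and the field names follow the camelCase convention of Mistral's TypeScript SDK used below):
// Illustrative response shape; values are made up for this example.
const exampleResponse = {
  choices: [
    {
      message: {
        role: "assistant",
        content: "",
        toolCalls: [
          {
            id: "call_abc123", // hypothetical call id
            type: "function",
            function: {
              name: "query_db",
              // Arguments arrive as a JSON string and must be parsed
              arguments: '{"sql":"SELECT COUNT(*) FROM orders"}',
            },
          },
        ],
      },
      finishReason: "tool_calls",
    },
  ],
};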
Quick Start
npx @authensor/safeclaw
This creates a safeclaw.yaml in your project root. Since Mistral uses the OpenAI-compatible format, your existing SafeClaw policies work without modification.
Step 1: Define Mistral-Specific Policies
# safeclaw.yaml
version: 1
default: deny
policies:
  - name: "mistral-code-tools"
    description: "Control Codestral code execution"
    actions:
      - tool: "execute_code"
        effect: allow
        constraints:
          language: "python|javascript"
          timeout_ms: 15000
      - tool: "execute_code"
        effect: deny
        constraints:
          language: "bash|shell"
  - name: "mistral-retrieval"
    description: "Allow controlled document retrieval"
    actions:
      - tool: "search_documents"
        effect: allow
        constraints:
          index: "internal_docs"
      - tool: "search_documents"
        effect: deny
        constraints:
          index: "external_*"
  - name: "mistral-data-ops"
    description: "Control database operations"
    actions:
      - tool: "query_db"
        effect: allow
        constraints:
          operation: "SELECT"
          tables: "users|orders|products"
      - tool: "mutate_db"
        effect: deny
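You can sanity-check these rules directly with the evaluate call used in Step 2. This is a sketch: the expected outcomes assume SafeClaw matches these constraints against the call's arguments, and send_email is a hypothetical unregistered tool.
import { SafeClaw } from "@authensor/safeclaw";

const safeclaw = new SafeClaw("./safeclaw.yaml");

// Matches the "mistral-code-tools" allow rule
console.log(safeclaw.evaluate("execute_code", { language: "python", timeout_ms: 5000 }));

// Matches the explicit deny rule for shell execution
console.log(safeclaw.evaluate("execute_code", { language: "bash" }));

// Unregistered tool: blocked by "default: deny"
console.log(safeclaw.evaluate("send_email", { to: "someone@example.com" }));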
Step 2: Integrate with Mistral's SDK
import { Mistral } from "@mistralai/mistralai";
import { SafeClaw } from "@authensor/safeclaw";

const mistral = new Mistral({ apiKey: process.env.MISTRAL_API_KEY });
const safeclaw = new SafeClaw("./safeclaw.yaml");

const response = await mistral.chat.complete({
  model: "mistral-large-latest",
  tools: [
    {
      type: "function",
      function: {
        name: "query_db",
        description: "Query the database",
        parameters: {
          type: "object",
          properties: { sql: { type: "string" } },
        },
      },
    },
  ],
  messages: [{ role: "user", content: "How many orders shipped this week?" }],
});
const message = response.choices[0].message;
if (message.toolCalls) {
  for (const toolCall of message.toolCalls) {
    const args = JSON.parse(toolCall.function.arguments);
    const decision = safeclaw.evaluate(toolCall.function.name, args);
    if (decision.allowed) {
      const result = await executeTool(toolCall.function.name, args);
      // Return the tool result to continue the conversation
    } else {
      console.log(`Blocked: ${toolCall.function.name}: ${decision.reason}`);
    }
  }
}
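The executeTool helper above is your own application code, not part of SafeClaw or the Mistral SDK. A minimal hypothetical dispatcher might look like this (the db client and its query method are assumptions for illustration):
// `db` stands in for your database client (assumed, not a real export).
declare const db: { query(sql: string): Promise<unknown> };

// Hypothetical dispatcher: routes an approved tool call to your implementation.
async function executeTool(name: string, args: Record<string, unknown>) {
  switch (name) {
    case "query_db":
      return db.query(args.sql as string);
    default:
      throw new Error(`No implementation registered for tool: ${name}`);
  }
}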
Step 3: Handle Mistral's Parallel Function Calls
Mistral models frequently return multiple tool calls in a single response. SafeClaw evaluates each independently, and the audit log records the full parallel batch:
const toolMessages = [];
for (const toolCall of message.toolCalls) {
  const args = JSON.parse(toolCall.function.arguments);
  const decision = safeclaw.evaluate(toolCall.function.name, args);
  toolMessages.push({
    role: "tool",
    name: toolCall.function.name,
    content: decision.allowed
      ? JSON.stringify(await executeTool(toolCall.function.name, args))
      : JSON.stringify({ error: `Denied: ${decision.reason}` }),
    toolCallId: toolCall.id,
  });
}

// Continue the conversation with all tool results
// (`tools` and `messages` are the same arrays passed in Step 2)
const followUp = await mistral.chat.complete({
  model: "mistral-large-latest",
  tools,
  messages: [...messages, message, ...toolMessages],
});
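Putting Steps 2 and 3 together, a gated agent loop keeps calling the model until it answers without requesting tools. This is an illustrative sketch built from the pieces above; the turn cap is our own guard, not a SafeClaw feature:
// Sketch: run the model to completion, gating every tool call.
// Uses `mistral`, `safeclaw`, and `executeTool` from the steps above.
async function runGatedAgent(messages, tools) {
  for (let turn = 0; turn < 10; turn++) { // cap turns to avoid infinite tool loops
    const response = await mistral.chat.complete({
      model: "mistral-large-latest",
      tools,
      messages,
    });
    const message = response.choices[0].message;
    if (!message.toolCalls?.length) return message.content; // final answer

    messages.push(message);
    for (const toolCall of message.toolCalls) {
      const args = JSON.parse(toolCall.function.arguments);
      const decision = safeclaw.evaluate(toolCall.function.name, args);
      messages.push({
        role: "tool",
        name: toolCall.function.name,
        content: decision.allowed
          ? JSON.stringify(await executeTool(toolCall.function.name, args))
          : JSON.stringify({ error: `Denied: ${decision.reason}` }),
        toolCallId: toolCall.id,
      });
    }
  }
  throw new Error("Agent exceeded maximum turns");
}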
Step 4: Codestral-Specific Code Safety
When using Codestral for code generation and execution, add targeted policies for code safety:
policies:
  - name: "codestral-safety"
    description: "Safety rails for Codestral code generation"
    actions:
      - tool: "run_generated_code"
        effect: allow
        constraints:
          sandbox: true
          max_memory_mb: 256
          blocked_modules: ["os", "subprocess", "socket", "ctypes"]
      - tool: "write_generated_file"
        effect: allow
        constraints:
          path_pattern: "generated/**"
          max_size_bytes: 100000
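These constraints are enforced the same way as in Step 1: at evaluation time, before any generated code runs. A quick check might look like this (a sketch; the argument names, such as path, are assumptions about how your tools pass parameters, not a fixed SafeClaw schema):
// Denied under the policy above: the path falls outside "generated/**"
const decision = safeclaw.evaluate("write_generated_file", {
  path: "/etc/passwd", // illustrative argument name and value
  contents: "...",
});
// Expect decision.allowed === false; the denial is recorded in the audit log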
Why SafeClaw
- 446 tests covering policy evaluation, edge cases, and audit integrity
- Deny-by-default — unregistered tools are automatically blocked
- Sub-millisecond evaluation — no perceptible latency added to Mistral's response pipeline
- Hash-chained audit log — cryptographically linked records of every tool call
- Works with Claude AND OpenAI — and Mistral, since it uses the OpenAI-compatible format
Related Pages
- How to Secure Your OpenAI GPT Agent
- How to Secure Llama-Based AI Agents
- How to Add Safety Gating to LangChain Agents
Try SafeClaw
Action-level gating for AI agents. Set it up in your browser in 60 seconds.
$ npx @authensor/safeclaw