2026-02-09 · Authensor

Safety Controls for AI Test Automation Agents

AI test automation agents are systems that generate test cases, execute test suites, analyze coverage, and modify test infrastructure. They require careful safety boundaries because they straddle the line between code generation and code execution: a test agent with unrestricted access can accidentally run destructive commands, overwrite production configuration, or leak secrets through test output. SafeClaw by Authensor provides test-specific safety controls: test-directory scoping, command whitelisting limited to test runners, and environment isolation that prevents test agents from affecting production code. Install with npx @authensor/safeclaw.

Test Agent Threat Model

Test automation agents need enough access to write tests and run them, but not so much that they can damage the codebase:

  ┌──────────────────────────────────────────────────┐
  │  TEST AGENT ACTIONS                               │
  │                                                    │
  │  SAFE:                                             │
  │  ├─ Write test files to tests/                     │
  │  ├─ Read source files to understand code           │
  │  ├─ Run test suite (npm test, pytest, jest)        │
  │  └─ Read test output and coverage reports          │
  │                                                    │
  │  DANGEROUS:                                        │
  │  ├─ Modify source files (not its job)              │
  │  ├─ Install new packages (supply chain risk)       │
  │  ├─ Execute arbitrary shell commands               │
  │  ├─ Access .env or secret files                    │
  │  └─ Push to remote repository                      │
  └──────────────────────────────────────────────────┘

SafeClaw Policy for Test Automation Agents

# safeclaw-test-agent.yaml
version: "1.0"
agent: test-automation
rules:
  # === FILE READS (broad, needed to understand code) ===
  - action: file_read
    path: "src/**"
    decision: allow
  - action: file_read
    path: "tests/**"
    decision: allow
  - action: file_read
    path: "package.json"
    decision: allow
  - action: file_read
    path: "tsconfig.json"
    decision: allow
  - action: file_read
    path: "jest.config.*"
    decision: allow
  - action: file_read
    path: "pytest.ini"
    decision: allow
  - action: file_read
    path: "**/.env"
    decision: deny
  - action: file_read
    path: "**/secret*"
    decision: deny
  - action: file_read
    decision: deny

  # === FILE WRITES (test directory ONLY) ===
  - action: file_write
    path: "tests/**"
    decision: allow
  - action: file_write
    path: "**/__tests__/**"
    decision: allow
  - action: file_write
    path: "**/*.test.*"
    decision: allow
  - action: file_write
    path: "**/*.spec.*"
    decision: allow
  - action: file_write
    decision: deny  # Cannot modify source code

  # === SHELL COMMANDS (test runners only) ===
  - action: shell_execute
    command: "npm test**"
    decision: allow
  - action: shell_execute
    command: "npx jest**"
    decision: allow
  - action: shell_execute
    command: "npx jest --coverage**"
    decision: allow
  - action: shell_execute
    command: "pytest**"
    decision: allow
  - action: shell_execute
    command: "python -m pytest**"
    decision: allow
  - action: shell_execute
    command: "npx tsc --noEmit"
    decision: allow
  - action: shell_execute
    decision: deny

  # === NETWORK ===
  - action: network_request
    decision: deny

  # === FILE DELETION ===
  - action: file_delete
    decision: deny
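
To make the intent of a policy like this concrete, here is a minimal Python sketch of how rules of this shape could be evaluated. It is not SafeClaw's implementation. The `Rule` class, the `decide` function, the "a matching deny beats a matching allow" semantics, and the use of `fnmatch` for globbing are all assumptions made for illustration, and the rule patterns are illustrative examples in the style of the policy rather than a faithful copy of it.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import Optional


@dataclass(frozen=True)
class Rule:
    action: str                    # e.g. "file_read", "file_write", "shell_execute"
    decision: str                  # "allow" or "deny"
    pattern: Optional[str] = None  # path or command glob; None means catch-all


def glob_match(value: str, pattern: str) -> bool:
    """fnmatch-style match where a leading '**/' may also match the root."""
    if fnmatchcase(value, pattern):
        return True
    return pattern.startswith("**/") and fnmatchcase(value, pattern[3:])


def decide(rules: list[Rule], action: str, target: str = "") -> str:
    """Assumed semantics: a matching deny beats a matching allow, then the
    catch-all rule applies, and anything with no rule at all is denied."""
    fallback = "deny"
    matched = set()
    for rule in rules:
        if rule.action != action:
            continue
        if rule.pattern is None:
            fallback = rule.decision
        elif glob_match(target, rule.pattern):
            matched.add(rule.decision)
    if "deny" in matched:
        return "deny"
    if "allow" in matched:
        return "allow"
    return fallback


# A few hypothetical rules in the shape of the policy above.
RULES = [
    Rule("file_read", "allow", "src/**"),
    Rule("file_read", "deny", "**/.env"),
    Rule("file_read", "deny"),
    Rule("file_write", "allow", "tests/**"),
    Rule("file_write", "allow", "**/*.test.*"),
    Rule("file_write", "deny"),
    Rule("shell_execute", "allow", "npm test**"),
    Rule("shell_execute", "deny"),
    Rule("network_request", "deny"),
]

print(decide(RULES, "file_read", "src/app.ts"))   # prints allow
print(decide(RULES, "file_read", "src/.env"))     # prints deny
print(decide(RULES, "file_write", "src/app.ts"))  # prints deny
```

One limitation of this sketch is worth noting: plain glob matching on a command string would also accept `npm test; rm -rf /`, so a real gate would need to parse the command line rather than match its prefix.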

Why Test Agents Cannot Modify Source Code

A common question: shouldn't the test agent fix the code if a test reveals a bug? No. The test agent's job is to write tests and report failures. Code fixes should be handled by a separate agent (a code generation agent) or a human, each with their own SafeClaw policy. This separation of concerns prevents a single agent from being able to modify both the source code and the tests that verify it, a combination that is a privilege escalation risk.

  Test Agent (this policy)        Code Agent (separate policy)
  ├─ Read src/           ✓        ├─ Read src/           ✓
  ├─ Write tests/        ✓        ├─ Write src/          ✓
  ├─ Run tests           ✓        ├─ Run tests           ✓
  ├─ Write src/          ✗        ├─ Write tests/        ✗
  └─ Install packages    ✗        └─ Install packages    ✗
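
The split shown in the table can be written down as data and checked mechanically. The sketch below is illustrative only: the permission names, the `PERMISSIONS` mapping, and the `violations` helper are hypothetical labels for the table rows, not SafeClaw identifiers.

```python
# Permission sets transcribed from the table above (names are hypothetical).
PERMISSIONS = {
    "test-agent": {"read_src", "write_tests", "run_tests"},
    "code-agent": {"read_src", "write_src", "run_tests"},
}


def violations(permissions: dict[str, set[str]]) -> list[str]:
    """Return human-readable problems found in a set of agent policies."""
    problems = []
    for agent, granted in sorted(permissions.items()):
        # No single agent should be able to write both source and tests.
        if {"write_src", "write_tests"} <= granted:
            problems.append(f"{agent}: can write both source and tests")
        # Neither agent in the table may install packages.
        if "install_packages" in granted:
            problems.append(f"{agent}: can install packages")
    return problems


print(violations(PERMISSIONS))  # prints []
```

A check like this could run in CI whenever a policy file changes, so that a later edit cannot quietly merge the two roles.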

Test Execution Limits

Prevent test agents from entering infinite loops or consuming excessive resources:

limits:
  max_test_executions_per_session: 20
  max_test_files_written: 50
  max_shell_execution_time: "5m"  # Kill any test run > 5 min
  max_output_size: "10MB"
  on_limit_exceeded: halt_and_log
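
As a rough illustration of what enforcing such limits involves, the following Python sketch wraps test runs in a session budget. It is an assumption-laden sketch, not how SafeClaw enforces limits: `SessionLimiter` and `LimitExceeded` are hypothetical names, the defaults simply mirror the values above, "10MB" is interpreted as 10 × 1024 × 1024 bytes, and raising an exception stands in for `halt_and_log`.

```python
import subprocess
import sys


class LimitExceeded(RuntimeError):
    """Raised when a session limit is hit (stands in for halt_and_log)."""


class SessionLimiter:
    def __init__(self, max_runs: int = 20, timeout_s: float = 300.0,
                 max_output_bytes: int = 10 * 1024 * 1024) -> None:
        self.max_runs = max_runs
        self.timeout_s = timeout_s
        self.max_output_bytes = max_output_bytes
        self.runs = 0

    def run(self, argv: list[str]) -> bytes:
        """Run one test command, counting it against the session budget."""
        if self.runs >= self.max_runs:
            raise LimitExceeded(f"more than {self.max_runs} test executions")
        self.runs += 1
        try:
            # subprocess.run kills the child process when the timeout expires.
            done = subprocess.run(argv, stdout=subprocess.PIPE,
                                  stderr=subprocess.STDOUT,
                                  timeout=self.timeout_s)
        except subprocess.TimeoutExpired as exc:
            raise LimitExceeded(f"test run exceeded {self.timeout_s}s") from exc
        # Simplification: output is checked after the run, not while streaming.
        if len(done.stdout) > self.max_output_bytes:
            raise LimitExceeded("test output exceeded the size limit")
        return done.stdout


if __name__ == "__main__":
    limiter = SessionLimiter(max_runs=1)
    print(limiter.run([sys.executable, "-c", "print('ok')"]))
```

Passing the command as an argument list rather than a shell string keeps the sketch from invoking a shell at all, which fits the policy's goal of allowing test runners only.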

Coverage-Aware Gating

Advanced test agents that aim for coverage targets can be given coverage-specific permissions:

rules:
  - action: shell_execute
    command: "npx jest --coverage --coverageReporters=json-summary"
    decision: allow
  - action: file_read
    path: "coverage/**"
    decision: allow

This lets the agent read coverage reports and iterate on tests while still being unable to modify source code or execute arbitrary commands. SafeClaw's 446-test suite covers all these scenarios, and the tool works with Claude and OpenAI and is released under the MIT license.
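
To show what iterating on coverage might look like on the agent side, here is a small Python sketch that picks out under-covered files from a json-summary report. It assumes the report is at `coverage/coverage-summary.json` (Jest's default coverage directory) and has the usual Istanbul shape: a `total` entry plus one entry per file, each with a `pct` field per metric. The function names and the 80% threshold are hypothetical.

```python
import json
from pathlib import Path


def files_below(summary: dict, threshold: float,
                metric: str = "lines") -> list[tuple[str, float]]:
    """Return (file, pct) pairs under the threshold, lowest coverage first."""
    low = []
    for name, metrics in summary.items():
        if name == "total":
            continue  # aggregate entry, not a file
        pct = metrics.get(metric, {}).get("pct")
        # Skip entries whose pct is not numeric (e.g. nothing to count).
        if isinstance(pct, (int, float)) and pct < threshold:
            low.append((name, float(pct)))
    return sorted(low, key=lambda item: item[1])


def load_summary(path: str = "coverage/coverage-summary.json") -> dict:
    """Read the report; under the policy this needs only file_read access."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


# Hand-written sample in the assumed report shape.
SAMPLE = {
    "total": {"lines": {"total": 10, "covered": 7, "skipped": 0, "pct": 70}},
    "/repo/src/a.ts": {"lines": {"pct": 100}},
    "/repo/src/b.ts": {"lines": {"pct": 40}},
    "/repo/src/c.ts": {"lines": {"pct": 62.5}},
}

print(files_below(SAMPLE, 80))
# prints [('/repo/src/b.ts', 40.0), ('/repo/src/c.ts', 62.5)]
```

The agent would then write new tests for the lowest-covered files first, which requires only the `file_read` and `file_write` permissions already granted above.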

Try SafeClaw

Action-level gating for AI agents. Set it up in your browser in 60 seconds.

$ npx @authensor/safeclaw