AI Agent Code Review Safety Checklist
When reviewing code that involves AI agents, use this checklist to verify that safety controls are properly implemented. SafeClaw by Authensor provides the deny-by-default gating layer and hash-chained audit trail that should be present in every AI agent codebase. If SafeClaw is not yet installed, add it with `npx @authensor/safeclaw` before approving the PR.
Policy File Review
- ✅ 1. SafeClaw policy file exists in the repository. Confirm that `safeclaw.config.yaml` (or equivalent) is committed and not in `.gitignore`.
- ✅ 2. `defaultAction` is `deny`. Reject any PR that sets `defaultAction: allow` in a production policy.
- ✅ 3. Every allow rule has a scoped path/command/domain. No open-ended wildcard rules that grant blanket permissions.
```yaml
# GOOD: scoped
- action: file.write
  path: "/app/output/**"
  decision: allow

# BAD: unscoped
- action: file.write
  path: "/**"
  decision: allow
```
- ✅ 4. New allow rules have documented reasons. Each rule includes a `reason` field or PR comment explaining why the permission is needed.
- ✅ 5. No rules were removed without justification. Deleting deny rules or removing escalation requirements must be explicitly justified.
- ✅ 6. High-risk actions use `escalate`, not `allow`. Database operations, production deployments, credential access, and config modifications should require human approval.
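For illustration, escalation rules might look like the following sketch. It reuses the rule shape from the GOOD/BAD example above; the specific actions, paths, and the `reason` field are assumptions, not confirmed SafeClaw syntax:

```yaml
# Hypothetical sketch: high-risk actions routed to human approval.
- action: shell.exec
  command: "psql *"          # database operations
  decision: escalate
  reason: "Schema changes need DBA sign-off"

- action: file.write
  path: "/app/config/**"     # config modifications
  decision: escalate
  reason: "Config changes require human review"
```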
Agent Code Review
- ✅ 7. SafeClaw is imported and initialized before agent execution. The gating layer must be active before the agent's first action. Late initialization creates an ungated window.
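The ordering requirement can be illustrated with a minimal, hypothetical guard. This is not the SafeClaw API; it only shows why actions must fail closed before the gate is active:

```typescript
// Hypothetical sketch: actions refuse to run until the gate is initialized.
let gateActive = false;

function initGate(): void {
  // In a real setup this would load the policy and start the audit trail.
  gateActive = true;
}

function performAction(name: string): string {
  if (!gateActive) {
    // Fail closed: any action before initialization is the "ungated window".
    throw new Error(`ungated action attempted: ${name}`);
  }
  return `executed ${name}`;
}
```

Initializing the gate at module load time, before any agent code runs, closes the window.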
- ✅ 8. No bypass mechanisms exist. Search for code that disables SafeClaw, catches and ignores gating denials, or wraps actions to avoid policy evaluation.
- ✅ 9. Error handling respects gating decisions. When SafeClaw denies an action, the agent must not retry with different parameters to circumvent the denial.
- ✅ 10. Action descriptions are accurate. The action type and parameters passed to SafeClaw must match the actual operation. Mislabeling a `file.write` as a `file.read` bypasses the policy.
- ✅ 11. Dynamic actions are gated. If the agent constructs actions dynamically (from LLM output or user input), those actions still pass through SafeClaw's policy engine.
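A minimal sketch of the pattern (all names hypothetical; the point is that dynamically built actions share the same deny-by-default checkpoint as hardcoded ones):

```typescript
type Action = { type: string; path?: string };

// Hypothetical deny-by-default evaluator: only explicitly allowed
// (type, path-prefix) pairs pass; everything else is denied.
const allowed = [{ type: "file.write", prefix: "/app/output/" }];

function evaluate(action: Action): "allow" | "deny" {
  const ok = allowed.some(
    (r) => r.type === action.type && (action.path ?? "").startsWith(r.prefix)
  );
  return ok ? "allow" : "deny";
}

// Actions constructed at runtime (e.g. parsed from LLM output) must pass
// through the same evaluate() call as statically written ones.
function runDynamic(raw: string): "allow" | "deny" {
  const action: Action = JSON.parse(raw);
  return evaluate(action);
}
```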
Credential and Secret Safety
- ✅ 12. No hardcoded credentials in agent code. API keys, tokens, and passwords must come from environment variables or a secrets manager.
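One common shape for this rule, as a hedged sketch (the helper name and variable names are illustrative, not part of SafeClaw):

```typescript
// Hypothetical helper: credentials come from the environment, never from
// string literals committed to the repository.
function requireEnv(name: string): string {
  const value = process.env[name];
  if (!value) {
    throw new Error(`missing required environment variable: ${name}`);
  }
  return value;
}
```

Usage would look like `const apiKey = requireEnv("MY_SERVICE_API_KEY");`, failing fast at startup rather than shipping a hardcoded key.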
- ✅ 13. Credential files are denied in the policy. Verify deny rules exist for `.env`, `.ssh/`, `.aws/`, and similar credential paths.
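Such deny rules might be sketched as follows, mirroring the rule shape from the example above (glob patterns and field names are assumptions to adapt to your layout):

```yaml
# Hypothetical deny rules for common credential paths.
- action: file.read
  path: "**/.env*"
  decision: deny
- action: file.read
  path: "~/.ssh/**"
  decision: deny
- action: file.read
  path: "~/.aws/**"
  decision: deny
```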
- ✅ 14. Agent does not log sensitive data. Audit trail entries should not contain full API keys, passwords, or PII. Sanitize action parameters before logging.
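A minimal sanitizer sketch, assuming key-based redaction (the function and pattern are illustrative, not SafeClaw's built-in behavior):

```typescript
// Hypothetical sanitizer: masks values of sensitive-looking keys before
// action parameters are written to the audit trail.
const SENSITIVE = /key|token|password|secret/i;

function sanitize(params: Record<string, string>): Record<string, string> {
  const out: Record<string, string> = {};
  for (const [k, v] of Object.entries(params)) {
    out[k] = SENSITIVE.test(k) ? "[REDACTED]" : v;
  }
  return out;
}
```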
Test Coverage
- ✅ 15. Safety-specific tests exist. The test suite includes cases that verify denied actions are actually blocked.
- ✅ 16. Tests cover policy edge cases. Path boundary conditions (e.g., `/app/output` vs `/app/output-malicious`), command variations, and glob pattern matching are tested.
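The boundary bug behind that example can be demonstrated in a few lines (a generic sketch of the class of bug, not SafeClaw's matcher):

```typescript
// Naive prefix check: "/app/output-malicious" slips past "/app/output".
function naiveInside(base: string, p: string): boolean {
  return p.startsWith(base);
}

// Safer check: require a path-separator boundary after the base directory.
function safeInside(base: string, p: string): boolean {
  const prefix = base.endsWith("/") ? base : base + "/";
  return p === base || p.startsWith(prefix);
}
```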
- ✅ 17. Tests run in CI. SafeClaw policy tests execute in the CI pipeline and block merges on failure.
```
npx @authensor/safeclaw --test
```
Audit Trail Verification
- ✅ 18. Audit trail initialization is present. The hash-chained audit log starts recording before the first agent action.
- ✅ 19. Audit log is not written to a world-readable location. Verify the log file permissions restrict access to authorized users.
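A permission check might look like this sketch (POSIX mode bits; helper names are illustrative, and the check does not apply on Windows):

```typescript
import { statSync } from "node:fs";

// Hypothetical check: the audit log should carry no group/other permission
// bits, e.g. mode 0600 (owner read/write only).
function isOwnerOnlyMode(mode: number): boolean {
  return (mode & 0o077) === 0;
}

function auditLogIsOwnerOnly(path: string): boolean {
  return isOwnerOnlyMode(statSync(path).mode);
}
```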
- ✅ 20. Audit log rotation is configured. For long-running agents, ensure logs are rotated without breaking the hash chain.
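The rotation constraint follows from how hash chaining works. In the generic scheme sketched below (not SafeClaw's internal format), each entry's hash covers the previous hash, so carrying the last hash of a rotated file forward as the seed of the new file preserves the chain:

```typescript
import { createHash } from "node:crypto";

// Generic hash-chain sketch: each entry's hash commits to the previous hash.
function chainHash(prevHash: string, entry: string): string {
  return createHash("sha256").update(prevHash + entry).digest("hex");
}

// Recompute the chain from a seed and compare against recorded hashes;
// any tampered entry breaks every hash after it.
function verifyChain(entries: string[], hashes: string[], seed: string): boolean {
  let prev = seed;
  for (let i = 0; i < entries.length; i++) {
    prev = chainHash(prev, entries[i]);
    if (prev !== hashes[i]) return false;
  }
  return true;
}
```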
Dependency Review
- ✅ 21. No new dependencies bypass SafeClaw. New packages that interact with the filesystem, network, or shell must have their actions routed through SafeClaw's policy engine.
- ✅ 22. SafeClaw version is pinned. The installed version of SafeClaw is pinned to avoid unexpected behavior from automatic updates.
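In a Node project this means an exact version in `package.json` with no `^` or `~` range prefix (the version number below is a placeholder, not a real release):

```json
{
  "dependencies": {
    "@authensor/safeclaw": "1.2.3"
  }
}
```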
Cross-References
- AI Agent Security Checklist 2026
- Code Review AI Safety Workflow
- Pre-Commit Hooks for AI Safety
- Policy Rule Syntax Reference
Try SafeClaw
Action-level gating for AI agents. Set it up in your browser in 60 seconds.
$ npx @authensor/safeclaw