What it does
Most AI safety checks focus on what a model says. Agent Action Gate focuses on what an AI agent is about to do. Before an agent sends an email, modifies data, calls an API, runs a terminal command, deletes a file, publishes content, or exposes information, the proposed action is evaluated by a small runtime gate.
The gate receives a proposed action, checks it against detector categories, and returns a route that the surrounding workflow can enforce before any tool execution happens.
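The flow above can be sketched as a thin wrapper that asks the gate first and runs the tool only when the route is allow. This is a minimal illustration, not the project's API: the type names, the field names (`tool`, `actionType`, `target`, `approved`), the `runWithGate` helper, and the `stubGate` stand-in are all assumptions made for the example. In a real workflow the `evaluate` argument would presumably wrap a call to POST /evaluate.

```typescript
// Hypothetical shapes: these field names are assumptions for illustration,
// not the project's confirmed request or response schema.
type Decision = "allow" | "require_approval" | "revise_action" | "block";

interface ProposedAction {
  tool: string;       // e.g. "email" or "fs"
  actionType: string; // e.g. "send" or "delete"
  target: string;     // the object the action touches
  approved: boolean;  // whether user approval is recorded
}

interface GateResult {
  decision: Decision;
  triggeredDetectors: string[];
}

interface Outcome<T> {
  executed: boolean;
  decision: Decision;
  value?: T;
}

// Ask the gate first; run the tool only when the route is "allow".
function runWithGate<T>(
  action: ProposedAction,
  evaluate: (a: ProposedAction) => GateResult,
  execute: (a: ProposedAction) => T,
): Outcome<T> {
  const result = evaluate(action);
  if (result.decision !== "allow") {
    // Any other route stops here; the surrounding workflow decides what next.
    return { executed: false, decision: result.decision };
  }
  return { executed: true, decision: result.decision, value: execute(action) };
}

// Stand-in evaluator so the sketch runs without a server (not the real gate).
const stubGate = (a: ProposedAction): GateResult =>
  a.actionType === "delete" && !a.approved
    ? { decision: "require_approval", triggeredDetectors: ["missing_approval"] }
    : { decision: "allow", triggeredDetectors: [] };
```

Passing the evaluator and the executor in as functions keeps the enforcement point in one place: the tool call is unreachable unless the gate result has been inspected.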
Why it exists
AI agents can act through tools, APIs, files, email systems, databases, and automation platforms. The risk is no longer only bad text output; it now includes bad action execution.
Agent Action Gate gives developers a practical place to ask whether a proposed action still fits the user's request, uses the right tool, targets the right object, stays within authorized scope, and has the approval needed for sensitive, irreversible, production, or command-capable work.
The four decisions
| Decision | Route |
|---|---|
| allow | The action appears low risk and can proceed. |
| require_approval | The action needs explicit human approval before execution. |
| revise_action | The action is fixable, but should be changed before execution. |
| block | Action should not execute. |
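A caller might handle the four decisions along these lines. The sketch assumes the decision arrives as a plain string; the `parseDecision` and `nextStep` helpers are hypothetical, and the choice to treat any unrecognized value as block (failing closed) is a design assumption of this example rather than documented gate behavior.

```typescript
type Decision = "allow" | "require_approval" | "revise_action" | "block";

const DECISIONS: readonly Decision[] = [
  "allow",
  "require_approval",
  "revise_action",
  "block",
];

// Fail closed: anything that is not a known decision is treated as "block".
// This fallback is an assumption of the sketch.
function parseDecision(raw: unknown): Decision {
  const match = DECISIONS.find((d) => d === raw);
  return match ?? "block";
}

// Describe what the surrounding workflow should do for each route.
function nextStep(decision: Decision): string {
  switch (decision) {
    case "allow":
      return "execute the tool call";
    case "require_approval":
      return "pause and request human approval";
    case "revise_action":
      return "send the action back for revision";
    case "block":
      return "drop the action";
    default: {
      // Compile-time exhaustiveness check: adding a fifth decision to the
      // union without a matching case makes this assignment a type error.
      const unreachable: never = decision;
      return unreachable;
    }
  }
}
```

The `never` assignment in the default branch is a common TypeScript idiom for making the compiler flag a route that has no handler.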
Detector categories
| Detector | Risk pattern |
|---|---|
| wrong_target | Action points at the wrong person, file, endpoint, record, or resource. |
| unauthorized_scope | Action exceeds the user's request. |
| missing_approval | Action needs approval but none is recorded. |
| irreversible_action | Action is destructive, costly, or hard to undo. |
| sensitive_data_exposure | Action risks exposing sensitive data. |
| tool_mismatch | Action uses the wrong tool or operation. |
| objective_drift | Action no longer serves the original task objective. |
| unauthorized_cyber_scope | Command-capable action targets systems outside the authorized context. |
| credential_access | Action accesses secrets, tokens, private keys, or credential-like material. |
| data_exfiltration | Action dumps, archives, uploads, posts, or transfers data in a suspicious way. |
| privilege_escalation | Action changes users, roles, permissions, root access, or admin capabilities. |
| supply_chain_modification | Action modifies CI/CD, dependencies, packages, deployment, or build-chain configuration. |
| destructive_cyber_action | Action matches destructive command or infrastructure patterns. |
| unapproved_command_execution | Terminal-like command execution is proposed without recorded user approval. |
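To make the idea of a detector concrete, here is a rough sketch of how two of the categories above could be checked. Only the detector names come from the table; the `ProposedAction` fields, the regular expression, and the `triggeredDetectors` helper are hypothetical, and the project's actual detection logic may work quite differently.

```typescript
// Hypothetical shapes and patterns, for illustration only.
interface ProposedAction {
  tool: string;
  command?: string;      // terminal-like command text, if any
  userApproved: boolean; // whether user approval is recorded
}

interface Detector {
  name: string;
  fires: (action: ProposedAction) => boolean;
}

// A deliberately small sample of paths and names that often hold secrets.
const CREDENTIAL_PATTERN =
  /(\.ssh\/id_[a-z0-9]+|\.env\b|private[_-]?key|api[_-]?token)/i;

const detectors: Detector[] = [
  {
    name: "credential_access",
    fires: (a) => CREDENTIAL_PATTERN.test(a.command ?? ""),
  },
  {
    name: "unapproved_command_execution",
    fires: (a) => a.command !== undefined && !a.userApproved,
  },
];

// Return the name of every detector that fires for the action.
function triggeredDetectors(action: ProposedAction): string[] {
  return detectors.filter((d) => d.fires(action)).map((d) => d.name);
}
```

For example, an unapproved `cat ~/.ssh/id_rsa` would trigger both detectors in this sketch, while an approved `ls -la` would trigger neither.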
Decision logging
As of v0.3.0, the gate writes append-only JSONL decision receipts for successful POST /evaluate calls. By default, logs are written locally to logs/action-gate-decisions.jsonl.
The receipt records the decision, risk level, primary issue, confidence, proposed tool, action type, target, approval state, environment, recommended action, evidence, triggered detectors, and a payload summary.
Payloads are summarized and redacted before logging. Raw payloads are not written to the decision log, and credential-like values such as secrets, tokens, passwords, private keys, authorization headers, cookies, SSH keys, and long credential-like strings are redacted.
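The redaction step described above might look roughly like the following. This is a sketch under stated assumptions: the key-name pattern, the 32-character threshold for "long credential-like strings", the `[REDACTED]` placeholder, and the `redact` and `toReceiptLine` helpers are all hypothetical, and the sketch only redacts; it does not attempt the summarization the real log performs.

```typescript
type Json = string | number | boolean | null | Json[] | { [key: string]: Json };

// Key names treated as credential-like. The list mirrors the categories named
// in the text, but the exact matching rules are an assumption.
const SENSITIVE_KEY =
  /(secret|token|password|private[_-]?key|authorization|cookie|ssh[_-]?key)/i;

// Long strings with no whitespace are treated as credential-like.
// The threshold of 32 characters is a hypothetical value.
const LONG_OPAQUE_STRING = /^\S{32,}$/;

function redact(value: Json, key = ""): Json {
  // A sensitive key hides its whole value, whatever the type.
  if (SENSITIVE_KEY.test(key)) return "[REDACTED]";
  if (typeof value === "string") {
    return LONG_OPAQUE_STRING.test(value) ? "[REDACTED]" : value;
  }
  if (Array.isArray(value)) return value.map((item) => redact(item));
  if (value !== null && typeof value === "object") {
    const out: { [key: string]: Json } = {};
    for (const k of Object.keys(value)) {
      const child = value[k];
      if (child !== undefined) out[k] = redact(child, k);
    }
    return out;
  }
  return value; // numbers, booleans, and null pass through unchanged
}

// Build one JSONL line: a single JSON object with no embedded newlines.
function toReceiptLine(decision: string, payload: Json): string {
  return JSON.stringify({ decision, payloadSummary: redact(payload) });
}
```

Each returned line could then be appended to the log file followed by a newline, which is what keeps the file valid JSONL. A length-based rule like the one here will also hide long harmless strings such as URLs, a trade-off that favors over-redaction.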
n8n demo workflows
The repository includes two importable n8n demo workflows: examples/n8n-agent-action-gate-demo.json and examples/n8n-agent-action-gate-defensive-demo.json.
The demos show Agent Action Gate sitting between an AI or automation agent and tool execution. They route allow to Continue Action, require_approval to Require Human Approval, revise_action to Revise Proposed Action, and block to Block Action.
The defensive demo exercises the pre-execution review layer for command-capable actions outside authorized scope and expects a block decision.
Validation status
| Check | Status |
|---|---|
| TypeScript compile | Passing |
| Baseline evals | 19/19 passing |
| GET /health | Working |
| POST /evaluate | Working |
| n8n demo workflows | Included |
| Decision logging smoke test | Included |
Compliance note
Agent Action Gate can support human-approval workflows for AI agent actions, especially when actions are external-facing, irreversible, sensitive, command-capable, or broader than requested. It is not legal advice and does not guarantee compliance with any law or framework.
Research lineage
Agent Action Gate is a pre-execution action-control layer in the broader Aletheon alignment architecture. It complements the post-output behavioral drift-detection work in the broader Alignment Theory research stack.
Alignment Theory provides the research framework behind the structural logic. The implementation lives in the Agent Action Gate GitHub repository.
Status
v0.3.0. This is an open-source implementation of a pre-execution control boundary, not a production-hardened enterprise platform.
- TypeScript gate engine
- Local HTTP API
- 19/19 evals passing
- JSONL decision receipts
- Two n8n demo workflows
- Detectors for command-capable (cyber) agent actions
- MIT license
- GitHub release v0.3.0