Airlock - Authorization Layer for AI Agents

The Problem

Agents have the keys.
But there's no lock.

Prompt Injection

Untrusted content — web pages, emails, documents, tool descriptions — can instruct your agent to take unauthorized actions with its available tools.

Agent Error

The agent misinterprets intent, hallucinates parameters, or calls the wrong tool — sending a message to the wrong person, deleting the wrong file, pushing to the wrong branch.

All or Nothing Access

Today, agents either have full access to a tool or none at all. There's no way to allow reading but gate writing, or auto-approve safe commands while requiring approval for dangerous ones.

The Solution

One gateway. Total control.

Airlock speaks the Model Context Protocol. Any MCP client connects, receives a filtered tool surface based on its agent profile, and all permission logic applies identically — regardless of the client.

Per-Agent Allowlists

Define exactly which tools each agent can see. Tools not in the allowlist are completely hidden — the agent doesn't even know they exist. Supports wildcards like github/*.

Human-in-the-Loop Approval

Flag sensitive tools for manual approval. When an agent tries to create a PR or push code, you get a notification on Telegram, Slack, or your terminal. Approve or deny with a single reply.

Full Audit Trail

Every tool call is logged to SQLite — agent, tool, parameters, decision, latency. Redact sensitive fields automatically. Query your logs with a built-in management API.

Built-in HTTP & Exec Tools

Make HTTP requests and run shell commands through the gateway. Domain allowlists, blocked hosts, and command-level policies (allow/deny/require approval) — all enforced at the infrastructure level.

Auto-Discovery

Generate Airlock configs automatically from CLI --help output, Fig autocomplete specs, shell completions, or OpenAPI 3.x specs. No YAML by hand.

Composable Profiles

Define reusable permission sets like readonly and developer, then compose them with extends: [readonly, developer]. Profiles merge with the same deny > ask > allow precedence.

Sandbox Presets & Tool Variants

Expose multiple names for the same tool with different security envelopes. Auto-approve the sandboxed variant, gate the full-power version behind approval. Reusable presets reduce config repetition.

Middleware Pipeline

Composable middleware on every tool call: prompt injection detection, canary tokens, PII classification, output scanning, rate limiting, schema validation, and LLM-powered output summarization.

Client Agnostic

Works with any MCP client — Claude Code, Cursor, OpenClaw, or your own. Connects over stdio, SSE, or streamable HTTP. The agent just sees a normal MCP tool server.

One Config File

Everything lives in airlock.yaml. Define your MCPs, agent profiles, approval rules, and exec policies in one place. Hot-reloads on save — no restarts needed.

How It Works

Three layers of defense

1

Visibility Control

Each agent gets a tailored tool manifest. Your coding agent sees filesystem and git tools. Your monitoring agent sees Sentry and PostHog. Your research agent sees HTTP GET and nothing else. Tools outside the allowlist are invisible.

2

Human Approval Gate

Sensitive actions pause and wait. The agent calls github/create_pr and the gateway holds the connection, sends you a notification with the details, and waits for your approve or deny. The agent just experiences a slow tool call — no special client support needed.

3

Full Audit Trail

Every call — allowed, approved, denied, timed out, blocked — is recorded with timestamps, parameters, decisions, and latency. Sensitive fields are automatically redacted. Query your audit log to understand exactly what your agents have been doing.

Architecture

Sits between agents and tools

Human-in-the-Loop

Approve from anywhere

When an agent hits a gated action, you get a notification with all the context you need. Approve or deny in seconds — from Telegram, Slack, or your terminal.

APPROVE? [A1B2C3]

Agent: helena

Tool: github/create_pr

repo: acme/backend
title: "Fix auth token refresh"
body: "Fixes #42..."

approve A1B2C3 / deny A1B2C3

approve A1B2C3

Terminal

APPROVE? [X9Y8Z7]

Agent:   claude-code
Tool:    exec/run
Command: git push origin feature/auth-fix

> approve X9Y8Z7
Approved. Executing...

Dashboard

Localhost web UI with live updates

Airlock Approvals       http://localhost:4112

exec/run  claude-code
  git push origin main
  [A]pprove  [D]eny  keyboard shortcuts

Syntax highlighting • Browser notifications
Sound alerts • Full arg inspection modal

Batched

APPROVE? (3 actions)

1. [K3L4] http/post → api.notion.com

2. [M5N6] slack/send_message → #general

3. [P7Q8] exec/run → git tag v1.2.3

approve all / deny all

approve all

Companion App

Approve from your menu bar

A native macOS app that lives in your menu bar. Get notifications, review tool calls, and approve or deny — without leaving what you're doing.

Native notifications with Approve / Deny actions

Live countdown timers for pending requests

Approval history with full tool call details

Auto-reconnects to the dashboard provider

Download for macOS

or via Homebrew:

brew install airlock-dev/tap/airlock-companion

Requires macOS 14 (Sonoma) or later. Signed and notarized.

Auto-Discovery

Don't write YAML by hand

Point Airlock at your existing tools and it generates configs automatically. Three discovery strategies for CLIs. Full OpenAPI support for REST APIs. Interactive TUIs for fine-tuning.

CLI Discovery

Parse --help output, shell completion scripts (Cobra, Click, Clap), or Fig specs. Airlock picks the richest source available.

airlock discover cli kubectl --fig
airlock discover cli docker --max-depth 2
airlock discover cli git --include status,push,log

API Discovery

Feed in any OpenAPI 3.x spec — local file or URL — and every endpoint becomes a namespaced MCP tool with the same permission engine.

airlock discover api ./petstore.json
airlock discover api https://api.example.com/openapi.json \
  --include "GET *" --exclude "DELETE *"

Interactive TUIs

airlock configure-cli lets you browse subcommand trees, toggle commands, and export YAML interactively. configure-agent assigns allow/ask/deny per tool with keyboard shortcuts.

Middleware Pipeline

Every tool call passes through the gauntlet

A composable middleware stack runs on every tool invocation — before and after execution. Each layer can inspect, block, transform, or annotate the call.

Prompt Injection Detection

Scans inbound tool arguments for injection patterns. Regex-based by default, or plug in a DeBERTa ML classifier at a configurable confidence threshold.

Output Injection Scanning

Scans tool responses for prompt injection attempts before they reach the agent. Can flag or automatically mangle suspicious content.

Canary Tokens

Injects invisible markers into tool outputs. If a marker appears in a subsequent tool's input, Airlock catches the data exfiltration path and logs it.

PII & Secret Detection

Detects SSNs, credit card numbers, AWS access keys, private keys, JWTs, and API tokens in tool arguments. Heuristic or LLM-backed.

Untrusted Output Envelope

Wraps all tool responses in <untrusted-output> tags so the agent's context clearly delineates trusted vs. untrusted data.

Schema Validation

Validates tool arguments against JSON Schema with Ajv before execution. Malformed calls never reach the downstream tool.

Rate Limiting

Sliding-window rate limiter, configurable per-agent or per-tool. Prevents runaway agents from hammering downstream services.

Output Summarizer

For large tool outputs, optionally calls a fast LLM (Haiku, GPT-4o-mini) to summarize before passing to the agent. Full output is saved to a temp file.

Sandboxing

Same tool. Different safety envelope.

Expose multiple names for the same underlying capability with different security policies. Auto-approve the sandboxed variant. Gate the full-power version behind approval. Reduce approval fatigue without giving up control.

sandbox_presets:
  local_transform:
    filesystem:
      allow_read: ['.']
      allow_write: ['/tmp']
      deny_read: ['~/.ssh', '~/.aws', '.env']
    network:
      allowed_domains: []            # No network access

agents:
  claude-code:
    allow:
      - python/sandboxed              # Auto-approved, local-only
    ask:
      - python/full                   # Needs approval, can pip install

    tool_overrides:
      python/sandboxed:
        alias_of: exec/run
        description: "Run Python for local transforms only"

      python/full:
        alias_of: exec/run
        description: "Full Python with network after approval"
        sandbox:
          filesystem:
            allow_write: ['.', '/tmp']
          network:
            allowed_domains: ['pypi.org']

OS-Level Sandbox

The built-in python/eval tool runs code inside macOS sandbox-exec — kernel-enforced denial of filesystem writes and network access. Not just config-level restrictions.

Reusable Presets

Define presets like local_transform, github_only, or npm_registry once. Apply agent-wide or per-tool variant. Deny lists are additive.

Approval Visibility

When a sandboxed tool needs approval, the notification includes the resolved sandbox context — which presets applied, which domains are allowed, where writes are permitted.

Configuration

One file. Full control.

Define your downstream MCPs, agent profiles, approval rules, and security policies in a single YAML file. Hot-reloads on save.

agents:
  helena:              # Personal assistant agent
    allow:
      - filesystem/read_file
      - filesystem/list_directory
      - github/*           # Wildcard: all GitHub tools
      - notion/*
      - http/get
      - exec/run
    ask:                 # These require your approval
      - github/create_pr
      - github/push
      - notion/update_page
      - http/post

  monitoring:           # Read-only agent, no approval needed
    allow:
      - sentry/list_issues
      - posthog/*
      - http/get
    ask: []

providers:
  filesystem:
    type: stdio
    command: npx
    args: ["-y", "@modelcontextprotocol/server-filesystem", "/workspace"]

  github:
    type: stdio
    command: npx
    args: ["-y", "@modelcontextprotocol/server-github"]
    env:
      GITHUB_PERSONAL_ACCESS_TOKEN: ${GITHUB_TOKEN}

  exec: builtin
  http: builtin

agents:
  helena:
    exec:
      allow:              # Auto-approved commands
        - "git status"
        - "git diff *"
        - "git log *"
        - "ls *"
        - "cat *"
      ask:                # Require approval
        - "git commit *"
        - "git push *"
        - "npm *"
      deny:               # Always blocked
        - "rm -rf *"
        - "sudo *"
        - "curl *"

approvals:
  timeout_ms: 300000      # 5 min default
  batch_window_ms: 10000
  provider:
    type: telegram
    bot_token: ${TELEGRAM_BOT_TOKEN}
    chat_id: ${TELEGRAM_CHAT_ID}

  # Or use Slack
  # provider:
  #   type: slack
  #   webhook_url: ${SLACK_WEBHOOK}

  # Or use the TUI for local dev
  # provider:
  #   type: tui

profiles:
  readonly:                # Reusable permission set
    allow:
      - github/list*
      - github/get*
      - http/get

  developer:
    allow:
      - github/*
      - git/*
      - exec/run
    ask:
      - github/create_pr
      - github/merge_pull_request

agents:
  claude-code:
    extends: [readonly]       # Inherits readonly permissions

  helena:
    extends: [readonly, developer]  # Merge with deny > ask > allow
    deny:
      - exec/run              # Override: deny always wins

middleware:
  injection_detector:
    backend: regex           # or "deberta" for ML classifier
    mode: escalate            # detect | mangle | escalate

  sensitivity_classifier:
    mode: detect              # detect | escalate
    threshold: 0.7

  canary_tokens: true       # Detect data exfiltration

  output_injection:
    mode: mangle              # Redact injection in responses

  untrusted_envelope: true  # Wrap outputs in <untrusted-output>

  rate_limiter:
    max_requests: 100
    window_ms: 60000
    per: agent                # or "tool"

  output_summarizer:
    model: claude-haiku-4-5-20251001
    threshold_chars: 10000

Quickstart

Running in 2 minutes

1

Create your config

# airlock.yaml — start with the example and customize
curl -o airlock.yaml https://raw.githubusercontent.com/airlock-dev/airlock/main/examples/airlock.yaml

2

Start the gateway

# SSE mode (for network clients like Cursor, OpenClaw)
npx airlock-bot --config airlock.yaml

# Or stdio mode (for Claude Code)
npx airlock-bot --agent claude-code --config airlock.yaml

3

Point your agent at Airlock

# In your MCP client config, connect to the gateway:
http://localhost:4111/sse?agent=helena  # SSE endpoint

Use Cases

Built for how you actually use agents

OpenClaw Agent

Route your OpenClaw instance through Airlock. Gate shell commands, require approval before sending messages, and block skills from exfiltrating data to unknown domains.

Coding Agent

Give Claude Code full filesystem and git access, but gate git push, rm, and deployments behind human approval.

Monitoring Bot

A read-only agent that watches Sentry and PostHog — no write tools exposed, no HITL needed, fully autonomous.

Research Agent

An agent that can fetch any URL via HTTP GET but has zero access to write tools, exec, or internal services.

CI / Hook Integration

Non-MCP tools call the /hook endpoint for policy decisions. CI scripts, Claude Code hooks, or custom integrations get the same allowlist and approval engine.

Management API

Query the audit log, list pending approvals, approve or deny via REST. Endpoints: /health, /audit, /hitl/pending, /hitl/approve/:id. Bearer token auth.

Security

Defense in depth

Tool Hiding

Unauthorized tools are omitted from the manifest. The agent doesn't know they exist.

Fail-Closed

Unrecognized tool calls and unmatched commands are denied by default.

Self-Approval Prevention

Agents can't reach the approval API — localhost is blocked for HTTP tools and curl is denied in exec policies.

Host Blocking

Built-in HTTP tools respect a blocked hosts list that prevents access to local network, internal services, and the gateway itself.

Tool Poisoning Defense

Downstream tool descriptions are sanitized. Override descriptions in config to prevent prompt injection via tool manifests.

Crash Recovery

Pending HITL approvals are persisted to SQLite. Restart the gateway and pick up where you left off.

Canary Tokens

Invisible markers injected into tool outputs detect data exfiltration when they appear in subsequent tool inputs.

Injection Detection

Scans tool arguments and responses for prompt injection patterns. Regex or ML-backed (DeBERTa).

Secret & PII Detection

Detects API keys, AWS credentials, SSNs, credit cards, JWTs, and private keys in tool arguments before execution.

Constant-Time Auth

API secret comparison uses timingSafeEqual, not string equality. No timing side-channel leaks.

Give your agents guardrails,
not handcuffs.

Airlock is open source and free. Set it up in minutes.

View on GitHub

npx airlock-bot --config airlock.yaml

Agents have the keys.But there's no lock.

Prompt Injection

Agent Error

All or Nothing Access

One gateway. Total control.

Per-Agent Allowlists

Human-in-the-Loop Approval

Full Audit Trail

Built-in HTTP & Exec Tools

Auto-Discovery

Composable Profiles

Sandbox Presets & Tool Variants

Middleware Pipeline

Client Agnostic

One Config File

Three layers of defense

Visibility Control

Human Approval Gate

Full Audit Trail

Sits between agents and tools

Approve from anywhere

Approve from your menu bar

Don't write YAML by hand

CLI Discovery

API Discovery

Interactive TUIs

Every tool call passes through the gauntlet

Prompt Injection Detection

Output Injection Scanning

Canary Tokens

PII & Secret Detection

Untrusted Output Envelope

Schema Validation

Rate Limiting

Output Summarizer

Same tool. Different safety envelope.

OS-Level Sandbox

Reusable Presets

Approval Visibility

One file. Full control.

Running in 2 minutes

Create your config

Start the gateway

Point your agent at Airlock

Built for how you actually use agents

OpenClaw Agent

Coding Agent

Monitoring Bot

Research Agent

CI / Hook Integration

Management API

Defense in depth

Tool Hiding

Fail-Closed

Self-Approval Prevention

Host Blocking

Tool Poisoning Defense

Crash Recovery

Canary Tokens

Injection Detection

Secret & PII Detection

Constant-Time Auth

Give your agents guardrails,not handcuffs.

Agents have the keys.
But there's no lock.

Give your agents guardrails,
not handcuffs.