Security Model

PocketPaw implements multiple layers of security to protect against misuse, prompt injection, and unauthorized actions.

Security Layers

PocketPaw uses a defense-in-depth security architecture with seven layers: credential encryption, injection scanning, tool policy enforcement, Guardian AI review, dangerous command blocking, append-only audit logging, and rate-limited session management.

Guardian AI

The Guardian AI is a secondary LLM that evaluates every incoming message for safety concerns before the main agent processes it.

Uses AsyncAnthropic directly (not the main agent’s LLM)
Classifies messages into threat levels: NONE, LOW, MEDIUM, HIGH, CRITICAL
Messages at HIGH or above are blocked with an explanation
Runs before any tool execution or code generation
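
The blocking rule above can be sketched in a few lines of Python. This is a minimal illustration, not PocketPaw's actual implementation: the names ThreatLevel, should_block, and parse_threat_level are hypothetical, and treating an unrecognized label as CRITICAL (failing closed) is an assumption. The sketch does not call any model; it only shows how a label returned by the classifier could be turned into a block decision.

```python
from enum import IntEnum


class ThreatLevel(IntEnum):
    """Threat levels from the list above, ordered so they can be compared."""
    NONE = 0
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


# Messages at HIGH or above are blocked.
BLOCK_THRESHOLD = ThreatLevel.HIGH


def parse_threat_level(label: str) -> ThreatLevel:
    """Map a classifier label to a ThreatLevel.

    Assumption: an unrecognized label is treated as CRITICAL (fail closed).
    """
    try:
        return ThreatLevel[label.strip().upper()]
    except KeyError:
        return ThreatLevel.CRITICAL


def should_block(level: ThreatLevel) -> bool:
    """Return True when the message must not reach the main agent."""
    return level >= BLOCK_THRESHOLD
```

Using an ordered enum keeps the threshold comparison explicit, so changing the cutoff would be a one-line edit.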

Injection Scanner

The injection scanner detects prompt injection attempts using a two-tier approach:

Regex tier — Fast pattern matching for common injection patterns (e.g., “ignore previous instructions”, “system prompt override”)
LLM tier — Secondary LLM analysis for sophisticated injection attempts that bypass regex

Both tiers are applied to:

Incoming user messages (in AgentLoop)
Tool outputs (in ToolRegistry) to catch indirect injection via web content or file contents
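
A rough sketch of the two-tier flow, assuming the regex tier runs first and the LLM tier is consulted only when the regex tier finds nothing. The patterns, the scan function, and the llm_tier callback are hypothetical and stand in for whatever PocketPaw actually uses; the LLM tier is modeled as a plain callable so the example runs without a model.

```python
import re
from typing import Callable, Optional

# Hypothetical patterns based on the two example phrases above.
INJECTION_PATTERNS = [
    re.compile(r"ignore\s+(all\s+)?previous\s+instructions", re.IGNORECASE),
    re.compile(r"system\s+prompt\s+override", re.IGNORECASE),
]


def regex_tier(text: str) -> bool:
    """Fast check: True if any known injection pattern appears in the text."""
    return any(pattern.search(text) for pattern in INJECTION_PATTERNS)


def scan(text: str, llm_tier: Optional[Callable[[str], bool]] = None) -> bool:
    """Return True if the text looks like a prompt injection attempt.

    The slower LLM tier is consulted only when the regex tier finds nothing.
    """
    if regex_tier(text):
        return True
    if llm_tier is not None:
        return llm_tier(text)
    return False
```

The same scan function could be applied to both user messages and tool outputs, since it only depends on the text.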

Tool Policy

The tool policy system controls which tools are available:

Profiles: minimal (memory only), coding (fs + shell + memory), full (all tools)
Allow list: Explicitly permit specific tools or groups
Deny list: Explicitly block specific tools or groups (takes precedence)
Precedence: deny > allow > profile
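
The precedence rule can be illustrated with a short resolver. This is a sketch under stated assumptions: the PROFILES mapping and the is_tool_allowed function are hypothetical, and tool and group names are treated as flat identifiers, so group expansion is not modeled.

```python
from typing import AbstractSet, Dict, Optional

# Hypothetical profile table mirroring the list above.
# None means "every tool is permitted".
PROFILES: Dict[str, Optional[AbstractSet[str]]] = {
    "minimal": frozenset({"memory"}),
    "coding": frozenset({"fs", "shell", "memory"}),
    "full": None,
}


def is_tool_allowed(
    tool: str,
    profile: str,
    allow: AbstractSet[str] = frozenset(),
    deny: AbstractSet[str] = frozenset(),
) -> bool:
    """Resolve access with the precedence deny > allow > profile."""
    if tool in deny:      # deny always wins
        return False
    if tool in allow:     # explicit allow overrides the profile
        return True
    permitted = PROFILES[profile]
    return permitted is None or tool in permitted
```

For example, under the minimal profile the shell tool is unavailable unless it is on the allow list, and a deny entry blocks it even when it is also allowed.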

See Tool Policy for detailed documentation.

Audit Log

Every significant action is recorded in an append-only JSONL log at ~/.pocketclaw/audit.jsonl:

{"timestamp": "2024-01-15T10:30:00Z", "action": "tool_execute", "tool": "shell", "input": "ls -la", "result": "...", "session_id": "abc123"}
{"timestamp": "2024-01-15T10:30:05Z", "action": "message_blocked", "reason": "injection_detected", "content": "...", "session_id": "abc123"}

The audit log is:

Append-only — Previous entries cannot be modified
Machine-readable — JSONL format for easy parsing
Comprehensive — Records tool executions, blocked messages, security events
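
Because the log is JSONL, each line parses independently. The sketch below shows one way to append and filter entries in that format. The helper names append_audit_entry and read_audit_entries are hypothetical and not part of PocketPaw. Opening the file in append mode only adds lines at the end; it does not by itself prevent other processes from editing earlier entries.

```python
import json
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


def append_audit_entry(path: Path, action: str, session_id: str, **fields) -> dict:
    """Append one JSON object as a single line and return it."""
    entry = {
        "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "action": action,
        **fields,
        "session_id": session_id,
    }
    path.parent.mkdir(parents=True, exist_ok=True)
    # Mode "a" writes at the end of the file and never truncates it.
    with path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(entry) + "\n")
    return entry


def read_audit_entries(path: Path, action: Optional[str] = None) -> list:
    """Parse the log line by line, optionally keeping only one action type."""
    entries = []
    with path.open(encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            entry = json.loads(line)
            if action is None or entry.get("action") == action:
                entries.append(entry)
    return entries
```

Calling read_audit_entries(path, action="message_blocked") would return only the blocked-message records, which is a convenient starting point for reviewing security events.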

Security Audit CLI

Run automated security checks:

pocketpaw --security-audit        # Run all 7 checks
pocketpaw --security-audit --fix  # Auto-fix issues where possible

Checks include:

Config file permissions (should be 600)
API key exposure in environment
Audit log integrity
Token storage security
MCP server configuration
Tool policy validation
Guardian AI status
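
As an illustration of the first check, the following sketch tests whether a file has mode 600 and tightens it if not. It is not the code behind --security-audit; the function names are hypothetical, the config file location is left to the caller, and the example assumes a POSIX filesystem.

```python
import os
import stat
from pathlib import Path


def has_expected_permissions(path: Path, expected: int = 0o600) -> bool:
    """Return True if the file's permission bits equal the expected mode."""
    mode = stat.S_IMODE(path.stat().st_mode)
    return mode == expected


def fix_permissions(path: Path, expected: int = 0o600) -> None:
    """Restrict the file to owner read/write (what a --fix step might do)."""
    os.chmod(path, expected)
```

Mode 600 means only the file owner can read or write the file, which matters when the config holds API keys.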

Self-Audit Daemon

The self-audit daemon runs 12 continuous checks in the background:

Memory usage monitoring
Disk space checks
API key rotation reminders
Session cleanup
Audit log rotation
Seven further checks not listed here

Reports are saved as JSON in ~/.pocketclaw/audit/.

Dangerous Command Blocking

The Claude Agent SDK backend uses PreToolUse hooks to block dangerous shell commands before execution:

Commands that could destroy data (rm -rf /, mkfs, etc.)
Network scanning tools without explicit permission
Privilege escalation attempts
System modification commands
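
The kind of check such a hook might perform can be sketched as a pure function that inspects the command string. The patterns and the check_command name are hypothetical, and the sketch deliberately leaves out the Claude Agent SDK hook wiring. Pattern matching of this sort is easy to evade with quoting or indirection, so it is best viewed as one layer among several.

```python
import re
from typing import Optional

# Hypothetical (pattern, reason) pairs loosely following the categories above.
DANGEROUS_PATTERNS = [
    # rm with one or more flags aimed at the filesystem root
    (re.compile(r"\brm\s+(?:-[\w-]+\s+)+/(?:\s|\*|$)"), "destroys data"),
    # mkfs and variants such as mkfs.ext4
    (re.compile(r"\bmkfs(?:\.\w+)?\b"), "destroys data"),
    (re.compile(r"\bnmap\b"), "network scanning"),
    (re.compile(r"\bsudo\b"), "privilege escalation"),
]


def check_command(command: str) -> Optional[str]:
    """Return the reason a command is blocked, or None if no pattern matches."""
    for pattern, reason in DANGEROUS_PATTERNS:
        if pattern.search(command):
            return reason
    return None
```

A hook could call check_command before the shell tool runs and refuse execution whenever the result is not None.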

Warning

PocketPaw’s security features are designed for self-hosted, single-user deployments. If you expose PocketPaw to multiple users, add your own authentication and authorization layers in front of it.

Last updated: February 12, 2026
