Three-Layer AI Funnel
Panguard AI uses a three-layer cascading architecture to analyze security events. Around 90% of events are handled by the rule engine in under 1 millisecond; only the most complex 3% ever reach cloud AI.
Why Three Layers?
Sending every security event to an AI model creates three problems:
- Too slow — AI inference takes seconds; attacks do not wait.
- Too expensive — Thousands of events per machine per day means runaway token costs.
- Unreliable — If the API goes down, protection stops.
Architecture Overview
Layer Comparison
| Property | Layer 1: Rules | Layer 2: Local AI | Layer 3: Cloud AI |
|---|---|---|---|
| Event share | ~90% | ~7% | ~3% |
| Latency | < 1 ms | < 5 s | < 30 s |
| Cost per event | $0 | $0 | ~$0.01 |
| Requires network | No | No | Yes |
| Technology | ATR Rules | Ollama (llama3) | Claude / OpenAI |
| Best for | Known attack patterns | Behavioral anomalies | Novel, complex threats |
Layer 1 — Rule Engine (90%)
Handles all known attack patterns with zero latency and zero cost.
ATR Rules
ATR (Agent Threat Rules) is the open standard for AI agent threat detection. Panguard Guard ships with 61 ATR rules covering common AI agent attack patterns:
- Pattern matching with regex support
- Context-aware detection (tool responses, skill manifests, agent actions)
- Multi-layer detection: regex, content fingerprinting, LLM-as-judge
- Severity levels: critical, high, medium, low
- MITRE ATT&CK mapping for AI agent threats
ATR Rule Example
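As a sketch, a rule of this shape pairs a regex with context, severity, and MITRE metadata. The schema below (the `AtrRule` type, its field names, the `matches` helper, and the technique ID) is hypothetical and for illustration only; it is not the normative ATR file format.

```typescript
// Hypothetical ATR-style rule expressed as a TypeScript object.
// Field names are illustrative, not the normative ATR schema.
interface AtrRule {
  id: string;
  severity: "critical" | "high" | "medium" | "low";
  context: "tool_response" | "skill_manifest" | "agent_action";
  pattern: RegExp; // the regex layer of detection
  mitre: string;   // MITRE technique mapping
}

const promptInjectionRule: AtrRule = {
  id: "ATR-0001",
  severity: "high",
  context: "tool_response",
  // Flags instructions smuggled into a tool response.
  pattern: /ignore (all |any )?previous instructions/i,
  mitre: "AML.T0051", // illustrative technique ID
};

// Returns true when the event's context matches the rule and
// its text triggers the rule's pattern.
function matches(rule: AtrRule, event: { context: string; text: string }): boolean {
  return event.context === rule.context && rule.pattern.test(event.text);
}
```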
Layer 2 — Local AI (7%)
When an event does not match any known rule but exhibits suspicious behavior, it is forwarded to a local AI model for analysis.
- Runs locally via Ollama — no network required
- Zero API cost
- Inference latency approximately 3-5 seconds
- Default model: `llama3`
Environment-aware routing: On servers (VPS, cloud instances), events flow through all three
layers. On desktops and laptops, Layer 2 is skipped to avoid competing for user resources.
Unmatched events go directly from Layer 1 to Layer 3.
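The environment-aware routing above can be sketched as a small helper. The `nextLayer` function and the `isServer` flag are illustrative names, not the shipped API:

```typescript
// Sketch of environment-aware layer routing: where does an event go
// after the current layer declines to give a verdict?
type Layer = "rules" | "localAI" | "cloudAI";

function nextLayer(current: Layer, isServer: boolean): Layer | null {
  if (current === "rules") {
    // Desktops and laptops skip Layer 2 so the local model does not
    // compete with the user's workload; servers use all three layers.
    return isServer ? "localAI" : "cloudAI";
  }
  if (current === "localAI") return "cloudAI";
  return null; // cloud AI is the last layer
}
```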
Layer 3 — Cloud AI (3%)
The most complex unknown threats are analyzed by cloud AI with full dynamic reasoning.
- Complete context analysis
- Cross-event correlation
- Attack chain reasoning with MITRE ATT&CK classification
- Remediation recommendation generation
Even if cloud AI is unavailable (network outage, token exhaustion), the Layer 1 rule engine
continues operating. Protection never stops.
Graceful Degradation
A critical design principle of the three-layer architecture: if any layer becomes unavailable, the next layer down takes over automatically.
| Scenario | Degradation Behavior |
|---|---|
| Cloud AI unavailable | Layer 2 (Local AI) takes over |
| Ollama not installed | Layer 1 (Rule Engine) takes over |
| Rule files corrupted | Built-in default rules activate |
Confidence Weighting by Available Sources
The system dynamically adjusts how much weight each evidence source carries based on what is available:
| Sources Available | Rules/Intel | Baseline | AI | eBPF |
|---|---|---|---|---|
| Rules only | 0.60 | 0.40 | — | — |
| Rules + AI | 0.40 | 0.30 | 0.30 | — |
| Rules + eBPF | 0.40 | 0.35 | — | 0.25 |
| Rules + AI + eBPF | 0.30 | 0.20 | 0.30 | 0.20 |
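Applied as a weighted sum, the table can be sketched as follows. Only the weight values come from the table above; the `combinedConfidence` helper and its types are illustrative:

```typescript
// Sketch of source-dependent confidence weighting. Each available
// source contributes its score scaled by the weight for that
// source combination; weights per row sum to 1.0.
interface Scores { rules: number; baseline: number; ai?: number; ebpf?: number }

const WEIGHTS: Record<string, Scores> = {
  "rules":          { rules: 0.60, baseline: 0.40 },
  "rules+ai":       { rules: 0.40, baseline: 0.30, ai: 0.30 },
  "rules+ebpf":     { rules: 0.40, baseline: 0.35, ebpf: 0.25 },
  "rules+ai+ebpf":  { rules: 0.30, baseline: 0.20, ai: 0.30, ebpf: 0.20 },
};

function combinedConfidence(sources: string, s: Required<Scores>): number {
  const w = WEIGHTS[sources];
  let total = w.rules * s.rules + w.baseline * s.baseline;
  if (w.ai !== undefined) total += w.ai * s.ai;     // skipped when AI absent
  if (w.ebpf !== undefined) total += w.ebpf * s.ebpf; // skipped when eBPF absent
  return total;
}
```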
FunnelRouter
The `FunnelRouter` component in `@panguard-ai/core` implements the Layer 2 to Layer 3 fallback logic:
1. Evaluate Confidence: If Ollama returns a confident verdict, use it. If Ollama is unavailable or returns low confidence, escalate.
2. Fall Back to Cloud AI: Send the event to Claude or OpenAI for deep reasoning and MITRE classification.
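The two steps above can be sketched as a minimal router, assuming a hypothetical `LlmAdapter` interface and confidence threshold; the real `@panguard-ai/core` types are not shown here:

```typescript
// Minimal sketch of the Layer 2 -> Layer 3 fallback logic.
interface Verdict { confidence: number; label: string }
interface LlmAdapter { analyze(event: string): Promise<Verdict | null> }

class FunnelRouter implements LlmAdapter {
  constructor(
    private local: LlmAdapter,  // e.g. Ollama / llama3
    private cloud: LlmAdapter,  // e.g. Claude or OpenAI
    private threshold = 0.8,    // assumed confidence cutoff
  ) {}

  async analyze(event: string): Promise<Verdict | null> {
    try {
      const verdict = await this.local.analyze(event);
      // Confident local verdict: stop here, zero API cost.
      if (verdict && verdict.confidence >= this.threshold) return verdict;
    } catch {
      // Ollama unreachable: fall through to cloud.
    }
    try {
      // Escalate for deep reasoning and MITRE classification.
      return await this.cloud.analyze(event);
    } catch {
      return null; // both layers down; Layer 1 rules keep running upstream
    }
  }
}
```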
At startup, the router determines which providers are available:
- Check `~/.panguard/llm.enc` (encrypted local config, AES-256-GCM)
- Check environment variables: `ANTHROPIC_API_KEY`, `OPENAI_API_KEY`
- Probe local Ollama at `http://localhost:11434`
- Build the appropriate adapter: FunnelRouter (both available), single provider, or null
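The selection at the end of that sequence can be sketched as a pure function; the boolean inputs are hypothetical stand-ins for the real config, key, and Ollama probes:

```typescript
// Sketch of adapter selection from the probe results above.
type Adapter = "funnel-router" | "single-provider" | "null";

function selectAdapter(hasCloudKey: boolean, ollamaUp: boolean): Adapter {
  if (hasCloudKey && ollamaUp) return "funnel-router"; // both available
  if (hasCloudKey || ollamaUp) return "single-provider"; // exactly one available
  return "null"; // neither: Layer 1 rules only
}
```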
Related
Learning Mode
How Guard builds a behavioral baseline during the 7-day learning period.
Real-Time Protection
Set up Guard for continuous monitoring and automated response.