Skill Auditor

Panguard Skill Auditor scans third-party MCP skills for prompt injection, tool poisoning, hidden Unicode, encoded payloads, and other threats — before they reach your agents. It is powered by @panguard-ai/scan-core, a unified scanning engine shared between the CLI (panguard audit skill), the Website scanner, and Guard’s skill watcher.

How It Works

panguard audit skill /path/to/skill-directory

The auditor analyzes the skill’s SKILL.md (or README.md as fallback) and produces a risk score (0-100) with detailed findings.

Scanning Architecture

All scanning — whether invoked from CLI, Website, or Guard — passes through the same scanContent() function in @panguard-ai/scan-core. This ensures identical detection results regardless of the entry point. The scan composes six detection layers in sequence:

Layer	What It Does
Manifest parsing	Extracts frontmatter metadata (name, description, allowed-tools, version)
Context signal detection	Identifies risk boosters and reducers to adjust the risk multiplier
ATR rule matching	Matches content against community ATR rules (two-pass: raw + stripped)
Instruction pattern matching	Detects prompt injection and tool poisoning via 11 regex patterns
Secret detection	Finds hardcoded API keys, tokens, and credentials
Risk scoring	Calculates final 0-100 score using findings weighted by the context multiplier

Additional checks reported in results: manifest structure validation and content size check.
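
As a rough illustration of how six layers can be composed in sequence, here is a minimal sketch. It is not the scan-core implementation: every name (`scanContentSketch`, `ScanState`, `Layer`), the example patterns, the severity weights, and the +0.5 booster step are assumptions made for the example. The ATR layer is a placeholder because no rules are loaded.

```typescript
// Hypothetical types for illustration only; not the scan-core API.
type Severity = "low" | "medium" | "high";

interface Finding {
  layer: string;
  severity: Severity;
  message: string;
}

interface ScanState {
  content: string;
  manifest: Record<string, string>;
  multiplier: number;
  findings: Finding[];
  score: number;
}

type Layer = (state: ScanState) => ScanState;

function addFinding(state: ScanState, finding: Finding): ScanState {
  return { ...state, findings: [...state.findings, finding] };
}

// 1. Manifest parsing: read simple "key: value" pairs from frontmatter.
const parseManifest: Layer = (s) => {
  const block = /^---\n([\s\S]*?)\n---/.exec(s.content);
  const manifest: Record<string, string> = {};
  for (const line of (block?.[1] ?? "").split("\n")) {
    const idx = line.indexOf(":");
    if (idx > 0) manifest[line.slice(0, idx).trim()] = line.slice(idx + 1).trim();
  }
  return { ...s, manifest };
};

// 2. Context signal detection: one example booster (assumed step of +0.5).
const detectContextSignals: Layer = (s) =>
  /do not tell the user/i.test(s.content) ? { ...s, multiplier: s.multiplier + 0.5 } : s;

// 3. ATR rule matching: placeholder, no rules are loaded in this sketch.
const matchAtrRules: Layer = (s) => s;

// 4. Instruction pattern matching: a single example pattern.
const matchInstructionPatterns: Layer = (s) =>
  /ignore (all )?previous instructions/i.test(s.content)
    ? addFinding(s, { layer: "instruction", severity: "high", message: "prompt injection phrase" })
    : s;

// 5. Secret detection: matches the AWS access key ID shape as an example.
const detectSecrets: Layer = (s) =>
  /AKIA[0-9A-Z]{16}/.test(s.content)
    ? addFinding(s, { layer: "secret", severity: "high", message: "possible AWS access key ID" })
    : s;

// 6. Risk scoring: assumed severity weights, scaled by the context multiplier.
const SEVERITY_WEIGHTS: Record<Severity, number> = { low: 5, medium: 15, high: 40 };

const scoreRisk: Layer = (s) => {
  const base = s.findings.reduce((sum, f) => sum + SEVERITY_WEIGHTS[f.severity], 0);
  return { ...s, score: Math.min(100, Math.round(base * s.multiplier)) };
};

const LAYERS: Layer[] = [
  parseManifest,
  detectContextSignals,
  matchAtrRules,
  matchInstructionPatterns,
  detectSecrets,
  scoreRisk,
];

function scanContentSketch(content: string): ScanState {
  const initial: ScanState = { content, manifest: {}, multiplier: 1, findings: [], score: 0 };
  return LAYERS.reduce((state, layer) => layer(state), initial);
}
```

Because each entry point would call the same function, a given input yields the same findings and score wherever the scan is started.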

ATR Integration

When ATR rules are available (loaded from Threat Cloud or a local rules directory), the scanner evaluates skill content against all compiled rules. Currently, the ATR corpus contains 747 rules with 920+ detection patterns covering AI agent-specific threats. ATR matching runs a two-pass scan:

Raw pass — Match against the original content
Stripped pass — Match against content with Markdown noise removed (catches obfuscation attempts)

The scanner reports both the number of ATR rules evaluated and the number of patterns matched in the scan result.
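
The two-pass idea can be sketched as follows. This is an illustration under stated assumptions, not the scanner's code: the `SketchRule` shape is hypothetical, and "Markdown noise" is assumed here to mean emphasis markers, backticks, and zero-width characters. The real stripping logic may differ.

```typescript
// Hypothetical rule shape; the actual ATR rule format is not shown in this page.
interface SketchRule {
  id: string;
  pattern: RegExp;
}

type Pass = "raw" | "stripped";

// Assumed definition of "Markdown noise" for this sketch.
function stripMarkdownNoise(content: string): string {
  return content
    .replace(/[\u200B-\u200D\uFEFF]/g, "") // zero-width characters
    .replace(/[*_`~]/g, "")                // emphasis and code markers
    .replace(/[ \t]+/g, " ");              // collapse runs of spaces and tabs
}

function matches(pattern: RegExp, text: string): boolean {
  pattern.lastIndex = 0; // guard against stateful /g or /y patterns
  return pattern.test(text);
}

// Returns, for each matching rule id, the first pass that caught it.
function matchTwoPass(content: string, rules: SketchRule[]): Map<string, Pass> {
  const stripped = stripMarkdownNoise(content);
  const hits = new Map<string, Pass>();
  for (const rule of rules) {
    if (matches(rule.pattern, content)) {
      hits.set(rule.id, "raw");
    } else if (matches(rule.pattern, stripped)) {
      hits.set(rule.id, "stripped");
    }
  }
  return hits;
}
```

For example, a phrase split by bold markers such as `ig**nore** previous instructions` would slip past a raw match on `ignore previous instructions` but is caught once the markers are removed.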

Context Signals

Context signals are pre-computed before ATR matching and influence how findings are scored. They fall into two categories:

Boosters (increase risk multiplier):

<IMPORTANT> hidden instruction blocks
Concealment language (“do not tell the user”)
Exfiltration URL patterns (workers.dev, ngrok.io, webhook.site, etc.)
Consent bypass language (“without asking”, “silently send”)
Credential file access combined with network calls
Description-behavior mismatch (benign description + dangerous instructions)

Reducers (decrease risk multiplier):

Skill declares shell access in frontmatter (expected for dev tools)
Description identifies as dev/CLI/QA tool
Well-structured frontmatter with name, description, and version/license
Dangerous patterns appear only inside code blocks (documentation context)

The multiplier is clamped to a range of 0.3x to 2.5x and is applied to the final risk score. This means a legitimate dev tool that declares its capabilities upfront receives a lower risk score, while a skill that tries to hide its intentions receives a higher one.
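
A minimal sketch of the clamping step, assuming signals adjust a baseline of 1.0 additively. Only the 0.3x to 2.5x bounds and the 0-100 score range come from this page. The additive model, the per-signal weights, and all names are hypothetical.

```typescript
// Hypothetical signal shape; weights are illustrative, not Panguard's values.
interface ContextSignal {
  name: string;
  kind: "booster" | "reducer";
  weight: number;
}

const MIN_MULTIPLIER = 0.3; // lower bound stated in the docs
const MAX_MULTIPLIER = 2.5; // upper bound stated in the docs

// Assumption: signals shift a 1.0 baseline up or down before clamping.
function contextMultiplier(signals: ContextSignal[]): number {
  let multiplier = 1.0;
  for (const signal of signals) {
    multiplier += signal.kind === "booster" ? signal.weight : -signal.weight;
  }
  return Math.min(MAX_MULTIPLIER, Math.max(MIN_MULTIPLIER, multiplier));
}

// Scale a base score and keep the result inside the 0-100 range.
function applyMultiplier(baseScore: number, multiplier: number): number {
  return Math.min(100, Math.max(0, Math.round(baseScore * multiplier)));
}
```

Under these assumptions, a skill that triggers many boosters cannot exceed 2.5x, and a well-declared dev tool cannot drop below 0.3x, so no set of signals can zero out real findings.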

The Flywheel

Every skill audit contributes to the community defense:

Scan — Audit a skill locally for threats using scan-core
Propose — High-severity findings generate ATR proposals with a pattern hash
Confirm — Other scanners encountering the same pattern hash confirm the proposal
Promote — At 3+ confirmations, proposals auto-promote to confirmed ATR rules
Distribute — Confirmed rules are served to all scanners via Threat Cloud
Strengthen — New rules improve the next audit, closing the loop
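
The Propose, Confirm, and Promote steps can be sketched as a small in-memory ledger. This is illustrative only: `ProposalLedger` and its methods are hypothetical names, and the sketch assumes that confirmations are counted per distinct scanner and that the original proposer does not count as a confirmation. Only the threshold of 3 comes from the list above.

```typescript
const PROMOTION_THRESHOLD = 3; // from the docs: 3+ confirmations

type ProposalStatus = "proposed" | "confirmed";

class ProposalLedger {
  // pattern hash -> set of scanner ids that confirmed it
  private confirmations = new Map<string, Set<string>>();

  // Register a proposal; proposing an existing hash is a no-op.
  propose(patternHash: string): void {
    if (!this.confirmations.has(patternHash)) {
      this.confirmations.set(patternHash, new Set<string>());
    }
  }

  // Record a confirmation; repeat confirmations from one scanner count once.
  confirm(patternHash: string, scannerId: string): ProposalStatus {
    const seen = this.confirmations.get(patternHash);
    if (!seen) {
      throw new Error(`unknown proposal: ${patternHash}`);
    }
    seen.add(scannerId);
    return this.status(patternHash);
  }

  status(patternHash: string): ProposalStatus {
    const count = this.confirmations.get(patternHash)?.size ?? 0;
    return count >= PROMOTION_THRESHOLD ? "confirmed" : "proposed";
  }
}
```

Counting distinct scanners rather than raw confirmation events is a design choice in this sketch. It keeps one noisy client from promoting its own proposal.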

The pattern hash (scan:{skillName}:{findingSummary}, SHA-256 truncated to 16 hex chars) ensures CLI, Website, and Guard all produce identical identifiers for the same threat pattern.
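
A sketch of how such an identifier could be derived with Node's built-in crypto module. The input template and the 16-character length come from the text above. The function name, the UTF-8 encoding, and taking the first 16 hex characters (rather than some other slice) are assumptions.

```typescript
import { createHash } from "node:crypto";

// Hypothetical helper: builds the "scan:{skillName}:{findingSummary}" string,
// hashes it with SHA-256, and keeps the first 16 hex characters (assumed prefix).
function patternHash(skillName: string, findingSummary: string): string {
  const input = `scan:${skillName}:${findingSummary}`;
  return createHash("sha256").update(input, "utf8").digest("hex").slice(0, 16);
}
```

Because the hash depends only on the two input strings, any entry point that builds the same skill name and finding summary arrives at the same identifier without coordination.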

Quick Start

Install Panguard and run your first skill audit.

Scan Overview

Deep dive into all scanning capabilities.

Risk Scoring

Understand how risk scores are calculated.

Threat Cloud

How audit results feed into collective defense.

CLI Reference

Full panguard audit skill command reference.
