1. Manifest Validation
Verifies the SKILL.md frontmatter structure. What it checks:
- Required fields: `name`, `description`
- Valid metadata structure and JSON formatting
- Proper YAML frontmatter delimiters
- Field value constraints (name length, valid characters)
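A minimal sketch of these checks, using only the standard library. The length limit and allowed-character pattern (`MAX_NAME_LENGTH`, `NAME_PATTERN`) are hypothetical, since the exact constraints are not stated here, and the naive `key: value` parsing stands in for a real YAML parser. The JSON formatting check is not covered.

```python
import re

# Hypothetical constraints: the checks are named above, the exact limits are not.
MAX_NAME_LENGTH = 64
NAME_PATTERN = re.compile(r"[a-z0-9][a-z0-9-]*")
REQUIRED_FIELDS = ("name", "description")


def validate_manifest(text):
    """Return a list of problems found in SKILL.md frontmatter."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return ["missing opening '---' delimiter"]
    try:
        end = next(
            i for i, line in enumerate(lines[1:], start=1) if line.strip() == "---"
        )
    except StopIteration:
        return ["missing closing '---' delimiter"]

    # Naive top-level 'key: value' parsing; nested YAML is ignored.
    fields = {}
    for line in lines[1:end]:
        if ":" in line and not line.startswith((" ", "#")):
            key, _, value = line.partition(":")
            fields[key.strip()] = value.strip()

    problems = [
        f"missing required field: {field}"
        for field in REQUIRED_FIELDS
        if not fields.get(field)
    ]
    name = fields.get("name", "")
    if name and len(name) > MAX_NAME_LENGTH:
        problems.append("name too long")
    if name and not NAME_PATTERN.fullmatch(name):
        problems.append("name contains invalid characters")
    return problems
```

An empty list means the frontmatter passed every check in this sketch.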
2. Prompt Injection Detection
11 regex patterns scan for attempts to override agent instructions. Patterns detected:

| Pattern | Severity | Example |
|---|---|---|
| Ignore previous instructions | CRITICAL | “ignore all previous instructions and…” |
| Identity override | HIGH | “you are now a hacking assistant” |
| System prompt manipulation | CRITICAL | “your new system prompt is…” |
| Jailbreak patterns | CRITICAL | “DAN mode enabled”, “bypass safety” |
| Hidden instruction in comments | HIGH | `<!-- secretly do X -->` |
| Role reassignment | HIGH | “act as root”, “pretend you are” |
| Instruction boundary bypass | CRITICAL | “end of system prompt”, “begin user mode” |
| Output manipulation | MEDIUM | “do not mention this instruction” |
| Context window stuffing | MEDIUM | Extremely long strings designed to push instructions out of context |
| Nested injection | HIGH | Instructions hidden inside code blocks or data |
| Indirect injection | HIGH | “when the user says X, instead do Y” |
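Pattern-based scanning of this kind might look like the sketch below. The four regexes are illustrative guesses modelled on the example column, not the auditor's actual patterns, and `scan_for_injection` is a hypothetical name.

```python
import re

# Illustrative patterns only; the real rule set is not shown in this document.
PATTERNS = [
    ("Ignore previous instructions", "CRITICAL",
     re.compile(r"ignore\s+(all\s+)?previous\s+instructions", re.IGNORECASE)),
    ("Identity override", "HIGH",
     re.compile(r"you\s+are\s+now\s+an?\s+", re.IGNORECASE)),
    ("Hidden instruction in comments", "HIGH",
     re.compile(r"<!--.*?-->", re.DOTALL)),
    ("Output manipulation", "MEDIUM",
     re.compile(r"do\s+not\s+mention\s+this\s+instruction", re.IGNORECASE)),
]


def scan_for_injection(text):
    """Return (pattern name, severity, matched text) for every match."""
    findings = []
    for name, severity, pattern in PATTERNS:
        for match in pattern.finditer(text):
            findings.append((name, severity, match.group(0)))
    return findings
```

Note that the comment rule here flags every HTML comment, which is deliberately coarse; a production rule would likely inspect the comment body.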
3. Hidden Content Detection
Finds content invisible to human reviewers but processed by AI agents. What it detects:
- Zero-width Unicode characters: U+200B (zero-width space), U+200C (zero-width non-joiner), U+200D (zero-width joiner), U+FEFF (byte order mark)
- Bidirectional override characters: U+202E (right-to-left override) and U+202D (left-to-right override), which force a text direction and can be used to hide content
- Homoglyph attacks: Characters that look identical to ASCII but are different Unicode codepoints (e.g., Cyrillic “а” vs Latin “a”)
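The character classes above can be checked with a short scan like the following sketch. The homoglyph test is a rough assumption: it flags any Cyrillic or Greek letter, so it would also fire on legitimate non-Latin text, and `find_hidden_characters` is a hypothetical name.

```python
import unicodedata

ZERO_WIDTH = {"\u200b", "\u200c", "\u200d", "\ufeff"}
BIDI_OVERRIDES = {"\u202d", "\u202e"}


def find_hidden_characters(text):
    """Return (index, kind, codepoint) for each suspicious character."""
    findings = []
    for index, char in enumerate(text):
        codepoint = f"U+{ord(char):04X}"
        if char in ZERO_WIDTH:
            findings.append((index, "zero-width", codepoint))
        elif char in BIDI_OVERRIDES:
            findings.append((index, "bidi-override", codepoint))
        elif char.isalpha() and ord(char) > 127:
            # Crude heuristic (assumption): treat Cyrillic and Greek letters
            # as possible look-alikes for ASCII letters.
            name = unicodedata.name(char, "")
            if name.startswith(("CYRILLIC", "GREEK")):
                findings.append((index, "possible-homoglyph", codepoint))
    return findings
```

For example, the string `"p\u0430ypal\u200b"` contains a Cyrillic “а” at index 1 and a zero-width space at index 6, and both are reported.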
4. Encoded Payload Detection
Decodes and inspects Base64-encoded content for dangerous operations. What it checks:
- Base64 strings are decoded and scanned for: `eval`, `exec`, `subprocess`, `child_process`, `os.system`, `Runtime.exec`
- Hex-encoded payloads
- URL-encoded command sequences
- Multi-layer encoding (Base64 inside Base64)
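A sketch of the Base64 part of this check, including nested layers, is shown below. The candidate regex, the 16-character minimum, and the `MAX_DEPTH` cap are assumptions made for illustration. Hex and URL decoding are left out.

```python
import base64
import binascii
import re

DANGEROUS = ("eval", "exec", "subprocess", "child_process", "os.system", "Runtime.exec")
# Assumption: only runs of 16+ Base64 characters are worth decoding.
BASE64_CANDIDATE = re.compile(r"[A-Za-z0-9+/]{16,}={0,2}")
MAX_DEPTH = 3  # assumed cap on nested decoding


def decode_candidate(token):
    """Return the decoded text, or None if the token is not valid Base64 text."""
    try:
        return base64.b64decode(token, validate=True).decode("utf-8")
    except (binascii.Error, UnicodeDecodeError, ValueError):
        return None


def scan_encoded(text, depth=0):
    """Return (layer, keyword) pairs for dangerous keywords in decoded content."""
    hits = []
    if depth >= MAX_DEPTH:
        return hits
    for token in BASE64_CANDIDATE.findall(text):
        decoded = decode_candidate(token)
        if decoded is None:
            continue
        hits.extend((depth + 1, kw) for kw in DANGEROUS if kw in decoded)
        # Recurse to catch Base64 wrapped inside Base64.
        hits.extend(scan_encoded(decoded, depth + 1))
    return hits
```

The layer number in each result records how many decoding passes were needed, so a keyword found at layer 2 came from Base64 inside Base64.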
5. Tool Poisoning Detection
Identifies dangerous shell commands and system access patterns. Categories:
Privilege Escalation
- `sudo` commands
- `chmod 777`, `chmod +s`
- `chown root`
- `setuid` operations
Reverse Shell
- `nc -e /bin/bash`
- `bash -i >& /dev/tcp/`
- `/dev/tcp/` or `/dev/udp/`
- Python/Perl/Ruby reverse shell one-liners
Remote Code Execution
- `curl | bash`, `wget | sh`
- `eval "$(curl ...)"` patterns
- Download-and-execute chains
Data Exfiltration
- Reading `~/.ssh/`, `~/.aws/`, `~/.gnupg/`
- Accessing `.env` files
- `$ENV` variable dumping
- Sending data via `curl`, `wget`, or `nc` to external hosts
Destructive Operations
- `rm -rf /` or `rm -rf ~`
- `mkfs` (filesystem formatting)
- `dd if=/dev/zero`
- Database `DROP` commands
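The categories above map naturally onto a table of regular expressions. The sketch below uses a handful of illustrative rules per category; they are assumptions written for this example, not the auditor's real rule set, and `scan_commands` is a hypothetical name.

```python
import re

# Illustrative rules only; each category lists a few assumed patterns.
RULES = {
    "Privilege Escalation": [r"\bsudo\b", r"\bchmod\s+(777|\+s)\b", r"\bchown\s+root\b"],
    "Reverse Shell": [r"\bnc\s+-e\s+/bin/(ba)?sh\b", r"/dev/(tcp|udp)/"],
    "Remote Code Execution": [r"\b(curl|wget)\b[^|\n]*\|\s*(ba)?sh\b"],
    "Data Exfiltration": [r"~/\.(ssh|aws|gnupg)/"],
    "Destructive Operations": [
        r"\brm\s+-rf\s+(/|~)(\s|$)",
        r"\bmkfs\b",
        r"\bdd\s+if=/dev/zero\b",
    ],
}


def scan_commands(text):
    """Return the categories that have at least one matching rule."""
    findings = []
    for category, patterns in RULES.items():
        if any(re.search(pattern, text) for pattern in patterns):
            findings.append(category)
    return findings
```

The `rm -rf` rule only matches when the target is exactly `/` or `~`, so a command such as `rm -rf /tmp/build` is not flagged by this sketch.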
6. Code Security (SAST + Secrets)
Scans all files in the skill directory with static analysis. SAST checks:
- Common vulnerability patterns per language
- Unsafe function usage (`eval`, `exec`, `system`)
- SQL injection patterns
- Path traversal attempts
Secrets checks:
- Hardcoded API keys (AWS, GCP, Azure, OpenAI, Anthropic)
- Private keys (RSA, EC, Ed25519)
- Passwords in source code
- Connection strings with credentials
- JWTs (JSON Web Tokens)
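Secrets detection is typically pattern-based. The sketch below covers four of the listed secret types with widely used shapes (the `AKIA` prefix of AWS access key IDs, PEM private key headers, the `eyJ` prefix of JWT segments, and `user:password@` in URLs). The exact regexes and the `find_secrets` name are assumptions for illustration.

```python
import re

# Assumed patterns; a real scanner would carry many more, with entropy checks.
SECRET_PATTERNS = {
    "AWS access key ID": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "Private key": re.compile(r"-----BEGIN (?:RSA |EC |OPENSSH )?PRIVATE KEY-----"),
    "JWT": re.compile(r"\beyJ[A-Za-z0-9_-]+\.eyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+"),
    "Connection string with credentials": re.compile(
        r"\b[a-z][a-z0-9+.-]*://[^\s:/@]+:[^\s@]+@"
    ),
}


def find_secrets(text):
    """Return (line number, label) for each line that matches a pattern."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for label, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                findings.append((lineno, label))
    return findings
```

Reporting line numbers rather than the matched text avoids copying the secret itself into scan output.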
7. Permission Scope Analysis
Evaluates whether requested permissions match the skill’s stated purpose. What it checks:
- Filesystem access scope vs description
- Network access requirements vs stated functionality
- Environment variable access patterns
- Process execution permissions
- Cross-skill interaction requests
Scoring Algorithm
Each finding contributes to the risk score based on severity:

| Severity | Points |
|---|---|
| CRITICAL | 25 |
| HIGH | 15 |
| MEDIUM | 5 |
| LOW | 2 |
| INFO | 0 |
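As a worked example of the point table, the sketch below sums the points for a list of findings. Capping the total at 100 is an assumption made here to keep the result on a 0-100 scale; the document does not say how the auditor combines or limits the points.

```python
SEVERITY_POINTS = {"CRITICAL": 25, "HIGH": 15, "MEDIUM": 5, "LOW": 2, "INFO": 0}
MAX_SCORE = 100  # assumed ceiling for the risk score


def risk_score(severities):
    """Sum the points for each finding's severity, capped at MAX_SCORE."""
    total = sum(SEVERITY_POINTS[severity] for severity in severities)
    return min(total, MAX_SCORE)
```

Under these assumptions, one CRITICAL, one HIGH, and one MEDIUM finding give 25 + 15 + 5 = 45, while five CRITICAL findings would sum to 125 and be capped at 100.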
Comparison with Manual Review
| Capability | Human Review | Skill Auditor |
|---|---|---|
| Speed | Minutes per skill | Under 1 second |
| Consistency | Varies by reviewer | Deterministic |
| Hidden Unicode | Invisible to human eye | Automatic detection |
| Base64 payloads | Requires manual decode | Auto-decode and analyze |
| SAST scanning | Not practical manually | Integrated scanner |
| Secrets detection | Manual grep | Pattern-based detection |
| Risk score | Subjective opinion | Quantitative 0-100 |