Researchers at Palo Alto Networks Unit 42 have published findings on a largely unaddressed security gap in the AI agent ecosystem: third-party skills installed into large language model (LLM) agents receive privileged access to credentials, files, and shell commands, yet no automated tooling has existed to verify whether a skill actually does what it claims.
The Skill Supply Chain Problem
LLM agents are increasingly extended through skills, small packages that bundle executable code with a YAML manifest and a natural-language instruction file. Once installed, a skill operates inside the agent’s privileged runtime context, able to read environment variables, call external services, write files, and execute shell commands on behalf of an organization. Public registries now host tens of thousands of these packages with no automated audit process gating publication or installation.
Unit 42 draws a direct parallel to earlier extensibility ecosystems. Package managers, mobile app stores, and browser extension marketplaces all reached this same open, unaudited state before attackers exploited the openness and automated review systems were eventually built. The agent-skill ecosystem, the researchers argue, has not yet made that transition.
Behavioral Integrity Verification
To address this, the team developed Behavioral Integrity Verification (BIV), an audit primitive that compares what a skill claims to do against what it actually does across three surfaces: metadata, executable code, and natural-language instructions.
BIV uses a fixed taxonomy of 29 capabilities organized into seven families, including network access, file system operations, process execution, environment variable access, encoding, credential handling, and instruction-level threats. Two parallel analysis tracks populate the taxonomy:
- Declared track: Deterministic parsers handle structured fields such as YAML frontmatter, while an LLM reads natural-language descriptions to extract claimed capabilities, anchoring each claim to a quoted source span.
- Actual track: Static analyzers using abstract syntax tree (AST)-level taint analysis, regex, and pattern matching cover Python, JavaScript, and shell code. A separate LLM pass surfaces prompt-injection and instruction-override patterns that traditional parsers miss.
A skill passes when its actual capability set falls within its declared set. It fails when it performs actions it never disclosed, the operationally dangerous direction, or declares permissions it never exercises, which is almost always benign template residue.
Registry-Scale Findings
Unit 42 crawled the OpenClaw agent-skill registry in early 2026, analyzing all 49,943 listed skills. BIV surfaced 250,706 behavioral deviations, with 80 percent of skills (39,933) showing at least one mismatch between declared and actual behavior.
The majority of deviations reflect sloppy documentation rather than deliberate malice. However, a clustering pass over deviation explanations identified four compound threat categories representing multi-stage attack chains:
- Exfiltration chains: File read followed by base64 encoding followed by network transmission
- Remote code execution (RCE) chains: Download, write, then execute sequences
- Code obfuscation: Encoding chains leading to dynamic evaluation
- Data lineage violations: File read to file write patterns, mostly benign data-pipeline boilerplate
The researchers emphasize that the threat lies in the chain of individually innocuous-looking capabilities, not in any single action in isolation.
Recommendations
Unit 42 recommends that security teams running LLM agents in production inventory all installed third-party skills and require a behavioral integrity check prior to installation rather than after a potential compromise. The researchers note that Palo Alto Networks’ Prisma AIRS platform and the Unit 42 AI Security Assessment service address this class of risk.
