Agent Skills Privacy

Evaluation & Ecosystem Measurement

Dataset Overview

170,226: Total skills collected from SkillsMP.
17,022: Skills randomly sampled for in-depth analysis (10%).
520: Affected skills confirmed to contain credential leakage.
1,708: Total security issues identified.
437: Vulnerable skills (Developer Negligence).
83: Malicious skills (Deliberate Exploits).
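
The percentages quoted on this page follow from these counts. The short Python check below uses only the figures listed above; the helper name `pct` is ours, and the 16.0% malicious share is derived here rather than stated on the page.

```python
# Counts copied from the Dataset Overview above.
total_skills = 170_226
sampled = 17_022
affected = 520
vulnerable = 437
malicious = 83


def pct(part, whole):
    """Percentage of `part` in `whole`, rounded to one decimal place."""
    return round(100 * part / whole, 1)


# The two categories partition the affected skills.
assert vulnerable + malicious == affected

sample_rate = pct(sampled, total_skills)      # 10.0, the 10% sample
negligence_share = pct(vulnerable, affected)  # 84.0, matches Finding 1
malicious_share = pct(malicious, affected)    # 16.0, derived (not stated on the page)
```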

Major Findings

Finding 1: Credential leakage is dominated by unintentional vulnerabilities (84.0%), most acute in Web Scraping (17.1%), where developers publish personal scripts without sanitizing embedded credentials.
Finding 2: Natural language has become a weaponized attack vector: 76.3% of cases require combined natural-language and code (NL+PL) triggering, while 3.1% exploit natural language alone through prompt injection. This creates a semantic attack surface absent from traditional software security models.
Finding 3: Among 437 vulnerable skills, Information Exposure affects 352 skills (80.5%), primarily introduced through debug logging practices. Hardcoded Credentials affect 107 skills (24.5%), disproportionately linked to AI-assisted code generation workflows that lack security enforcement. Insecure Storage (77 skills) and Artifact Leakage (5 skills) indicate that developers underestimate the privileged execution context in which agent skills operate.
Finding 4: Malicious skills exploit trusted distribution channels such as GitHub and skill stores, with 37.3% combining multiple attack patterns to maximize impact while bypassing user trust barriers.
Finding 5: 89.6% of affected skills (466/520) are confirmed exploitable through runtime channels during normal execution; the remaining 54 contain hardcoded credentials that do not surface at runtime. Stdout leakage dominates (75.8%), exploitation concentrates in the execute phase (92.5%), and leaked credentials persist in downstream forks even after the upstream skill has been remediated.
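
Findings 3 and 5 together describe a common failure: a debug log line writes a credential to stdout during execution. The Python sketch below is illustrative only and is not taken from any skill in the dataset. The names `fetch_page`, `redact`, and `captured_stdout`, the `API_TOKEN` value, and the `sk-` token shape are all hypothetical. It contrasts a debug print that leaks an authorization header with one that redacts it first, and captures stdout to show the difference.

```python
import contextlib
import io
import re

# Hypothetical credential for illustration; a real skill should load secrets
# from the environment or a secret store rather than embedding them in source.
API_TOKEN = "sk-example-0123456789abcdef"

# Assumed token shape for this sketch: "sk-" followed by 8+ token characters.
TOKEN_PATTERN = re.compile(r"sk-[A-Za-z0-9-]{8,}")


def redact(text: str) -> str:
    """Replace anything that looks like a token with a fixed placeholder."""
    return TOKEN_PATTERN.sub("[REDACTED]", text)


def fetch_page(url: str, debug_leaky: bool) -> None:
    """Stand-in for a skill step that builds an authenticated request."""
    headers = {"Authorization": f"Bearer {API_TOKEN}"}
    message = f"DEBUG request url={url} headers={headers}"
    # The leaky variant prints the headers verbatim; the safe one redacts first.
    print(message if debug_leaky else redact(message))


def captured_stdout(debug_leaky: bool) -> str:
    """Run the step and return everything it wrote to stdout."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        fetch_page("https://example.com", debug_leaky)
    return buffer.getvalue()


leaky_output = captured_stdout(debug_leaky=True)   # contains the raw token
safe_output = captured_stdout(debug_leaky=False)   # token replaced by [REDACTED]
```

Pattern-based redaction of this kind only catches token formats it knows about, so it is a backstop; not logging request headers at all is the more robust choice.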

Credential Leakage Landscape

The vulnerability distribution shows that Python is the most leakage-prone language (60.0%), and that credential exposure in agent skills is fundamentally cross-modal (76.3% requiring Code + NL interaction).

Attacks are Deliberate, Not Accidental. This histogram shows the number of vulnerabilities per malicious skill. The distribution peaks at 4 vulnerabilities per skill, with 80.3% of skills containing 3 or more. This heavy layering confirms that these are intentional attacks combining multiple techniques, rather than simple developer errors.

Credential Leakage Pattern Taxonomy

We categorized the 520 confirmed cases into 10 distinct patterns: 4 arising from developer negligence and 6 from deliberate adversarial construction.

Real-World Impact

We reported all 520 affected skills to the SkillsMP platform maintainers, who acknowledged our findings and initiated remediation within 48 hours. All 83 confirmed malicious skills have since been permanently removed from the platform.
