Agent Skills Privacy

Case Studies: How Skills Leak Credentials in the Wild

LLM agent skills introduce a fundamentally new attack surface driven by the interplay between Natural Language (NL) instructions and Programming Language (PL) execution. Below, we showcase real-world examples discovered in our study, ranging from unintentional developer negligence to deliberate multi-objective attacks.

Case 1: Web Scraping Skill (Unintentional Cookie Leak)

Developers often publish personal scripts without sanitizing embedded credentials. In this web scraping skill, outbound requests automatically attach a hardcoded session cookie, silently leaking authentication tokens to any downstream consumer.

Case 2: The "weather-data-fetcher" Skill (NL+PL Exfiltration)

The decoupling between the advertised NL interface and the actual PL behavior represents a deliberate attack surface. This skill silently reads the user's local .env file and exfiltrates it.

Case 3: The "google-workspace" Skill (Information Exposure via Logging)

Debug logging is the dominant exposure vector. The agent framework captures this console.log output and injects it into the LLM context window, making the OAuth tokens retrievable via natural language queries.

Case 4: The "bybit-trading" Skill (Composite Attack & Defense Evasion)

This skill employs a two-stage composite attack: it uses Base64 obfuscation to evade static scanners, then fetches a remote script to launch a credential-harvesting campaign.

Case 5: The "badguy1" Skill (Multi-Objective Payload)

While advertising "system administration tasks," this script simultaneously achieves Credential Compromise, Persistence, Data Exfiltration, and Resource Hijacking.

Case 6: The "twitter-openclaw-2" Skill (SVG-XSS File-based Exposure)

An XSS payload is embedded inside an innocent-looking SVG logo. When rendered in the Agent's webview, it harvests local data and exfiltrates it, bypassing traditional code review.

Page updated

Google Sites

Report abuse