Here Comes the AI Worm: Unleashing Zero-click Worms that Target GenAI-Powered Applications
Video
TLDR
We created a computer worm that targets GenAI-powered applications and demonstrated it against GenAI-powered email assistants in two use cases (spamming and exfiltrating personal data), under two settings (black-box and white-box accesses), using two types of input data (text and images) and against three different GenAI models (Gemini Pro, ChatGPT 4.0, and LLaVA).
Abstract
In the past year, numerous companies have incorporated Generative AI (GenAI) capabilities into new and existing applications, forming interconnected Generative AI (GenAI) ecosystems consisting of semi/fully autonomous agents powered by GenAI services. While ongoing research highlighted risks associated with the GenAI layer of agents (e.g., dialog poisoning, privacy leakage, jailbreaking), a critical question emerges: Can attackers develop malware to exploit the GenAI component of an agent and launch cyber-attacks on the entire GenAI ecosystem?
This paper introduces Morris II, the first worm designed to target GenAI ecosystems through the use of adversarial self-replicating prompts. The study demonstrates that attackers can insert such prompts into inputs that, when processed by GenAI models, prompt the model to replicate the input as output (replication) and engage in malicious activities (payload). Additionally, these inputs compel the agent to deliver them (propagate) to new agents by exploiting the connectivity within the GenAI ecosystem. We demonstrate the application of Morris II against GenAI-powered email assistants in two use cases (spamming and exfiltrating personal data), under two settings (black-box and white-box accesses), using two types of input data (text and images). The worm is tested against three different GenAI models (Gemini Pro, ChatGPT 4.0, and LLaVA), and various factors (e.g., propagation rate, replication, malicious activity) influencing the performance of the worm are evaluated.
GitHub
The code we used in this study can be downloaded from here.Â
FAQ
Q: What is the objective of this study?Â
This research is intended to serve as a whistleblower to the possibility of creating GenAI worms in order to prevent their appearance
Q: What is a computer worm?Â
A computer worm is malware with the ability to replicate itself and propagate\spread by compromising new machines while exploiting the sources of the machines to conduct malicious activity (payload).
Why did you name the worm Morris-II?
A: Because like the famous 1988 Morris worm, that was developed by a Cornell student, Morris-II was also developed by two Cornell Tech students (Stav and Ben).
What is a GenAI ecosystem?
An interconnected network consisting of GenAI-powered agents.
What is a GenAI-powered application/client/agent?
A GenAI-powered agent is any kind of application that interfaces with (1) GenAI services to process the inputs sent to the agent, and (2) other GenAI-powered agents in the ecosystem. The agent uses the GenAI service to process an input it receives from other agents.
Where is the GenAI service deployed?
The GenAI service that is used by the agent can be based on a local model (i.e., the GenAI model is installed on the physical device of the agent) or remote model (i.e., the GenAI model is installed on a cloud server and the agent interfaces with it via an API).Â
Which type of GenAI-powered applications may be vulnerable to the worm?Â
Two classes of GenAI-powered applications might be at risk:
GenAI-powered applications whose execution flow is dependent upon the output of the GenAI service. This class of applications is vulnerable to application-flow-steering GenAI worms.
GenAI-powered applications that use RAG to enrich their GenAI queries. This class of applications is vulnerable to RAG-based GenAI worms
What is a zero-click malware?
Malware that does not require the user to click on something (e.g., a hyperlink, a file) to trigger its malicious execution.
Why do you consider the worm a zero-click worm?
Due to the automatic inference performed by the GenAI service (which automatically triggers the worm), the user does not have to click on anything to trigger the malicious activity of the worm or to cause it to propagate.
Does the attacker need to compromise an application in advance?
No.
In the two demonstrations we showed, the applications were not compromised ahead of time.
They were compromised when they received the email.
Did you disclose the paper with OpenAI and Google?
Yes, although, this is not OpenAI's or Google's responsibility.
The worm exploits bad architecture design for the GenAI ecosystem and is not a vulnerability in the GenAI service.
Are there any similarities between adversarial self-replicating prompts
and buffer overflow or SQL injection attacks?
Yes.
While a regular prompt is essentially code that triggers the GenAI model to output data, an adversarial self-replicating prompt is a code (prompt) that triggers the GenAI model to output code (prompt). This idea resembles classic cyber-attacks that exploited the idea of changing data into code to carry out their attack.
an SQL injection attack embeds code inside a query (data).
a buffer overflow attack writes data into areas known to hold executable code (code).Â
an adversarial self-replicating prompt is a code that is intended to cause the GenAI model to output prompt (code) instead of data.
How did you create the icon?
Midjourney, PowerPoint, FlatIcon, and a lot of spare time and goodwill.
Why did the worm in the icon hold a syringe?
It injects an adversarial self-replicating prompt into a GenAI model.
By doing so it replicates itself.
Who created the video?
The script was written by Stav and Ben, but the video was created by the talented Eden Sasson.
Media
Press