The Promptware Kill Chain
Assaults in opposition to fashionable generative synthetic intelligence (AI) massive language fashions (LLMs) pose an actual menace. But discussions round these assaults and their potential defenses are dangerously myopic. The dominant narrative focuses on “prompt injection,” a set of strategies to embed directions into inputs to LLM supposed to carry out malicious exercise. This time period suggests a easy, singular vulnerability. This framing obscures a extra advanced and harmful actuality. Assaults on LLM-based techniques have developed into a definite class of malware execution mechanisms, which we time period “promptware.” In a brand new paper, we, the authors, suggest a structured seven-step “promptware kill chain” to offer policymakers and safety practitioners with the mandatory vocabulary and framework to deal with the escalating AI menace panorama.
In our mannequin, the promptware kill chain begins with Preliminary Entry. That is the place the malicious payload enters the AI system. This may occur instantly, the place an attacker varieties a malicious immediate into the LLM utility, or, much more insidiously, by “indirect prompt injection.” Within the oblique assault, the adversary embeds malicious directions in content material that the LLM retrieves (obtains in inference time), similar to an online web page, an e-mail, or a shared doc. As LLMs turn out to be multimodal (able to processing varied enter varieties past textual content), this vector expands even additional; malicious directions can now be hidden inside a picture or audio file, ready to be processed by a vision-language mannequin.
The elemental difficulty lies within the structure of LLMs themselves. Not like conventional computing techniques that strictly separate executable code from consumer information, LLMs course of all enter—whether or not it’s a system command, a consumer’s e-mail, or a retrieved doc—as a single, undifferentiated sequence of tokens. There isn’t any architectural boundary to implement a distinction between trusted directions and untrusted information. Consequently, a malicious instruction embedded in a seemingly innocent doc is processed with the identical authority as a system command.
However immediate injection is just the Preliminary Entry step in a complicated, multistage operation that mirrors conventional malware campaigns similar to Stuxnet or NotPetya.
As soon as the malicious directions are inside materials integrated into the AI’s studying, the assault transitions to Privilege Escalation, also known as “jailbreaking.” On this section, the attacker circumvents the protection coaching and coverage guardrails that distributors similar to OpenAI or Google have constructed into their fashions. By strategies analogous to social engineering—convincing the mannequin to undertake a persona that ignores guidelines—to stylish adversarial suffixes within the immediate or information, the promptware methods the mannequin into performing actions it might usually refuse. That is akin to an attacker escalating from a normal consumer account to administrator privileges in a conventional cyberattack; it unlocks the complete functionality of the underlying mannequin for malicious use.
Following privilege escalation comes Reconnaissance. Right here, the assault manipulates the LLM to disclose details about its property, linked companies, and capabilities. This enables the assault to advance autonomously down the kill chain with out alerting the sufferer. Not like reconnaissance in classical malware, which is carried out usually earlier than the preliminary entry, promptware reconnaissance happens after the preliminary entry and jailbreaking parts have already succeeded. Its effectiveness depends totally on the sufferer mannequin’s potential to purpose over its context, and inadvertently turns that reasoning to the attacker’s benefit.
Fourth: the Persistence section. A transient assault that disappears after one interplay with the LLM utility is a nuisance; a persistent one compromises the LLM utility for good. By quite a lot of mechanisms, promptware embeds itself into the long-term reminiscence of an AI agent or poisons the databases the agent depends on. As an example, a worm may infect a consumer’s e-mail archive so that each time the AI summarizes previous emails, the malicious code is re-executed.
The Command-and-Management (C2) stage depends on the established persistence and dynamic fetching of instructions by the LLM utility in inference time from the web. Whereas not strictly required to advance the kill chain, this stage permits the promptware to evolve from a static menace with mounted targets and scheme decided at injection time right into a controllable trojan whose habits may be modified by an attacker.
The sixth stage, Lateral Motion, is the place the assault spreads from the preliminary sufferer to different customers, gadgets, or techniques. Within the rush to provide AI brokers entry to our emails, calendars, and enterprise platforms, we create highways for malware propagation. In a “self-replicating” assault, an contaminated e-mail assistant is tricked into forwarding the malicious payload to all contacts, spreading the an infection like a pc virus. In different circumstances, an assault would possibly pivot from a calendar invite to controlling good residence gadgets or exfiltrating information from a linked net browser. The interconnectedness that makes these brokers helpful is exactly what makes them weak to a cascading failure.
Lastly, the kill chain concludes with Actions on Goal. The purpose of promptware isn’t just to make a chatbot say one thing offensive; it’s usually to realize tangible malicious outcomes by information exfiltration, monetary fraud, and even bodily world affect. There are examples of AI brokers being manipulated into promoting vehicles for a single greenback or transferring cryptocurrency to an attacker’s pockets. Most alarmingly, brokers with coding capabilities may be tricked into executing arbitrary code, granting the attacker complete management over the AI’s underlying system. The result of this stage determines the kind of malware executed by promptware, together with infostealer, spy ware, and cryptostealer, amongst others.
The kill chain was already demonstrated. For instance, within the analysis “Invitation Is All You Need,” attackers achieved preliminary entry by embedding a malicious immediate within the title of a Google Calendar invitation. The immediate then leveraged a sophisticated method often called delayed instrument invocation to coerce the LLM into executing the injected directions. As a result of the immediate was embedded in a Google Calendar artifact, it endured within the long-term reminiscence of the consumer’s workspace. Lateral motion occurred when the immediate instructed the Google Assistant to launch the Zoom utility, and the ultimate goal concerned covertly livestreaming video of the unsuspecting consumer who had merely requested about their upcoming conferences. C2 and reconnaissance weren’t demonstrated on this assault.
Equally, the “Here Comes the AI Worm” analysis demonstrated one other end-to-end realization of the kill chain. On this case, preliminary entry was achieved by way of a immediate injected into an e-mail despatched to the sufferer. The immediate employed a role-playing method to compel the LLM to comply with the attacker’s directions. Because the immediate was embedded in an e-mail, it likewise endured within the long-term reminiscence of the consumer’s workspace. The injected immediate instructed the LLM to duplicate itself and exfiltrate delicate consumer information, resulting in off-device lateral motion when the e-mail assistant was later requested to draft new emails. These emails, containing delicate info, had been subsequently despatched by the consumer to further recipients, ensuing within the an infection of recent shoppers and a sublinear propagation of the assault. C2 and reconnaissance weren’t demonstrated on this assault.
The promptware kill chain provides us a framework for understanding these and related assaults; the paper characterizes dozens of them. Immediate injection isn’t one thing we will repair in present LLM know-how. As a substitute, we’d like an in-depth defensive technique that assumes preliminary entry will happen and focuses on breaking the chain at subsequent steps, together with by limiting privilege escalation, constraining reconnaissance, stopping persistence, disrupting C2, and proscribing the actions an agent is permitted to take. By understanding promptware as a posh, multistage malware marketing campaign, we will shift from reactive patching to systematic threat administration, securing the important techniques we’re so keen to construct.
This essay was written with Oleg Brodt, Elad Feldman and Ben Nassi, and initially appeared in Lawfare.
Posted on February 16, 2026 at 7:04 AM •
0 Feedback



