LLMs vulnerable to prompt injection attacks
As we've already seen today AI systems are becoming increasingly popular targets for attack.
New research from Snyk and Lakera looks at the risks to AI agents and LLMs from prompt injection attacks.
Agents offer a flexible and convenient way to connect multiple application components such as data stores, functions, and external APIs to an underlying LLM in order to build a system that takes advantage of machine learning models to quickly solve problems and add value.
Prompt injection is a new variant of an injection attack, where user-provided input is reflected directly into a format such that the processing system can't distinguish between what was provided by the developer and the user.
OK, now we understand the terms we can look at why these attacks are such an issue. Successful prompt injection attacks are usually kept within the LLM, but where an agent is involved that allows the AI to execute code or call an external API they can have more severe consequences.
LLMs are vulnerable to this type of attack because the risk can't be fully addressed at the model level, but needs prompt defense solutions to be incorporated into agent architectures.
The researchers note, "Agent-based systems need to consider traditional vulnerabilities as well as the new vulnerabilities that are introduced by LLMs. User prompts and LLM output should be treated as untrusted data, just like any user input in traditional web application security, and need to be validated, sanitized, escaped, etc., before being used in any context where a system will act based on them. Prompt defenses are required to identify and prevent prompt injection attacks, and other AI specific vulnerabilities, in any LLM input or output."
You can see more detail about how this type of attack works on the Snyk blog.
Image credit: Lighthunter/depositphotos.com