LLM Common Attacks - clarkvoss/LLM GitHub Wiki

Prompt injection

Prompt injection refers to the technique of providing specific instructions or context within the input or prompt given to a Large Language Model (LLM) to influence its output. It involves modifying the initial text provided to the model in order to steer it towards generating desired responses.

Unauthorized code injection

Unauthorized code injection on a Large Language Model (LLM) refers to the act of injecting malicious or unauthorized code into the input or prompt provided to the LLM, with the intent of executing arbitrary code or compromising the security of the system.

Lack of access control on LLM APIs

Lack of access control on LLM APIs refers to a security vulnerability where there is inadequate or insufficient control over who can access and use the application programming interfaces (APIs) provided by Large Language Models (LLMs). It means that there are no proper mechanisms in place to authenticate or authorize users or applications that interact with the LLM API.

Hallucination Attacks

Fabricating non-existent facts to cheat users without perception.

Server-Side Request Forgery

Server-Side Request Forgery (SSRF) is a vulnerability that allows an attacker to induce a server to send arbitrary web requests on their behalf. This becomes problematic because, through SSRF, the server may inadvertently expose access to internal networks, hosts, or services that wouldn't otherwise be publicly available.

Indirect Prompt Injection

Attacks involve an attacker secretly manipulating the input to make it do something that the user does not expect. These generally occur when using additional plugins, or feeding the LLM information that comes from a third party.

Cross-Site Scripting

(XSS) is an attack in which an attacker injects malicious executable scripts into the code of a trusted application or website. Attackers often initiate an XSS attack by sending a malicious link to a user and enticing the user to click it.

Arbitrary Plugin Invocation

LLM plugins can have insecure inputs and insufficient access control. This lack of application control makes them easier to exploit and can result in consequences like remote code execution.

Sensitive Information Disclosure

LLMs may inadvertently reveal confidential data in its responses, leading to unauthorized data access, privacy violations, and security breaches. Its crucial to implement data sanitization and strict user policies to mitigate this.

Excessive Agency

Excessive agency vulnerabilities enabling harmful actions due to unexpected LLM outputs, caused by excessive functionality, permissions, or autonomy.

Exfiltration of Chat History

Instructing ChatGPT to render images and append information to the URL (Image Markdown Injection), or by tricking a user to click a hyperlink.

Model Theft

This involves unauthorized access, copying, or exfiltration of proprietary LLM models. The impact includes economic losses, compromised competitive advantage, and potential access to sensitive information.

Remote Code Execution (RCE)

LLM-integrated frameworks, which serve as the essential infrastructure, have given rise to many LLM-integrated web apps. However, some of these frameworks suffer from Remote Code Execution (RCE) vulnerabilities, allowing attackers to execute arbitrary code on apps' servers remotely via prompt injections.

Training Data Poisoning

This occurs when LLM training data is tampered,introducing vulnerabilities or biases that compromise security, effectiveness, or ethical behavior. Sources include Common Crawl, WebText, OpenWebText, & books.