🚀Introducing Bright Star: AI-Powered, Autonomous Security Testing & Remediation! Learn more>>

Back to blog
Published: Jul 12th, 2023 /Modified: Mar 25th, 2025

OWASP Top 10 for LLM

Time to read: 5 min
Avatar photo
Edward Chopskie

The Open Worldwide Application Security Project (OWASP) has recently released their first draft version (Version 0.5) detailing the top 10 critical vulnerabilities commonly observed in large language model (LLM) applications. A LLM or large language model is a trained deep-learning model that understands and generates text in a human-like fashion. 

Practical applications of LLMs include OpenAI ChatGPT, GPT-4, Google BARD and Microsoft BING. These advanced AI models can understand and generate human-like text, which opens up endless possibilities for applications in various fields.

The vulnerabilities that OWASP has documented have been carefully selected based on their potential impact, exploitability, and prevalence within the LLM landscape. Some notable vulnerabilities included in the list are prompt injections, data leakage, inadequate sandboxing, and unauthorized code execution. Some, including prompt injections, can be executed with limited or no coding experience. 

Like other OWASP lists, the primary objective of this list is to serve as an educational resource for developers, designers, architects, managers, and organizations involved in the deployment and management of LLM applications. 

By highlighting these vulnerabilities, OWASP aims to raise awareness about the potential security risks associated with LLMs. Moreover, the report provides valuable insights into effective remediation strategies, with the ultimate goal of enhancing the overall security posture of LLM applications.

Here are the top 10 most critical vulnerabilities affecting LLM applications, according to OWASP.

1. Prompt Injection

Prompt injections pose a significant security concern, as highlighted by OWASP. They involve circumventing filters or manipulating LLMs through carefully constructed prompts. By doing so, attackers can deceive the model into disregarding prior instructions or executing unintended actions. This attack can lead the LLM to provide data that would be otherwise restricted. Examples include manipulating inputs to divulge data that would be restricted such as listing the ingredients for illegal drugs. 

2. Data leakage

Data leakage occurs when an LLM accidentally reveals sensitive information, proprietary algorithms, or other confidential details through its responses. “This can result in unauthorized access to sensitive data or intellectual property, privacy violations, and other security breaches,” according to  OWASP. Again, an attacker could deliberately probe the LLM with carefully crafted prompts in attempting to extract sensitive information.

3. Inadequate sandboxing

Insufficient sandboxing of a large language model (LLM) can result in significant security risks, including potential exploitation, unauthorized access, and unintended actions. When an LLM is not properly isolated from external resources or sensitive systems, it becomes susceptible to various vulnerabilities. OWASP has highlighted some common inadequate LLM sandboxing scenarios, such as the lack of proper separation between the LLM environment and critical systems or data stores, improper restrictions that grant the LLM access to sensitive resources, and the LLM performing system-level actions or interacting with other processes.

4. Unauthorized code execution

Unauthorized code execution occurs when an attacker exploits an LLM to execute malicious code, commands, or actions on the underlying system through natural language prompts. Common vulnerabilities include non-sanitized or restricted user input that allows attackers to craft prompts that trigger the execution of unauthorized code, insufficient restrictions on the LLM’s capabilities, and unintentionally exposing system-level functionality or interfaces to the LLM.

5. Server-side request forgery vulnerabilities

Server-side request forgery (SSRF) vulnerabilities pose a significant risk, as they can be exploited by attackers to manipulate a large language model (LLM) into performing unintended requests or gaining unauthorized access to restricted resources. OWASP has identified common causes of SSRF vulnerabilities including insufficient input validation and misconfigurations in network or application security settings, which can expose internal services, APIs, or data stores to the LLM.

6. Over Reliance on LLM-generated content

Over reliance on LLM-generated content can lead to the propagation of misleading or incorrect information, decreased human input in decision-making, and reduced critical thinking, according to OWASP. Common issues related to overreliance on LLM-generated content include accepting LLM-generated content as fact without verification, assuming LLM-generated content is free from bias or misinformation, and relying on LLM-generated content for critical decisions without human input or oversight. 

7. Inadequate AI alignment

Inadequate AI alignment occurs when the LLM’s objectives and behavior do not align with the intended use case, leading to undesired consequences or vulnerabilities. Poorly defined objectives resulting in the LLM prioritizing undesired/harmful behaviors, misaligned reward functions or training data creating unintended model behavior, and insufficient testing and validation of LLM behavior are common issues, OWASP wrote. For example, if an LLM designed to assist with system administration tasks is misaligned, it could execute harmful commands or prioritize actions that degrade system performance or security.

8. Insufficient access controls

Insufficient access controls occur when access controls or authentication mechanisms are not properly implemented, allowing unauthorized users to interact with the LLM and potentially exploit vulnerabilities. Failing to enforce strict authentication requirements for accessing the LLM, inadequate role-based access control (RBAC) implementation allowing users to perform actions beyond their intended permissions, and failing to provide proper access controls for LLM-generated content and actions are all common examples. 

9. Improper error handling

Improper error handling poses a significant security risk, as it can inadvertently expose sensitive information, system details, or potential attack vectors to threat actors. It occurs when error messages or debugging information are not properly handled or protected. OWASP has identified several common vulnerabilities related to error handling that can lead to security breaches. For example, one vulnerability is the exposure of sensitive information or system details through error messages. When error messages contain sensitive data or provide too much information about the system’s internal workings, attackers can exploit this information to gain insights into the system’s vulnerabilities or potential attack vectors. 

10. Training data poisoning

Training data poisoning refers to the manipulation of training data or fine-tuning procedures of a large language model (LLM) by attackers. This malicious activity aims to introduce vulnerabilities, backdoors, or biases that can compromise the security, effectiveness, or ethical behavior of the model, as explained by OWASP. Common issues related to training data poisoning include the introduction of backdoors or vulnerabilities into the LLM through manipulated training data and the injection of biases that cause the LLM to produce biased or inappropriate responses.

Further resources:

An excellent introduction to LLMs
GitHub’s introduction to LLMs

Source: OWASP

Subscribe to Bright newsletter!