LLM APIs Vulnerabilities
Discover Pynt's documentation on security tests for LLM (Large Language Model) injections! Learn how Pynt fortifies your APIs against LLM-based vulnerabilities.
At a Glance: LLM vulnerabilities, such as Prompt Injection and Insecure Output Handling, occur when untrusted data is sent to or from a language model without proper validation or sanitization. Attackers can exploit these vulnerabilities to manipulate the behavior of the LLM, extract sensitive data, or execute malicious commands in downstream systems.
What are the common mistakes made by developers?
LLM-based attacks happen when input from an API request is used directly in an LLM prompt or when outputs from an LLM are used directly in operations such as rendering HTML content, executing system calls, or interfacing with other services, without validating whether the input or output includes unwanted content.
How can I fix LLM Injection issues?
Prompt Injection
To prevent prompt injection attacks, consider the following measures:
Input Sanitization: Always sanitize and validate user input before incorporating it into LLM prompts. Remove or escape any tokens or phrases that could alter the intended behavior of the LLM.
Use Contextual Prompts: Structure prompts to minimize the impact of injected content. For example, clearly separate system instructions from user-provided content.
Limit LLM Instructions: Avoid including system-level instructions in prompts that can be manipulated by user input.
Insecure Output Handling
Insecure Output Handling refers to the lack of adequate validation, sanitization, and management of outputs generated by large language models before they are passed downstream to other components and systems.
Exploiting an Insecure Output Handling vulnerability can lead to XSS, CSRF, and Markdown injection attacks in web browsers, as well as SSRF, privilege escalation, or remote code execution on backend systems.
Example of Insecure Output Handling:
Consider an API that uses an LLM to generate HTML content based on user input:
If the LLM generates malicious JavaScript code within content
, it could lead to an XSS attack when rendered in the user's browser.
How to Fix Insecure Output Handling:
Sanitize LLM Outputs: Use libraries to escape or remove potentially harmful content before rendering.
Set Content Security Policies: Implement CSP headers to restrict the execution of scripts and other potentially dangerous content.
Fixed Code Example:
Markdown Injection
Markdown content generated by LLMs can include embedded HTML or JavaScript, leading to XSS attacks when rendered.
Vulnerable Example:
An attacker could manipulate the LLM to generate malicious Markdown that includes scripts.
Fixing Markdown Injection:
Use Safe Markdown Renderers: Utilize Markdown libraries that sanitize HTML content.
Sanitize After Rendering: Escape or remove any HTML tags after converting from Markdown.
Fixed Code Example:
Server-Side Request Forgery (SSRF) via LLM
An LLM might generate URLs or network requests based on user input, potentially leading to SSRF attacks.
Vulnerable Example:
An attacker could manipulate the LLM to generate internal URLs, causing the server to make requests to internal services.
Preventing SSRF:
Validate and Sanitize URLs: Ensure that the generated URLs point to allowed domains.
Implement Network Policies: Restrict the server's network access to prevent unauthorized requests.
Fixed Code Example:
By implementing these measures, you can significantly reduce the risk of LLM-related vulnerabilities in your APIs. Always treat both the input to and output from LLMs with the same caution as you would with any untrusted data.
These test cases detect LLM APIs vulnerabilities:
Test case | OWASP | CWE |
---|---|---|
[LLM001] Direct prompt injection | ||
[LLM002] Prompt injection, alignment | ||
[LLM003] Insecure output handling, type: XSS | ||
[LLM004] Insecure output handling, type: SSRF | ||
[LLM005] Insecure output handling, type: Markdown |
Last updated