What is Path Traversal (Directory Traversal)? Ways to Exploit, Examples and Impact
Discover how Path Traversal vulnerabilities work, see real-world exploitation examples, and learn best practices for prevention and mitigation.
Path Traversal, also known as Directory Traversal or the "dot-dot-slash" attack, is a critical web security vulnerability that allows an attacker to read arbitrary files on the server that is running an application. This might include application source code, configuration files containing database credentials, or sensitive operating system files. In some cases, an attacker might even be able to write to arbitrary files on the server, allowing them to modify application data or behavior and ultimately take full control of the server.
At its core, Path Traversal occurs when an application uses user-controllable input to construct a file path in an insecure way. By manipulating this input, an attacker can trick the application into accessing files outside of the intended directory. Understanding this vulnerability is essential for any developer or security professional, as it remains a common entry point for data breaches and system compromises.
How Path Traversal Works
To understand Path Traversal, we first need to look at how file systems handle paths. Most operating systems use a hierarchical directory structure. To navigate this structure, we use specific characters:
- / (forward slash) or \ (backslash): Used as directory separators.
- . (single dot): Represents the current directory.
- .. (double dot): Represents the parent directory.
In a typical web application, the developer might intend for users to access files within a specific folder, such as /var/www/app/public/images/. If the application takes a filename as a parameter, like display.php?file=logo.png, the backend code might look like this:
$filename = $_GET['file'];
include('/var/www/app/public/images/' . $filename);
Under normal circumstances, this works fine. However, if an attacker provides a manipulated string like ../../../../../etc/passwd, the resulting path becomes /var/www/app/public/images/../../../../../etc/passwd. On Unix-like systems, each .. sequence moves up one level. The base directory is five levels deep, so five sequences reach the root directory, and the path then descends into /etc/passwd, a file that lists the system's user accounts. Extra .. sequences at the root are ignored, so attackers often supply more than strictly needed.
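The way .. segments collapse can be reproduced offline. The following is a minimal sketch that uses Python's standard posixpath module to normalize the concatenated path roughly the way a Unix-like system would interpret it. The helper name resolve_request is hypothetical, and the base directory is the one from the example above.

```python
import posixpath

# Base directory taken from the example above.
BASE_DIR = "/var/www/app/public/images/"

def resolve_request(filename: str) -> str:
    """Return the normalized path that naive concatenation would produce."""
    return posixpath.normpath(BASE_DIR + filename)

# A benign request stays inside the images directory.
print(resolve_request("logo.png"))                # /var/www/app/public/images/logo.png

# A payload with more "../" segments than needed still lands on /etc/passwd,
# because segments that would climb above the root are discarded.
print(resolve_request("../" * 8 + "etc/passwd"))  # /etc/passwd
```

Because surplus segments are harmless, an attacker does not need to know the exact depth of the base directory.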
Relative vs. Absolute Paths
There are two main ways to reference files: relative paths and absolute paths. Path Traversal usually exploits relative paths.
- Relative Path: A path relative to the current working directory (e.g., images/photo.jpg).
- Absolute Path: A full path starting from the root of the file system (e.g., /home/user/documents/file.txt or C:\Windows\System32\drivers\etc\hosts).
If an application does not properly validate that the user input stays within the "base" directory, the attacker can use relative path sequences (../) to "break out" of the restricted environment.
Common Targets and Sensitive Files
When performing a Path Traversal attack, attackers look for specific files that provide high-value information. These targets vary by operating system.
Linux/Unix Systems
- /etc/passwd: Contains a list of system users.
- /etc/shadow: Contains hashed passwords (usually requires root access).
- /etc/hosts: Contains local DNS mappings.
- ~/.bash_history: Command history for the current user.
- /proc/self/environ: Environment variables of the current process, which might include API keys or session secrets.
- Web application configuration files: config.php, settings.py, .env.
Windows Systems
- C:\Windows\win.ini: A standard initialization file often used to test for traversal.
- C:\Windows\System32\drivers\etc\hosts: Local DNS settings.
- C:\Users\<user>\Desktop: User-specific files.
- C:\inetpub\wwwroot\web.config: Configuration for IIS web servers.
Examples of Path Traversal Vulnerabilities
Let's look at a few practical scenarios where Path Traversal might manifest in a real-world application.
Example 1: Image Loading via Query Parameters
Consider a website that displays user profile pictures. The URL might look like this: https://example.com/view_image?name=user123.jpg
The server-side code (in Node.js/Express) might look like this:
const express = require('express');
const path = require('path');
const fs = require('fs');

const app = express();

// VULNERABLE: the user-supplied name is joined into the path unchecked
app.get('/view_image', (req, res) => {
    const fileName = req.query.name;
    const filePath = path.join(__dirname, 'uploads', fileName);
    fs.readFile(filePath, (err, data) => {
        if (err) {
            res.status(404).send('File not found');
        } else {
            res.send(data);
        }
    });
});
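An attacker can change the query to: https://example.com/view_image?name=../../../../etc/passwd
The path.join function normalizes the .. segments instead of rejecting them, so the resulting path points outside the uploads directory. Provided the payload contains enough ../ segments to reach the root, fs.readFile will serve the contents of /etc/passwd to the attacker's browser.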
Example 2: File Download Functionality
Many applications allow users to download reports or documents. A vulnerable implementation might use a direct file path:
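# Python/Flask Example
from flask import Flask, request, send_file

app = Flask(__name__)

# VULNERABLE: the filename parameter is interpolated into the path unchecked
@app.route('/download')
def download_file():
    doc_name = request.args.get('filename')
    return send_file(f"/app/storage/reports/{doc_name}")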
By requesting /download?filename=../../../../app/.env, the attacker could steal the application's environment variables, potentially gaining access to database passwords or third-party API keys.
Advanced Exploitation Techniques (Bypassing Protections)
Developers often try to implement basic filters to prevent Path Traversal, but these can often be bypassed by clever attackers.
1. Nested Traversal Sequences
If a developer tries to strip ../ from the input using a simple find-and-replace, an attacker can use nested sequences. For example, if the code removes ../, the attacker can provide: ....// or ..././
When the filter removes the inner ../, the remaining characters collapse into a valid ../ sequence.
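The collapse is easy to verify. The sketch below assumes a filter that makes a single, non-recursive replacement pass; naive_strip is a hypothetical name for such a filter, not code from any particular application.

```python
def naive_strip(user_input: str) -> str:
    """Flawed filter: removes '../' in one non-recursive pass."""
    return user_input.replace("../", "")

# The plain payload is neutralized, but both nested variants
# collapse back into a working '../' after the single pass.
for payload in ("../etc/passwd", "....//etc/passwd", "..././etc/passwd"):
    print(f"{payload!r} -> {naive_strip(payload)!r}")
```

Repeating the replacement until the string stops changing closes this particular gap, but validating the final resolved path is more robust than trying to sanitize the input string.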
2. URL Encoding and Double Encoding
Web servers often decode URL parameters. An attacker can bypass filters that look for literal dots and slashes by using URL encoding:
- . becomes %2e
- / becomes %2f
- \ becomes %5c
A payload like %2e%2e%2f%2e%2e%2fetc%2fpasswd might bypass a filter that only looks for ../. If the application decodes the input twice (double encoding), the attacker can use %252e%252e%252f, where %25 is the encoding for the % character itself.
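The effect of single and double decoding can be seen with Python's standard urllib.parse.unquote. This is a minimal sketch: naive_filter is a hypothetical stand-in for a check that inspects the raw parameter before it is decoded.

```python
from urllib.parse import unquote

def naive_filter(value: str) -> bool:
    """Flawed check: accepts anything without a literal '../' sequence."""
    return "../" not in value

single = "%2e%2e%2f%2e%2e%2fetc%2fpasswd"
double = "%252e%252e%252fetc%252fpasswd"

# Both payloads pass the filter in their raw form.
print(naive_filter(single), naive_filter(double))  # True True

# One decoding pass exposes the traversal in the single-encoded payload.
print(unquote(single))           # ../../etc/passwd

# The double-encoded payload needs two passes.
print(unquote(double))           # %2e%2e%2fetc%2fpasswd
print(unquote(unquote(double)))  # ../etc/passwd
```

The practical takeaway is that any filter has to run after the last decoding step, on the same value that reaches the file API.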
3. Null Byte Injection
In older versions of languages like PHP (before 5.3.4), a null byte (%00) could be used to terminate a string. If an application appends a file extension to user input, like this:
$file = $_GET['file'] . ".jpg";
include("/var/www/images/" . $file);
An attacker could request ../../../etc/passwd%00. The application would see the string as ../../../etc/passwd%00.jpg, but the underlying C API used by the operating system would stop reading at the null byte, discarding the appended extension and effectively accessing /etc/passwd.
4. Absolute Path Bypass
Sometimes, an application filters out ../ sequences and then prepends a base directory to whatever input remains. However, if the file API or path-joining function treats an absolute input as authoritative, the base directory is ignored entirely. For instance, if the input is /etc/passwd, the application might open that exact file instead of a file under /var/www/.
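Python's path-joining function is one place where this behavior is documented: when a later component is absolute, the earlier components are discarded. The sketch below uses posixpath.join so that it behaves the same on any platform; build_path and BASE_DIR are hypothetical names.

```python
import posixpath

BASE_DIR = "/var/www"

def build_path(user_input: str) -> str:
    """Join user input onto the base directory without any validation."""
    return posixpath.join(BASE_DIR, user_input)

print(build_path("index.html"))   # /var/www/index.html
print(build_path("/etc/passwd"))  # /etc/passwd (the base directory is dropped)
```

A filter that only searches for ../ never triggers here, because the payload contains no traversal sequence at all.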
Real-World Impact of Path Traversal
The impact of a successful Path Traversal attack can be devastating. Here are the primary risks:
- Information Disclosure: This is the most common impact. Attackers can read sensitive system files, application source code (which may contain further vulnerabilities), and configuration files.
- Remote Code Execution (RCE): If the vulnerable code includes (executes) files rather than merely reading them, as PHP's include does, an attacker can perform "Log Poisoning." By sending a request containing malicious code (e.g., a PHP shell) so that it is written to a log file (like the Apache or Nginx access log) and then using Path Traversal to "include" that log file, the attacker makes the server execute the code.
- Full System Compromise: By obtaining SSH keys or database credentials, an attacker can move laterally through the network and gain persistent access to the entire infrastructure.
- Data Modification: If the vulnerability allows file writing (e.g., an insecure file upload or save feature), an attacker can overwrite critical system files or inject malicious scripts into the web root.
How to Detect Path Traversal
Detecting these vulnerabilities requires a combination of manual and automated testing.
Manual Testing
- Identify all parameters that appear to interact with the file system (e.g., file, path, dir, image, doc).
- Attempt to inject basic traversal sequences like ../ and observe the response.
- Try different depths (e.g., ../../../etc/passwd) to account for varying directory structures.
- Test for bypasses using URL encoding, null bytes, and different directory separators (/ vs \).
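The checklist above can be turned into a small payload generator for use against a target you are authorized to test. This is a sketch rather than a complete fuzzing list: the function name traversal_payloads and the default depth of 6 are assumptions, and the variants cover only the techniques named in this article.

```python
def traversal_payloads(target: str, max_depth: int = 6) -> list[str]:
    """Build traversal test strings for one target file at several depths."""
    payloads = []
    for depth in range(1, max_depth + 1):
        plain = "../" * depth + target
        payloads.append(plain)
        # Backslash separators, for Windows hosts.
        payloads.append(plain.replace("/", "\\"))
        # URL-encoded dots and slashes.
        encoded = plain.replace(".", "%2e").replace("/", "%2f")
        payloads.append(encoded)
        # Double encoding: every '%' becomes '%25'.
        payloads.append(encoded.replace("%", "%25"))
        # Null byte suffix, relevant to legacy runtimes only.
        payloads.append(plain + "%00")
    return payloads

for payload in traversal_payloads("etc/passwd", max_depth=2):
    print(payload)
```

Each string would be substituted into a candidate parameter, and the response compared against a baseline to spot file contents or changed error messages.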
Automated Scanning
Using a vulnerability scanner can help identify common Path Traversal points. However, many scanners miss complex bypasses. To proactively monitor your organization's external attack surface and catch exposures before attackers do, try Jsmon. Jsmon provides continuous reconnaissance that can help identify hidden endpoints and parameters that might be susceptible to traversal attacks.
Prevention and Mitigation Strategies
Preventing Path Traversal requires a defense-in-depth approach. Here are the most effective strategies:
1. Avoid Passing User Input Directly to File APIs
The best way to prevent Path Traversal is to avoid using user input to construct file paths altogether. Instead, use an indirect reference. For example, store files in a database with a unique ID and have users request the ID instead of the filename.
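A minimal sketch of an indirect reference follows. The dictionary stands in for a database table, and the IDs, file names, and the function name path_for_document are all hypothetical.

```python
from typing import Optional

# Hypothetical lookup table; in practice this would be a database query.
DOCUMENTS = {
    "1001": "/app/storage/reports/q1-summary.pdf",
    "1002": "/app/storage/reports/q2-summary.pdf",
}

def path_for_document(doc_id: str) -> Optional[str]:
    """Return the server-side path for a known ID, or None if unknown."""
    return DOCUMENTS.get(doc_id)

print(path_for_document("1001"))              # /app/storage/reports/q1-summary.pdf
print(path_for_document("../../etc/passwd"))  # None
```

The user-supplied value is only ever used as a lookup key, so it never becomes part of a file path.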
2. Use Allowlisting
If you must use user input for filenames, validate it against a strict allowlist of permitted characters (e.g., alphanumeric only). Do not allow dots or slashes unless absolutely necessary.
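One way to apply such an allowlist is sketched below. It assumes the application only serves .jpg images and appends the extension itself, so the user-supplied part can be restricted to letters and digits; the 64-character limit, the directory, and the name image_path are illustrative choices.

```python
import re

# Letters and digits only; no dots, slashes, or other separators.
SAFE_NAME = re.compile(r"[A-Za-z0-9]{1,64}")

def image_path(name: str) -> str:
    """Return the image path for a validated name, or raise ValueError."""
    if not SAFE_NAME.fullmatch(name):
        raise ValueError("invalid file name")
    # The extension is fixed by the server, never taken from the request.
    return f"/var/www/app/public/images/{name}.jpg"

print(image_path("user123"))  # /var/www/app/public/images/user123.jpg
```

Using fullmatch rather than search or match matters here: the whole string has to consist of permitted characters, not just a prefix of it.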
3. Canonicalization and Path Validation
Before using a path, "canonicalize" it. This means resolving all relative sequences and symbolic links to get the absolute, simplest version of the path. Most modern programming languages have built-in functions for this:
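- Node.js: fs.realpathSync() (path.resolve() normalizes .. segments but does not follow symbolic links)
- Java: File.getCanonicalPath()
- PHP: realpath()
- Python: os.path.realpath() (os.path.abspath() normalizes .. segments but does not follow symbolic links)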
After canonicalizing the path, check if it still starts with the intended base directory.
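// Secure Example in Node.js
const path = require('path');

const rootDir = path.resolve(__dirname, 'uploads');
const requestedPath = path.resolve(rootDir, userInput);
// Compare against rootDir plus a trailing separator so that a sibling
// directory such as "uploads_backup" does not pass the prefix check.
if (!requestedPath.startsWith(rootDir + path.sep)) {
    throw new Error("Access Denied");
}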
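The same canonicalize-then-check pattern can be sketched in Python. The function name safe_resolve is hypothetical. The sketch uses os.path.realpath so that symbolic links are resolved, and os.path.commonpath instead of a plain string-prefix comparison so that a sibling directory whose name merely begins with the base directory's name is not accepted.

```python
import os

def safe_resolve(base_dir: str, user_input: str) -> str:
    """Return the canonical path for user_input, or raise if it escapes base_dir."""
    base = os.path.realpath(base_dir)
    candidate = os.path.realpath(os.path.join(base, user_input))
    try:
        # commonpath compares whole path components, not raw characters.
        inside = os.path.commonpath([base, candidate]) == base
    except ValueError:
        # Raised, for example, when the paths are on different Windows drives.
        inside = False
    if not inside:
        raise PermissionError("Access denied")
    return candidate
```

A caller would pass the upload directory and the raw request parameter, then open only the returned path. Absolute inputs are rejected as well, because joining them discards the base and the resulting path no longer shares it.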
4. Filesystem Permissions
Run your web application with the least privilege necessary. The web server user should not have read access to sensitive system files like /etc/shadow or configuration files belonging to other users. Using containers or "chroot jails" can also limit the scope of what an attacker can see even if they find a traversal vulnerability.
Conclusion
Path Traversal is a deceptively simple vulnerability that can lead to total system compromise. By understanding the mechanics of how file paths are handled and how attackers bypass basic filters, developers can build more resilient applications. Remember that input validation is never a substitute for proper architectural design—whenever possible, avoid letting user input touch the file system directly.
Staying ahead of attackers requires constant vigilance and a clear view of your infrastructure.