
Path Traversal

Overview

Paths constructed from user-controlled data may allow an attacker to access unexpected resources, like application code, data, configuration files that may contain credentials, and sensitive OS files. Path traversals can lead to sensitive information being disclosed, modified or deleted.

Recommendation

Validate user input before using it to construct a file path. We strongly recommend using libraries included with your web framework to construct paths safely from user-controlled values. Examples include:

- werkzeug.utils.secure_filename in Python
- ActiveStorage::Filename.new(filename).sanitized in Ruby
- The sanitize-filename package in Node

If you need to do the validation yourself, we recommend the following:

- Disallow more than a single "." character.
- Disallow directory separators such as "/" or "\".
- Disallow url-encoding characters like "%".
- Avoid simply replacing problematic sequences such as ../. For example, after applying this filter to .../...//, the resulting string would still be ../.
- Restrict the allowed characters in the user input to a limited character set, like [a-zA-Z0-9]+ for instance, if you can.
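The rules above can be sketched in a few lines of Python. This is a minimal illustration, not a vetted library: the helper names naive_strip and is_safe_filename and the exact allowlist pattern are assumptions made for this example. The first helper reproduces the single-pass replacement that the list warns against, and the second rejects input outright instead of trying to repair it.

```python
import re

# Allowlist: one or more alphanumerics, optionally followed by a single
# dot and an alphanumeric extension. This admits at most one "." and
# never "/", "\\", or "%".
_SAFE_NAME = re.compile(r"[a-zA-Z0-9]+(\.[a-zA-Z0-9]+)?")


def naive_strip(name: str) -> str:
    """Anti-pattern: remove "../" sequences in a single pass."""
    return name.replace("../", "")


def is_safe_filename(name: str) -> bool:
    """Hypothetical validator: accept only names that match the allowlist."""
    return _SAFE_NAME.fullmatch(name) is not None


print(naive_strip(".../...//"))              # the filter leaves "../" behind
print(is_safe_filename("garfield.jpg"))      # accepted
print(is_safe_filename("../../etc/passwd"))  # rejected
```

Using fullmatch rather than a pattern anchored with $ avoids accepting a name that ends in a newline.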

Importantly, do not transform the user input in any way between validating it and using it to derive a path. Vulnerabilities often arise otherwise, for instance when the value is passed to another service that performs URL decoding on it.
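A short sketch of how decoding after validation reintroduces the problem. The looks_safe function is a hypothetical and deliberately incomplete check (it forgets to reject "%"), and the call to unquote stands in for whatever decoding a downstream service might perform.

```python
from urllib.parse import unquote


def looks_safe(name: str) -> bool:
    """Hypothetical, incomplete check: rejects separators and "..", but not "%"."""
    return "/" not in name and "\\" not in name and ".." not in name


user_input = "%2e%2e%2fetc%2fpasswd"

# Validation happens on the encoded form, which contains no "/" or "..".
passed = looks_safe(user_input)

# A later decoding step turns the validated value into a traversal sequence.
decoded = unquote(user_input)

print(passed)               # the encoded input passes the check
print(decoded)              # ../etc/passwd
print(looks_safe(decoded))  # the decoded value would have been rejected
```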

After validating the user input, the application should append the input to the base directory and use a filesystem API to canonicalize the path. It should then verify that the canonicalized path starts with the expected directory, which would allow you to detect path traversal attacks missed by the validation:

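File file = new File(BASE_DIRECTORY, user_input);
// Assumes BASE_DIRECTORY is already canonical and has no trailing separator.
// Appending the separator prevents a sibling directory whose name merely
// starts with BASE_DIRECTORY (e.g. "<base>-private") from passing the check.
if (!file.getCanonicalPath().startsWith(BASE_DIRECTORY + File.separator)) {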
    // ... ☠️ Attempted attack. Log it, and return 400 Bad Request
}
// ... process file
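The same canonicalize-then-verify step could look roughly like this in Python. The function name resolve_inside is an assumption for this sketch. It uses os.path.commonpath, which compares whole path components, so a sibling directory that only shares a string prefix with the base directory is not accepted.

```python
import os


def resolve_inside(base_directory: str, user_input: str) -> str:
    """Return the canonical path for user_input, or raise ValueError
    if that path falls outside base_directory."""
    base = os.path.realpath(base_directory)
    candidate = os.path.realpath(os.path.join(base, user_input))
    # commonpath works on path components, so "/srv/pix-private" is not
    # mistaken for a child of "/srv/pix". Note that the base directory
    # itself passes this check.
    if os.path.commonpath([base, candidate]) != base:
        raise ValueError("path escapes the base directory")
    return candidate


try:
    resolve_inside("/home/jon/pix", "../../../etc/passwd")
except ValueError as exc:
    print("rejected:", exc)
```

Because realpath also resolves symbolic links, a link inside the base directory that points elsewhere is caught by the same comparison.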

Examples

Here's an example of an API that allows you to download cat pictures by specifying a filename in the query parameter. The API does not handle this safely, as we'll see shortly:

from fastapi import FastAPI, Response, HTTPException
import os

app = FastAPI()

@app.get("/cat-pix")
async def pix(filename: str):
    base_path = '/home/jon/pix'
    try:
        data = open(os.path.join(base_path, filename), 'rb').read()
        return Response(content=data, media_type="image/jpeg")
    except FileNotFoundError:
        raise HTTPException(status_code=404, detail="🙀 Cat not found")

If we try the API on non-malicious files, everything works as expected:

$ curl "localhost:8081/cat-pix?filename=garfield.jpg"
< ... jpg bytes >
$ curl "localhost:8081/cat-pix?filename=nermal.jpg"
{"detail":"🙀 Cat not found"}

However, a malicious input allows an attacker to read any file on the filesystem that the server process can access:

$ curl "localhost:8081/cat-pix?filename=../../../etc/passwd"
root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
bin:x:2:2:bin:/bin:/usr/sbin/nologin
< ... >

What happened here? The code ran os.path.join("/home/jon/pix", "../../../etc/passwd"), which produces /home/jon/pix/../../../etc/passwd. Because of the ../ segments, the operating system resolves that path to /etc/passwd. Following the recommendation listed above, we can make the endpoint safe using:
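The join and the resolution can be reproduced separately to see where the path escapes. This sketch uses posixpath so that it behaves the same on any platform; on Linux and macOS, os.path is the same module. The last line shows a related pitfall: an absolute second argument makes join discard the base path entirely.

```python
import posixpath

# join only concatenates; the "../" segments are still present.
joined = posixpath.join("/home/jon/pix", "../../../etc/passwd")

# normpath collapses the "../" segments lexically, which mirrors what
# the operating system does when it opens the file.
resolved = posixpath.normpath(joined)

# An absolute component replaces everything before it.
absolute = posixpath.join("/home/jon/pix", "/etc/passwd")

print(joined)    # /home/jon/pix/../../../etc/passwd
print(resolved)  # /etc/passwd
print(absolute)  # /etc/passwd
```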

import werkzeug.utils  # in addition to the imports in the first example

@app.get("/cat-pix")
async def pix(filename: str):
    base_path = '/home/jon/pix'

    # Sanitize the input
    filename = werkzeug.utils.secure_filename(filename)
    # Compute and normalize the path
    filepath = os.path.realpath(os.path.join(base_path, filename))

    # Ensure that the computed path didn't escape our base path. This is for
    # extra safety. The trailing separator keeps a sibling such as
    # /home/jon/pix-private from matching, and also rejects input that
    # secure_filename reduced to an empty string (filepath == base_path).
    if not filepath.startswith(base_path + os.sep):
        # Attack detected!
        raise HTTPException(status_code=400, detail="☠️ Nice try!")

    try:
        # Use a context manager so the file handle is always closed
        with open(filepath, 'rb') as f:
            data = f.read()
    except FileNotFoundError:
        raise HTTPException(status_code=404, detail="🙀 Cat not found")
    return Response(content=data, media_type="image/jpeg")

The path used to access the file system is sanitized and normalized before being checked against a known prefix. This ensures that regardless of the user input, the resulting path is safe.

References