コンテンツにスキップ

SQL Injection

Overview

If a database query is built by concatenating strings, and the substrings are derived from user input, an attacker is likely to be able to run malicious database queries.

Recommendation

We recommend using SQL prepared statements instead of building a SQL query with string concatenation. A prepared statement usually includes a wildcard, written as a question mark (?), for each part of the SQL query that is expected to be replaced by variables. When the prepared statement is later executed, a value must be provided for each wildcard in the query.

Examples

In the following example, we have a simple FastAPI python authentication API. The API uses an in-memory sqlite database to store user credentials (with plaintext passwords 😱 -- don't do this at home!). The API exposes one endpoint /login, which takes two query parameters: email and password. The endpoint unsafely handles those parameters to build a SQL query using unsafe string formatting:

from fastapi import FastAPI
import sqlite3

app = FastAPI()
DB = sqlite3.connect(':memory:')

@app.on_event("startup")
async def init_db():
    cur = DB.cursor()
    cur.execute("CREATE TABLE users (email text, password text)")
    cur.execute("INSERT INTO users VALUES ('jholden@roci.space', 'tachi')")
    DB.commit()

@app.get("/login")
async def login(email: str, password: str):
    cur = DB.cursor()
    cur.execute("SELECT * FROM users WHERE email = '%s' AND password = '%s'" % (email, password))
    user = cur.fetchone()
    cur.close()
    if user:
        # ... Set cookie
        return '👋 Welcome back %s!' % (user[0],)
    return '🚨Bad credentials!'

The endpoint responds as expected on regular inputs:

$ curl "localhost:8081/login?email=test&password=please-ignore"
"🚨 Bad credentials!"
$ curl "localhost:8081/login?email=jholden@roci.space&password=tachi"
"👋 Welcome back jholden@roci.space!"

However, a specially crafted input allows an attacker to authenticate without knowing the password:

$ curl "localhost:8081/login?email=attacker&password='OR'1'='1"
"👋 Welcome back jholden@roci.space!"

Wait, the endpoint logged the attacker as another user without their password? What happened? Well, we built our SQL query using string formatting. If you put the attacker input in our SQL query, you get:

SELECT * FROM users WHERE email = 'attacker' AND password = '1'OR'1'='1'

Because of the OR '1'='1', the WHERE clause is always true, and the query will return all the rows in the table! In our case, there's only one user in our database, so that is the user that the attacker hacks. Note that using SQL injection, attackers usually can exfiltrate the entire database contents, leading to a potentially costly data leak.

So how can we fix the injection in this example? With prepared statements:

cur.execute('SELECT * FROM users WHERE email = ? AND password = ?', (email, password))

This is a bit subtle. It looks like we just replaced '%s' by ?. It still looks like a format string, so why is this one safe and the other unsafe? The answer is that the code that takes care of replacing the ? characters, which is either the database itself or the library you are using, is sanitizing the variables properly to ensure that there cannot be a SQL injection. If you try the malicious request again, you'll see that it doesn't work anymore.

References