Password auth in Rust, from scratch - Attacks and best practices

This article is a sample from Zero To Production In Rust, a book on backend development in Rust.
You can get a copy of the book on zero2prod.com.
Subscribe to the newsletter to be notified when a new episode is published.

1. Securing Our API

In Chapter 9 we added a new endpoint to our API - POST /newsletters.
It takes a newsletter issue as input and sends emails out to all our subscribers.

We have an issue though - anybody can hit the API and broadcast whatever they want to our entire mailing list.

It is time to level up our API security toolbox.
Password authentication is often seen as the simplest auth method, but there are plenty of pitfalls along the way. We will implement Basic auth from scratch, examining several classes of attacks against our API - and how to counter them.

This chapter, like others in the book, chooses to "do it wrong" first for teaching purposes. Make sure to read until the end if you don't want to pick up bad security habits!

Chapter 10 - Part 0

  1. Securing Our API
  2. Authentication
  3. Password-based Authentication
  4. Is it safe?
  5. What Should We Do Next

2. Authentication

We need a way to verify who is calling POST /newsletters.
Only a handful of people, the ones in charge of the content, should be able to send emails out to the entire mailing list.

We need to find a way to verify the identity of API callers - we must authenticate them.
How?

By asking for something they are uniquely positioned to provide.
There are various approaches, but they all boil down to three categories:

  1. Something they know (e.g. passwords, PINs, security questions);
  2. Something they have (e.g. a smartphone, using an authenticator app);
  3. Something they are (e.g. fingerprints, Apple's Face ID).

Each approach has its weaknesses.

2.1. Drawbacks

2.1.1. Something They Know

Passwords must be long - short ones are vulnerable to brute-force attacks.
Passwords must be unique - publicly available information (e.g. date of birth, names of family members, etc.) should not give an attacker any chance to "guess" a password.
Passwords should not be reused across multiple services - if any of them gets compromised you risk granting access to all the other services sharing the same password.

On average, a person has 100 or more online accounts - they cannot be asked to remember hundreds of long unique passwords by heart.
Password managers help, but they are not mainstream yet and the user experience is often sub-optimal.

2.1.2. Something They Have

Smartphones and U2F keys can be lost, locking the user out of their accounts.
They can also be stolen or compromised, giving an attacker a window of opportunity to impersonate the victim.

2.1.3. Something They Are

Biometrics, unlike passwords, cannot be changed - you cannot "rotate" your fingerprint or change the pattern of your retina's blood vessels.
Forging a fingerprint turns out to be easier than most would imagine - it is also information often available to government agencies who might abuse it or lose it.

2.2. Multi-factor Authentication

What should we do then, given that each approach has its own flaws?
Well, we could combine them!

That is pretty much what multi-factor authentication (MFA) boils down to - it requires the user to provide at least two different types of authentication factors in order to get access.

3. Password-based Authentication

Let's jump from theory to practice: how do we implement authentication?

Passwords look like the simplest approach among the three we mentioned.
How should we pass a username and a password to our API?

3.1. Basic Authentication

We can use the 'Basic' Authentication Scheme, a standard defined by the Internet Engineering Task Force (IETF) in RFC 2617 and later updated by RFC 7617.

The API must look for the Authorization header in the incoming request, structured as follows:

Authorization: Basic <encoded credentials>

where <encoded credentials> is the base64-encoding of {username}:{password}1.
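To make the header format concrete, here is a minimal sketch, using only the standard library, of how a client could build the header value. Both base64_encode and basic_auth_header are hypothetical helpers written for illustration; in a real project you would rely on a crate or an HTTP client to do this. The credentials used are the example pair from RFC 7617.

```rust
// Standard base64 alphabet (RFC 4648).
const ALPHABET: &[u8; 64] =
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/// Hypothetical helper: base64-encode arbitrary bytes, with '=' padding.
fn base64_encode(input: &[u8]) -> String {
    let mut out = String::with_capacity((input.len() + 2) / 3 * 4);
    for chunk in input.chunks(3) {
        // Pack up to three bytes into a 24-bit group, zero-padded on the right.
        let b0 = u32::from(chunk[0]);
        let b1 = u32::from(*chunk.get(1).unwrap_or(&0));
        let b2 = u32::from(*chunk.get(2).unwrap_or(&0));
        let group = (b0 << 16) | (b1 << 8) | b2;
        // Extract one 6-bit value and map it to its alphabet character.
        let sextet = |shift: u32| ALPHABET[((group >> shift) & 63) as usize] as char;

        out.push(sextet(18));
        out.push(sextet(12));
        // Missing input bytes are signalled with '=' padding.
        out.push(if chunk.len() > 1 { sextet(6) } else { '=' });
        out.push(if chunk.len() > 2 { sextet(0) } else { '=' });
    }
    out
}

/// Hypothetical helper: build the value of the `Authorization` header.
fn basic_auth_header(username: &str, password: &str) -> String {
    let credentials = format!("{}:{}", username, password);
    format!("Basic {}", base64_encode(credentials.as_bytes()))
}

fn main() {
    // Example credentials taken from RFC 7617.
    println!("Authorization: {}", basic_auth_header("Aladdin", "open sesame"));
}
```

Keep in mind that base64 is an encoding, not encryption: anybody who can read the header can recover the username and the password.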

According to the specification, we need to partition our API into protection spaces or realms - resources within the same realm are protected using the same authentication scheme and set of credentials.
We only have a single endpoint to protect - POST /newsletters. We will therefore have a single realm, named publish.

The API must reject all requests missing the header or using invalid credentials - the response must use the 401 Unauthorized status code and include a special header, WWW-Authenticate, containing a challenge.
The challenge is a string explaining to the API caller what type of authentication scheme we expect to see for the relevant realm.
In our case, using basic authentication, it should be:

HTTP/1.1 401 Unauthorized
WWW-Authenticate: Basic realm="publish"

Let's implement it!

3.1.1. Extracting Credentials

Extracting username and password from the incoming request will be our first milestone.

Let's start with an unhappy case - an incoming request without an Authorization header is rejected.

//! tests/api/newsletters.rs
// [...]

#[actix_rt::test]
async fn requests_missing_authorization_are_rejected() {
    // Arrange
    let app = spawn_app().await;

    let response = reqwest::Client::new()
        .post(&format!("{}/newsletters", &app.address))
        .json(&serde_json::json!({
            "title": "Newsletter title",
            "content": {
                "text": "Newsletter body as plain text",
                "html": "<p>Newsletter body as HTML</p>",
            }
        }))
        .send()
        .await
        .expect("Failed to execute request.");

    // Assert
    assert_eq!(401, response.status().as_u16());
    assert_eq!(r#"Basic realm="publish""#, response.headers()["WWW-Authenticate"]);
}

It fails at the first assertion:

thread 'newsletter::requests_missing_authorization_are_rejected' panicked at 
'assertion failed: `(left == right)`
  left: `401`,
 right: `400`'

We must update our handler to fulfill the new requirements.
We can use the web::HttpRequest extractor to reach the headers associated with the incoming request:

//! src/routes/newsletters.rs
// [...]
use actix_web::http::{HeaderMap, StatusCode};

pub async fn publish_newsletter(
    // [...]
    // New extractor!
    request: web::HttpRequest,
) -> Result<HttpResponse, PublishError> {
    let _credentials = basic_authentication(request.headers());
    // [...]
}

struct Credentials {
    username: String,
    password: String,
}

fn basic_authentication(headers: &HeaderMap) -> Result<Credentials, anyhow::Error> {
    todo!()
}

To extract the credentials we will need to deal with the base64 encoding.
Let's add the base64 crate as a dependency:

[dependencies]
# [...]
base64 = "0.13"

We can now write down the body of basic_authentication:

//! src/routes/newsletters.rs
// [...]

fn basic_authentication(headers: &HeaderMap) -> Result<Credentials, anyhow::Error> {
    // The header value, if present, must be a valid UTF8 string
    let header_value = headers
        .get("Authorization")
        .context("The 'Authorization' header was missing")?
        .to_str()
        .context("The 'Authorization' header was not a valid UTF8 string.")?;
    let base64encoded_segment = header_value
        .strip_prefix("Basic ")
        .context("The authorization scheme was not 'Basic'.")?;
    let decoded_bytes = base64::decode_config(base64encoded_segment, base64::STANDARD)
        .context("Failed to base64-decode 'Basic' credentials.")?;
    let decoded_credentials = String::from_utf8(decoded_bytes)
        .context("The decoded credential string is not valid UTF8.")?;

    // Split into two segments, using ':' as delimiter
    let mut credentials = decoded_credentials.splitn(2, ':');
    let username = credentials
        .next()
        .ok_or_else(|| anyhow::anyhow!("A username must be provided in 'Basic' auth."))?
        .to_string();
    let password = credentials
        .next()
        .ok_or_else(|| anyhow::anyhow!("A password must be provided in 'Basic' auth."))?
        .to_string();

    Ok(Credentials { username, password })
}

Take a moment to go through the code, line by line, and fully understand what is happening. There are many operations that could go wrong!
Having the RFC open, side by side with the book, helps!
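The splitting step is worth a closer look. The sketch below isolates it in a hypothetical helper, split_credentials, which is not part of the project; it only mirrors the splitn(2, ':') logic used in basic_authentication so that we can poke at the edge cases.

```rust
/// Hypothetical helper mirroring the last step of `basic_authentication`:
/// split the decoded string on the first ':' only.
fn split_credentials(decoded: &str) -> Option<(String, String)> {
    let mut parts = decoded.splitn(2, ':');
    let username = parts.next()?.to_string();
    // `None` here means that there was no ':' in the decoded string.
    let password = parts.next()?.to_string();
    Some((username, password))
}

fn main() {
    // Only the first ':' acts as a delimiter: passwords may contain colons.
    println!("{:?}", split_credentials("alice:p4ss:word"));
    // No delimiter at all: extraction fails.
    println!("{:?}", split_credentials("alice"));
    // An empty password still goes through extraction.
    println!("{:?}", split_credentials("alice:"));
}
```

Limiting the split to two segments is what allows passwords to contain ':'. Note also that empty usernames or passwords are not rejected at this stage - whether that is acceptable is up to the validation step that follows.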

We are not done yet - our test is still failing.
We need to act on the error returned by basic_authentication:

//! src/routes/newsletters.rs
// [...]

#[derive(thiserror::Error)]
pub enum PublishError {
    // New error variant!
    #[error("Authentication failed.")]
    AuthError(#[source] anyhow::Error),
    #[error(transparent)]
    UnexpectedError(#[from] anyhow::Error),
}

impl ResponseError for PublishError {
    fn status_code(&self) -> StatusCode {
        match self {
            PublishError::UnexpectedError(_) => StatusCode::INTERNAL_SERVER_ERROR,
            // Return a 401 for auth errors
            PublishError::AuthError(_) => StatusCode::UNAUTHORIZED,
        }
    }
}


pub async fn publish_newsletter(/* */) -> Result<HttpResponse, PublishError> {
    let _credentials = basic_authentication(request.headers())
        // Bubble up the error, performing the necessary conversion
        .map_err(PublishError::AuthError)?;
    // [...]
}

Our status code assertion is now happy, the header one not yet:

thread 'newsletter::requests_missing_authorization_are_rejected' panicked at 
'no entry found for key "WWW-Authenticate"'

So far it has been enough to specify which status code to return for each error - now we need something more, a header.
We need to change our focus from ResponseError::status_code to ResponseError::error_response:

//! src/routes/newsletters.rs
// [...]
use actix_web::http::{HeaderMap, HeaderValue, StatusCode, header};

impl ResponseError for PublishError {
    fn error_response(&self) -> HttpResponse {
        match self {
            PublishError::UnexpectedError(_) => {
                HttpResponse::new(StatusCode::INTERNAL_SERVER_ERROR)
            }
            PublishError::AuthError(_) => {
                let mut response = HttpResponse::new(StatusCode::UNAUTHORIZED);
                let header_value = HeaderValue::from_str(r#"Basic realm="publish""#)
                    .unwrap();
                response
                    .headers_mut()
                    // actix_web::http::header provides a collection of constants
                    // for the names of several well-known/standard HTTP headers
                    .insert(header::WWW_AUTHENTICATE, header_value);
                response
            }
        }
    }
    
    // `status_code` is invoked by the default `error_response`
    // implementation. We are providing a bespoke `error_response` implementation
    // therefore there is no need to maintain a `status_code` implementation anymore.
}

Our authentication test passes!
A few of the old ones are broken though:

test newsletter::newsletters_are_not_delivered_to_unconfirmed_subscribers ... FAILED
test newsletter::newsletters_are_delivered_to_confirmed_subscribers ... FAILED

thread 'newsletter::newsletters_are_not_delivered_to_unconfirmed_subscribers' 
panicked at 'assertion failed: `(left == right)`
  left: `401`,
 right: `200`'

thread 'newsletter::newsletters_are_delivered_to_confirmed_subscribers' 
panicked at 'assertion failed: `(left == right)`
  left: `401`,
 right: `200`'

POST /newsletters is now rejecting all unauthenticated requests, including the ones we were making in our happy-path black-box tests.
We can stop the bleeding by providing a random combination of username and password:

//! tests/api/helpers.rs
// [...]

impl TestApp {
    pub async fn post_newsletters(&self, body: serde_json::Value) -> reqwest::Response {
        reqwest::Client::new()
            .post(&format!("{}/newsletters", &self.address))
            // Random credentials!
            // `reqwest` does all the encoding/formatting heavy-lifting for us.
            .basic_auth(Uuid::new_v4().to_string(), Some(Uuid::new_v4().to_string()))
            .json(&body)
            .send()
            .await
            .expect("Failed to execute request.")
    }
    
    // [...]
}

The test suite is green again.

3.2. Password Verification - Naive Approach

An authentication layer that accepts random credentials is... not ideal.
We need to start validating the credentials we are extracting from the Authorization header - they should be compared to a list of known users.

We will create a new users Postgres table to store this list:

sqlx migrate add create_users_table

A first draft for the schema might look like this:

-- migrations/20210815112026_create_users_table.sql 
CREATE TABLE users(
   user_id uuid PRIMARY KEY,
   username TEXT NOT NULL UNIQUE,
   password TEXT NOT NULL
);

We can then update our handler to query it every time we perform authentication:

//! src/routes/newsletters.rs
// [...]

async fn validate_credentials(
    credentials: Credentials,
    pool: &PgPool,
) -> Result<uuid::Uuid, PublishError> {
    let user_id: Option<_> = sqlx::query!(
        r#"
        SELECT user_id
        FROM users
        WHERE username = $1 AND password = $2
        "#,
        credentials.username,
        credentials.password
    )
    .fetch_optional(pool)
    .await
    .context("Failed to perform a query to validate auth credentials.")
    .map_err(PublishError::UnexpectedError)?;

    user_id
        .map(|row| row.user_id)
        .ok_or_else(|| anyhow::anyhow!("Invalid username or password."))
        .map_err(PublishError::AuthError)
}

pub async fn publish_newsletter(/* */) -> Result<HttpResponse, PublishError> {
    let credentials = basic_authentication(request.headers())
        .map_err(PublishError::AuthError)?;
    let user_id = validate_credentials(credentials, &pool).await?;
    // [...]
}

It would be a good idea to record who is calling POST /newsletters - let's add a tracing span around our handler:

//! src/routes/newsletters.rs
// [...]

#[tracing::instrument(
    name = "Publish a newsletter issue",
    skip(body, pool, email_client, request),
    fields(username=tracing::field::Empty, user_id=tracing::field::Empty)
)]
pub async fn publish_newsletter(/* */) -> Result<HttpResponse, PublishError> {
    let credentials = basic_authentication(request.headers())
        .map_err(PublishError::AuthError)?;
    tracing::Span::current().record(
        "username",
        &tracing::field::display(&credentials.username)
    );
    let user_id = validate_credentials(credentials, &pool).await?;
    tracing::Span::current().record("user_id", &tracing::field::display(&user_id));
    // [...]
}

We now need to update our happy-path tests to specify a username-password pair that is accepted by validate_credentials.
We will generate a test user for every instance of our test application. We have not yet implemented a sign-up flow for newsletter editors, therefore we cannot go for a fully black-box approach - for the time being we will inject the test user details directly into the database:

//! tests/api/helpers.rs
// [...]

pub async fn spawn_app() -> TestApp {
    // [...]

    let test_app = TestApp {/* */};
    add_test_user(&test_app.db_pool).await;
    test_app
}


async fn add_test_user(pool: &PgPool) {
    sqlx::query!(
        "INSERT INTO users (user_id, username, password)
        VALUES ($1, $2, $3)",
        Uuid::new_v4(),
        Uuid::new_v4().to_string(),
        Uuid::new_v4().to_string(),
    )
    .execute(pool)
    .await
    .expect("Failed to create test users.");
}

TestApp will provide a helper method to retrieve its username and password

//! tests/api/helpers.rs
// [...]

impl TestApp {
    // [...]

    pub async fn test_user(&self) -> (String, String) {
        let row = sqlx::query!("SELECT username, password FROM users LIMIT 1",)
            .fetch_one(&self.db_pool)
            .await
            .expect("Failed to create test users.");
        (row.username, row.password)
    }
}

which we will then be calling from our post_newsletters method, instead of using random credentials:

//! tests/api/helpers.rs
// [...]

impl TestApp {
    // [...]

    pub async fn post_newsletters(&self, body: serde_json::Value) -> reqwest::Response {
        let (username, password) = self.test_user().await;
        reqwest::Client::new()
            .post(&format!("{}/newsletters", &self.address))
            // No longer randomly generated on the spot!
            .basic_auth(username, Some(password))
            .json(&body)
            .send()
            .await
            .expect("Failed to execute request.")
    }
}

All our tests are passing now.

3.3. Password Storage

Storing raw user passwords in your database is not a good idea.

An attacker with access to your stored data can immediately start impersonating your users - both usernames and passwords are ready to go.
They don't even have to compromise your live database - an unencrypted backup is enough.

3.3.1. No Need To Store Raw Passwords

Why are we even storing passwords in the first place?
We need to perform an equality check - every time a user tries to authenticate we verify that the password they provided matches the password we were expecting.

If equality is all we care about, we can start devising a more sophisticated strategy.
We could, for example, transform the passwords by applying a function before comparing them.

All deterministic functions return the same output given the same input.
Let f be our deterministic function: psw_candidate == expected_psw implies f(psw_candidate) == f(expected_psw).
This is not enough though - what if f returned hello for every possible input string? Password verification would succeed no matter the input provided.

We need to go in the opposite direction: if f(psw_candidate) == f(expected_psw) then psw_candidate == expected_psw.
This is possible assuming that our function f has an additional property: it must be injective - if x != y then f(x) != f(y).

If we had such a function f, we could avoid storing the raw password altogether: when a user signs up, we compute f(password) and store it in our database. password is discarded.
When the same user tries to sign in, we compute f(psw_candidate) and check that it matches the f(password) value we stored during sign-up. The raw password is never persisted.

Does this actually improve our security posture?
It depends on f!

It is not that difficult to define an injective function - the reverse function, f("hello") = "olleh", satisfies our criteria. It is equally easy to guess how to invert the transformation to recover the original password - it doesn't hinder an attacker.
We could make the transformation a lot more complicated - complicated enough to make it cumbersome for an attacker to find the inverse transformation.
Even that might not be enough. It is often sufficient for an attacker to be able to recover some properties of the input (e.g. length) from the output to mount, for example, a targeted brute-force attack.
We need something stronger - there should be no relationship between how similar two inputs x and y are and how similar the corresponding outputs f(x) and f(y) are.
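A toy sketch can make the problem tangible. It uses the string-reversal function from the paragraph above as f; StoredUser, sign_up and verify are hypothetical names introduced only for this illustration.

```rust
/// Toy transformation from the text: reverse the string.
/// It is injective, but trivially invertible.
fn f(password: &str) -> String {
    password.chars().rev().collect()
}

/// Hypothetical record persisted at sign-up: the raw password is never stored.
struct StoredUser {
    username: String,
    transformed_password: String,
}

fn sign_up(username: &str, password: &str) -> StoredUser {
    StoredUser {
        username: username.to_string(),
        transformed_password: f(password),
    }
}

/// Sign-in: transform the candidate and compare it with the stored value.
fn verify(user: &StoredUser, candidate: &str) -> bool {
    f(candidate) == user.transformed_password
}

fn main() {
    let user = sign_up("alice", "hello");
    println!("stored for {}: {}", user.username, user.transformed_password);
    println!("correct password accepted: {}", verify(&user, "hello"));
    println!("wrong password accepted: {}", verify(&user, "hellO"));

    // An attacker holding the stored value simply applies the inverse
    // transformation - for string reversal, that is `f` itself.
    let recovered = f(&user.transformed_password);
    println!("recovered by the attacker: {}", recovered);
}
```

Verification works as intended, yet the stored value gives the password away: both its length and its content can be recovered immediately.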

We want a cryptographic hash function.
Hash functions map strings from the input space to fixed-length outputs.
The adjective cryptographic refers to the uniformity property we were just discussing, also known as the avalanche effect: a tiny difference in inputs leads to outputs so different that they look uncorrelated.

There is a caveat: hash functions are not injective2 - there is a tiny risk of collisions. If f(x) == f(y) there is a high probability (not 100%!) that x == y.

3.3.2. Using A Cryptographic Hash

Enough with the theory - let's update our implementation to hash passwords before storing them.

There are several cryptographic hash functions out there - MD5, SHA-1, SHA-2, SHA-3, KangarooTwelve, etc.
We are not going to delve deep into the pros and cons of each algorithm - it is pointless when it comes to passwords, for reasons that will become clear in a few pages.
For the sake of this section, let's move forward with SHA-3, the latest addition to the Secure Hash Algorithms family.

On top of the algorithm, we also need to choose the output size - e.g. SHA3-224 uses the SHA-3 algorithm to produce a fixed-sized output of 224 bits.
The options are 224, 256, 384 and 512. The longer the output, the more unlikely we are to experience a collision. On the flip side, we will need more storage and consume more bandwidth by using longer hashes.
SHA3-256 should be more than enough for our use case.

The Rust Crypto organization provides an implementation of SHA-3, the sha3 crate. Let's add it to our dependencies:

#! Cargo.toml
#! [...]

[dependencies]
# [...]
sha3 = "0.9"

For clarity, let's rename our password column to password_hash:

sqlx migrate add rename_password_column

-- migrations/20210815112028_rename_password_column.sql
ALTER TABLE users RENAME password TO password_hash;

Our project should stop compiling:

error: error returned from database: column "password" does not exist
  --> src/routes/newsletters.rs
   |
90 |       let user_id: Option<_> = sqlx::query!(
   |  ______________________________^
91 | |         r#"
92 | |         SELECT user_id
93 | |         FROM users
...  |
97 | |         credentials.password
98 | |     )
   | |_____^

sqlx::query! spotted that one of our queries is using a column that no longer exists in the current schema.
Compile-time verification of SQL queries is quite neat, isn't it?

Our validate_credentials function looks like this:

//! src/routes/newsletters.rs
//! [...]

async fn validate_credentials(
    credentials: Credentials,
    pool: &PgPool,
) -> Result<uuid::Uuid, PublishError> {
    let user_id: Option<_> = sqlx::query!(
        r#"
        SELECT user_id
        FROM users
        WHERE username = $1 AND password = $2
        "#,
        credentials.username,
        credentials.password
    )
    // [...]
}

Let's update it to work with hashed passwords:

//! src/routes/newsletters.rs
//! [...]
use sha3::Digest;

async fn validate_credentials(/* */) -> Result<uuid::Uuid, PublishError> {
    let password_hash = sha3::Sha3_256::digest(credentials.password.as_bytes());
    let user_id: Option<_> = sqlx::query!(
        r#"
        SELECT user_id
        FROM users
        WHERE username = $1 AND password_hash = $2
        "#,
        credentials.username,
        password_hash
    )
    // [...]
}

Unfortunately, it will not compile straight away:

error[E0308]: mismatched types
  --> src/routes/newsletters.rs:99:9
   |
99 |         password_hash
   |         ^^^^^^^^^^^^^ expected `&str`, found struct `GenericArray`
   |
   = note: expected reference `&str`
                 found struct `GenericArray<u8, UInt<..>>`

Digest::digest returns a fixed-length array of bytes, while our password_hash column is of type TEXT, a string.
We could change the schema of the users table to store password_hash as binary. Alternatively, we can encode the bytes returned by Digest::digest in a string using the hexadecimal format.

Let's spare ourselves another migration by using the second option:

//! [...]

async fn validate_credentials(/* */) -> Result<uuid::Uuid, PublishError> {
    let password_hash = sha3::Sha3_256::digest(credentials.password.as_bytes());
    // Lowercase hexadecimal encoding.
    let password_hash = format!("{:x}", password_hash);
    // [...]
}
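If you are curious about what the lowercase hexadecimal encoding does under the hood, here is a minimal standard-library sketch. to_lower_hex is a hypothetical helper and the byte array is a made-up stand-in for a digest - it is not the output of SHA3-256.

```rust
use std::fmt::Write;

/// Hypothetical helper: encode bytes as a lowercase hexadecimal string,
/// two characters per byte.
fn to_lower_hex(bytes: &[u8]) -> String {
    let mut out = String::with_capacity(bytes.len() * 2);
    for byte in bytes {
        // `{:02x}` pads each byte to two digits, so `0x0f` becomes "0f".
        write!(out, "{:02x}", byte).expect("Writing to a String cannot fail.");
    }
    out
}

fn main() {
    // Made-up bytes standing in for the output of `Digest::digest`.
    let digest_like: [u8; 4] = [0x00, 0x0f, 0xa5, 0xff];
    println!("{}", to_lower_hex(&digest_like));
}
```

Each byte becomes two characters: a 32-byte digest, like the one returned by SHA3-256, turns into a 64-character string.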

The application code should compile now. The test suite, instead, requires a bit more work.
The test_user helper method was recovering a set of valid credentials by querying the users table - this is no longer viable now that we are storing hashes instead of raw passwords!

//! tests/api/helpers.rs
//! [...]
 
impl TestApp {
    // [...]
    
    pub async fn test_user(&self) -> (String, String) {
        let row = sqlx::query!("SELECT username, password FROM users LIMIT 1",)
            .fetch_one(&self.db_pool)
            .await
            .expect("Failed to create test users.");
        (row.username, row.password)
    }
}

pub async fn spawn_app() -> TestApp {
    // [...]
    let test_app = TestApp {/* */};
    add_test_user(&test_app.db_pool).await;
    test_app
}

async fn add_test_user(pool: &PgPool) {
    sqlx::query!(
        "INSERT INTO users (user_id, username, password)
        VALUES ($1, $2, $3)",
        Uuid::new_v4(),
        Uuid::new_v4().to_string(),
        Uuid::new_v4().to_string(),
    )
    .execute(pool)
    .await
    .expect("Failed to create test users.");
}

We need TestApp to store the randomly generated password in order for us to access it in our helper methods.
Let's start by creating a new helper struct, TestUser:

//! tests/api/helpers.rs
//! [...]
use sha3::Digest;

pub struct TestUser {
    pub user_id: Uuid,
    pub username: String,
    pub password: String
}

impl TestUser {
    pub fn generate() -> Self {
        Self {
            user_id: Uuid::new_v4(),
            username: Uuid::new_v4().to_string(),
            password: Uuid::new_v4().to_string()
        }
    }

    async fn store(&self, pool: &PgPool) {
        let password_hash = sha3::Sha3_256::digest(self.password.as_bytes());
        let password_hash = format!("{:x}", password_hash);
        sqlx::query!(
            "INSERT INTO users (user_id, username, password_hash)
            VALUES ($1, $2, $3)",
            self.user_id,
            self.username,
            password_hash,
        )
        .execute(pool)
        .await
        .expect("Failed to store test user.");
    }
}

We can then attach an instance of TestUser to TestApp, as a new field:

//! tests/api/helpers.rs
//! [...]

pub struct TestApp {
    // [...]
    test_user: TestUser
}

pub async fn spawn_app() -> TestApp {
    // [...]
    let test_app = TestApp {
        // [...]
        test_user: TestUser::generate()
    };
    test_app.test_user.store(&test_app.db_pool).await;
    test_app
}

To finish, let's delete add_test_user, TestApp::test_user and update TestApp::post_newsletters:

//! tests/api/helpers.rs
//! [...]

impl TestApp {
    // [...]
    pub async fn post_newsletters(&self, body: serde_json::Value) -> reqwest::Response {
        reqwest::Client::new()
            .post(&format!("{}/newsletters", &self.address))
            .basic_auth(&self.test_user.username, Some(&self.test_user.password))
            // [...]
    }
}

The test suite should now compile and run successfully.

3.3.3. Preimage Attack

Is SHA3-256 enough to protect our users' passwords if an attacker gets their hands on our users table?

Let's imagine that the attacker wants to crack a specific password hash in our database.
The attacker does not even need to retrieve the original password. To authenticate successfully they just need to find an input string s whose SHA3-256 hash matches the password hash they are trying to crack - in other words, a collision.
This is known as a preimage attack.

How hard is it?

The math is a bit tricky, but a brute-force attack has an exponential time complexity - 2^n, where n is the hash length in bits.
If n > 128, it is considered unfeasible to compute.
Unless a vulnerability is found in SHA-3, we do not need to worry about preimage attacks against SHA3-256.

3.3.4. Naive Dictionary Attack

We are not hashing arbitrary inputs though - we can reduce the search space by making some assumptions on the original password: how long was it? What symbols were used?
Let's imagine we are looking for an alphanumeric password that is shorter than 17 characters3.

We can count the number of password candidates:

// (26 letters + 10 number symbols) ^ Password Length
// for all allowed password lengths
36^1 +
36^2 +
... +
36^16 

It sums up to roughly 8 * 10^24 possibilities.
I wasn't able to find data specifically on SHA3-256, but researchers managed to compute ~900 million SHA3-512 hashes per second using a Graphics Processing Unit (GPU).

Assuming a hash rate of ~10^9 per second, it would take us ~8 * 10^15 seconds to hash all password candidates. For comparison, the approximate age of the universe is 4 * 10^17 seconds.
Even if we were to parallelise our search using a million GPUs, it would still take ~8 * 10^9 seconds - roughly 260 years4.
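This back-of-the-envelope estimate is easy to reproduce. The sketch below is only an illustration: candidate_count is a hypothetical helper, while the hash rate (10^9 per second per GPU) and the number of GPUs (one million) are the assumptions stated in the text.

```rust
/// Hypothetical helper: number of passwords with length between 1 and
/// `max_len`, drawn from an alphabet of `alphabet_size` symbols.
fn candidate_count(alphabet_size: u128, max_len: u32) -> u128 {
    (1..=max_len).map(|len| alphabet_size.pow(len)).sum()
}

fn main() {
    // 26 letters + 10 digits, passwords shorter than 17 characters.
    let candidates = candidate_count(36, 16);

    // Assumptions taken from the text, not measurements.
    let hashes_per_second: u128 = 1_000_000_000;
    let gpus: u128 = 1_000_000;
    let seconds_per_year: u128 = 365 * 24 * 60 * 60;

    let single_gpu_seconds = candidates / hashes_per_second;
    let cluster_seconds = single_gpu_seconds / gpus;

    println!("password candidates:    {}", candidates);
    println!("seconds on one GPU:     {}", single_gpu_seconds);
    println!("seconds on 10^6 GPUs:   {}", cluster_seconds);
    println!("years on 10^6 GPUs:     {}", cluster_seconds / seconds_per_year);
}
```

u128 is used because the number of candidates does not fit in a 64-bit integer. Changing the alphabet size or the maximum length is a quick way to see how fast the search space grows or shrinks.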

3.3.5. Dictionary Attack

Let's go back to what we discussed at the very beginning of this chapter - it is impossible for a person to remember a unique password for hundreds of online services.
Either they rely on a password manager, or they are re-using one or more passwords across multiple accounts.

Furthermore, most passwords are far from being random, even when reused - common words, full names, dates, names of popular sports teams, etc.
An attacker could easily design a simple algorithm to generate thousands of plausible passwords - but they do not have to. They can look at a password dataset from one of the many security breaches from the last decade to find the most common passwords in the wild.

In a couple of minutes they can pre-compute the SHA3-256 hash of the most commonly used 10 million passwords. Then they start scanning our database looking for a match.

This is known as a dictionary attack - and it's extremely effective.

All the cryptographic hash functions we mentioned so far are designed to be fast.
Fast enough to enable anybody to pull off a dictionary attack without having to use specialised hardware.
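To see why speed matters, here is a toy sketch of the attack. It relies on a loud assumption: fast_hash wraps the standard library's DefaultHasher, which is not a cryptographic hash and only stands in for a fast function such as SHA3-256. build_lookup and crack are hypothetical helpers, and the dictionary has three entries instead of ten million.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Toy stand-in for a fast hash function. NOT cryptographic (assumption
/// made for illustration only).
fn fast_hash(password: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    password.hash(&mut hasher);
    hasher.finish()
}

/// Pre-compute a `hash -> password` table for a list of common passwords.
fn build_lookup<'p>(common_passwords: &[&'p str]) -> HashMap<u64, &'p str> {
    common_passwords
        .iter()
        .map(|password| (fast_hash(password), *password))
        .collect()
}

/// Scan the stolen `(username, hash)` pairs looking for known hashes.
fn crack<'u, 'p>(
    stolen: &[(&'u str, u64)],
    lookup: &HashMap<u64, &'p str>,
) -> Vec<(&'u str, &'p str)> {
    stolen
        .iter()
        .filter_map(|(username, hash)| lookup.get(hash).map(|password| (*username, *password)))
        .collect()
}

fn main() {
    // The table is computed once and reused against every stolen hash.
    let lookup = build_lookup(&["123456", "password", "qwerty"]);

    // What the attacker found in the leaked table.
    let stolen = vec![
        ("alice", fast_hash("qwerty")),
        ("bob", fast_hash("correct horse battery staple")),
    ];

    for (username, password) in crack(&stolen, &lookup) {
        println!("{} uses the password {:?}", username, password);
    }
}
```

The expensive part, hashing the dictionary, happens once; every stolen hash is then checked with a single lookup. Only the user who picked a common password is exposed.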

We need something much slower, but with the same set of mathematical properties as cryptographic hash functions.

3.3.6. Argon2

The Open Web Application Security Project (OWASP)5 provides useful guidance on safe password storage - with a whole section on how to choose the correct hashing algorithm:

  • Use Argon2id with a minimum configuration of 15 MiB of memory, an iteration count of 2, and 1 degree of parallelism.
  • If Argon2id is not available, use bcrypt with a work factor of 10 or more and with a password limit of 72 bytes.
  • For legacy systems using scrypt, use a minimum CPU/memory cost parameter of (2^16), a minimum block size of 8 (1024 bytes), and a parallelization parameter of 1.
  • If FIPS-140 compliance is required, use PBKDF2 with a work factor of 310,000 or more and set with an internal hash function of HMAC-SHA-256.
  • Consider using a pepper to provide additional defense in depth (though alone, it provides no additional secure characteristics).

All these options - Argon2, bcrypt, scrypt, PBKDF2 - are designed to be computationally demanding.
They also expose configuration parameters (e.g. work factor for bcrypt) to further slow down hash computation: application developers can tune a few knobs to keep up with hardware speed-ups - no need to migrate to newer algorithms every couple of years.

Let's replace SHA-3 with Argon2id, as recommended by OWASP.
The Rust Crypto organization has got us covered once again - they provide a pure-Rust implementation, argon2.

Let's add it to our dependencies:

#! Cargo.toml
#! [...]

[dependencies]
# [...]
argon2 = { version = "0.3", features = ["std"] }

To hash a password we need to create an Argon2 struct instance.
The new method signature looks like this:

//! argon2/lib.rs
/// [...]
 
impl<'key> Argon2<'key> {
    /// Create a new Argon2 context.
    pub fn new(algorithm: Algorithm, version: Version, params: Params) -> Self {
        // [...]
    }
    // [...]
}

Algorithm is an enum: it lets us select which variant of Argon2 we want to use - Argon2d, Argon2i, Argon2id. To comply with OWASP's recommendation we will go for Algorithm::Argon2id.

Version fulfills a similar purpose - we will go for the most recent, Version::V0x13.

What about Params?
Params::new specifies all the mandatory parameters we need to provide to build one:

//! argon2/params.rs
// [...]

/// Create new parameters.
pub fn new(
    m_cost: u32, 
    t_cost: u32, 
    p_cost: u32, 
    output_len: Option<usize>
) -> Result<Self> {
    // [...]
}

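m_cost, t_cost and p_cost map to OWASP's requirements:

  • m_cost is the memory size, expressed in kilobytes;
  • t_cost is the number of iterations;
  • p_cost is the degree of parallelism.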

output_len, instead, determines the length of the returned hash - if omitted, it will default to 32 bytes. That is equal to 256 bits, the same hash length we were getting via SHA3-256.

We know enough, at this point, to build one:

//! src/routes/newsletters.rs
use argon2::{Algorithm, Argon2, Version, Params};
// [...]

async fn validate_credentials(
    credentials: Credentials,
    pool: &PgPool,
) -> Result<uuid::Uuid, PublishError> {
    let hasher = Argon2::new(
        Algorithm::Argon2id,
        Version::V0x13,
        Params::new(15000, 2, 1, None)
            .context("Failed to build Argon2 parameters")
            .map_err(PublishError::UnexpectedError)?,
    );
    let password_hash = sha3::Sha3_256::digest(credentials.password.as_bytes());
    // [...]
}

Argon2 implements the PasswordHasher trait:

//! password_hash/traits.rs

pub trait PasswordHasher {
    // [...]
    fn hash_password<'a, S>(
        &self, 
        password: &[u8], 
        salt: &'a S
    ) -> Result<PasswordHash<'a>>
    where
        S: AsRef<str> + ?Sized;
}

It is a re-export from the password-hash crate, a unified interface to work with password hashes backed by a variety of algorithms (currently Argon2, PBKDF2 and scrypt).

PasswordHasher::hash_password is a bit different from Sha3_256::digest - it is asking for an additional parameter on top of the raw password, a salt.

3.3.7. Salting

Argon2 is a lot slower than SHA-3, but this is not enough to make a dictionary attack unfeasible. It takes longer to hash the most common 10 million passwords, but not prohibitively long.

What if, though, the attacker had to rehash the whole dictionary for every user in our database?
It becomes a lot more challenging!

That is what salting accomplishes. For each user, we generate a unique random string - the salt.
The salt is prepended to the user password before generating the hash. PasswordHasher::hash_password takes care of the prepending business for us.

The salt is stored next to the password hash, in our database.
If an attacker gets their hands on a database backup, they will have access to all salts[6].
But they have to compute dictionary_size * n_users hashes instead of dictionary_size. Furthermore, pre-computing the hashes is no longer an option - this buys us time to detect the breach and take action (e.g. force a password reset for all users).
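
To put rough numbers on that claim, here is a back-of-the-envelope sketch. The 10 million entries come from the dictionary mentioned above; the user count and the per-hash cost are assumptions picked purely for illustration, and a real attacker would parallelise the work.

```rust
/// Number of hashes an attacker must compute to try every dictionary
/// entry against every user, with and without per-user salts.
pub fn hashes_required(dictionary_size: u64, n_users: u64, salted: bool) -> u64 {
    if salted {
        // Each user has a different salt: the dictionary must be
        // rehashed from scratch for every single user.
        dictionary_size * n_users
    } else {
        // Without salts, one pass over the dictionary covers all users
        // (and it can be pre-computed before the breach).
        dictionary_size
    }
}

/// Rough single-core duration, in seconds, for a given per-hash cost.
/// `millis_per_hash` is an assumed figure, not a measurement.
pub fn seconds_required(hashes: u64, millis_per_hash: u64) -> u64 {
    hashes * millis_per_hash / 1000
}
```

With 10 million dictionary entries, 1,000 users and an assumed 10ms per hash, the unsalted attack needs about 100,000 seconds of single-core work (a little over a day), while the salted one needs about 100,000,000 seconds (roughly three years).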

Let's add a salt column to our users table:

sqlx migrate add add_salt_to_users

-- migrations/20210815112111_add_salt_to_users.sql 
ALTER TABLE users ADD COLUMN salt TEXT NOT NULL;

We can no longer compute the hash before querying the users table - we need to retrieve the salt first.
Let's shuffle operations around:

//! src/routes/newsletters.rs
// [...]
use argon2::PasswordHasher;

async fn validate_credentials(
    credentials: Credentials,
    pool: &PgPool,
) -> Result<uuid::Uuid, PublishError> {
    let hasher = argon2::Argon2::new(/* */);
    let row: Option<_> = sqlx::query!(
        r#"
        SELECT user_id, password_hash, salt
        FROM users
        WHERE username = $1
        "#,
        credentials.username,
    )
    .fetch_optional(pool)
    .await
    .context("Failed to perform a query to retrieve stored credentials.")
    .map_err(PublishError::UnexpectedError)?;

    let (expected_password_hash, user_id, salt) = match row {
        Some(row) => (row.password_hash, row.user_id, row.salt),
        None => {
            return Err(PublishError::AuthError(anyhow::anyhow!(
                "Unknown username."
            )));
        }
    };

    let password_hash = hasher
        .hash_password(credentials.password.as_bytes(), &salt)
        .context("Failed to hash password")
        .map_err(PublishError::UnexpectedError)?;
    
    let password_hash = format!("{:x}", password_hash.hash.unwrap());

    if password_hash != expected_password_hash {
        Err(PublishError::AuthError(anyhow::anyhow!(
            "Invalid password."
        )))
    } else {
        Ok(user_id)
    }
}

Unfortunately, this does not compile:

error[E0277]: the trait bound 
`argon2::password_hash::Output: LowerHex` is not satisfied
   --> src/routes/newsletters.rs
    |
125 |     let password_hash = format!("{:x}", password_hash.hash.unwrap());
    |                                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^ 
    the trait `LowerHex` is not implemented for `argon2::password_hash::Output`

Output provides other methods to obtain a string representation - e.g. Output::b64_encode. It would work, as long as we are happy to change the assumed encoding for hashes stored in our database.

Given that a change is necessary, we can shoot for something better than base64-encoding.

3.3.8. PHC String Format

To authenticate a user, we need reproducibility: we must run the very same hashing routine every single time.
Salt and password are just a subset of the inputs for Argon2id. All the other load parameters (t_cost, m_cost, p_cost) are equally important to obtain the same hash given the same pair of salt and password.

If we store a base64-encoded representation of the hash, we are making a strong implicit assumption: all values stored in the password_hash column have been computed using the same load parameters.

As we discussed a few sections ago, hardware capabilities evolve over time: application developers are expected to keep up by increasing the computational cost of hashing using higher load parameters.
What happens when you have to migrate your stored passwords to a newer hashing configuration?

To keep authenticating old users we must store, next to each hash, the exact set of load parameters used to compute it.
This allows for a seamless migration between two different load configurations: when an old user authenticates, we verify password validity using the stored load parameters; we then recompute the password hash using the new load parameters and update the stored information accordingly.
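
The decision taken at login time can be sketched as follows. `LoadParams`, `LoginOutcome` and `login_outcome` are hypothetical names introduced for this illustration; they are not part of the argon2 crate, and the actual password check is assumed to have happened already.

```rust
/// Hypothetical container for the Argon2 load parameters.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct LoadParams {
    pub m_cost: u32,
    pub t_cost: u32,
    pub p_cost: u32,
}

/// What the login flow should do once the password check has run.
#[derive(Debug, PartialEq, Eq)]
pub enum LoginOutcome {
    /// The password did not match the stored hash.
    Rejected,
    /// The password matched and the stored hash is up to date.
    Accepted,
    /// The password matched, but the hash was computed with outdated
    /// parameters: recompute it with the target ones and store it.
    AcceptedNeedsRehash,
}

/// `password_matches` is the result of verifying the candidate using
/// the parameters stored next to the hash (`stored`).
pub fn login_outcome(
    password_matches: bool,
    stored: LoadParams,
    target: LoadParams,
) -> LoginOutcome {
    if !password_matches {
        LoginOutcome::Rejected
    } else if stored != target {
        LoginOutcome::AcceptedNeedsRehash
    } else {
        LoginOutcome::Accepted
    }
}
```

The rehash can only happen at login time because that is the only moment the server sees the raw password.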

We could go for the naive approach - add three new columns to our users table: t_cost, m_cost and p_cost.
It would work, as long as the algorithm remains Argon2id.

What happens if a vulnerability is found in Argon2id and we are forced to migrate away from it?
We'd probably want to add an algorithm column, as well as new columns to store the load parameters of Argon2id's replacement.

It can be done, but it is tedious.
Luckily enough, there is a better solution: the PHC string format. The PHC string format provides a standard representation for a password hash: it includes the hash itself, the salt, the algorithm and all its associated parameters.

Using the PHC string format, an Argon2id password hash looks like this:

# ${algorithm}${algorithm version}${,-separated algorithm parameters}${salt}${hash}
$argon2id$v=19$m=65536,t=2,p=1$gZiV/M1gPc22ElAH/Jh1Hw$CWOrkoo7oJBQ/iyh7uJ0LO2aLEfrHwTWllSAxT0zRno
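
To see what each `$`-delimited field carries, here is a deliberately simplified splitter. It is an illustration only: `PhcFields` and `split_phc` are hypothetical names, the sketch assumes all five fields are present (the actual format allows some of them to be omitted), and it performs none of the validation a real parser needs. In the application we rely on the parser shipped with the argon2 crate instead.

```rust
/// The five fields of a fully-populated PHC string, borrowed from the input.
#[derive(Debug, PartialEq, Eq)]
pub struct PhcFields<'a> {
    pub algorithm: &'a str,
    pub version: &'a str,
    pub params: Vec<(&'a str, &'a str)>,
    pub salt: &'a str,
    pub hash: &'a str,
}

/// Split `$algorithm$v=version$k=v,k=v$salt$hash` into its fields.
/// Returns `None` if the string does not have exactly that shape.
pub fn split_phc(phc: &str) -> Option<PhcFields<'_>> {
    // The string starts with `$`, then fields are `$`-separated.
    let mut fields = phc.strip_prefix('$')?.split('$');
    let algorithm = fields.next()?;
    let version = fields.next()?.strip_prefix("v=")?;
    // Parameters are `,`-separated `name=value` pairs.
    let params = fields
        .next()?
        .split(',')
        .map(|pair| pair.split_once('='))
        .collect::<Option<Vec<_>>>()?;
    let salt = fields.next()?;
    let hash = fields.next()?;
    if fields.next().is_some() {
        return None;
    }
    Some(PhcFields { algorithm, version, params, salt, hash })
}
```

Applied to the example above, the fourth field (`gZiV/M1gPc22ElAH/Jh1Hw`) is the salt and the last one is the hash itself, both base64-encoded.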

The argon2 crate exposes PasswordHash, a Rust implementation of the PHC format:

//! argon2/lib.rs
// [...]

pub struct PasswordHash<'a> {
    pub algorithm: Ident<'a>,
    pub version: Option<Decimal>,
    pub params: ParamsString,
    pub salt: Option<Salt<'a>>,
    pub hash: Option<Output>,
}

Storing password hashes in PHC string format spares us from having to initialise the Argon2 struct using explicit parameters[7].
We can rely on Argon2's implementation of the PasswordVerifier trait:

pub trait PasswordVerifier {
    fn verify_password(
        &self,
        password: &[u8],
        hash: &PasswordHash<'_>
    ) -> Result<()>;
}

By passing the expected hash via PasswordHash, Argon2 can automatically infer what load parameters and salt should be used to verify if the password candidate is a match[8].

Let's update our implementation:

//! src/routes/newsletters.rs
use argon2::{Argon2, PasswordHash, PasswordVerifier};
// [...]


async fn validate_credentials(
    credentials: Credentials,
    pool: &PgPool,
) -> Result<uuid::Uuid, PublishError> {
    let row: Option<_> = sqlx::query!(
        r#"
        SELECT user_id, password_hash
        FROM users
        WHERE username = $1
        "#,
        credentials.username,
    )
    .fetch_optional(pool)
    .await
    .context("Failed to perform a query to retrieve stored credentials.")
    .map_err(PublishError::UnexpectedError)?;

    let (expected_password_hash, user_id) = match row {
        Some(row) => (row.password_hash, row.user_id),
        None => {
            return Err(PublishError::AuthError(anyhow::anyhow!(
                "Unknown username."
            )))
        }
    };

    let expected_password_hash = PasswordHash::new(&expected_password_hash)
        .context("Failed to parse hash in PHC string format.")
        .map_err(PublishError::UnexpectedError)?;

    Argon2::default()
        .verify_password(credentials.password.as_bytes(), &expected_password_hash)
        .context("Invalid password.")
        .map_err(PublishError::AuthError)?;

    Ok(user_id)
}

It compiles successfully.
You might have also noticed that we no longer deal with the salt directly - PHC string format takes care of it for us, implicitly.
We can get rid of the salt column entirely:

sqlx migrate add remove_salt_from_users

-- migrations/20210815112222_remove_salt_from_users.sql 
ALTER TABLE users DROP COLUMN salt;

What about our tests?
Two of them are failing:

---- newsletter::newsletters_are_not_delivered_to_unconfirmed_subscribers stdout ----
'newsletter::newsletters_are_not_delivered_to_unconfirmed_subscribers' panicked at 
'assertion failed: `(left == right)`
  left: `500`,
 right: `200`',

---- newsletter::newsletters_are_delivered_to_confirmed_subscribers stdout ----
'newsletter::newsletters_are_delivered_to_confirmed_subscribers' panicked at 
'assertion failed: `(left == right)`
  left: `500`,

We can look at logs to figure out what is wrong:

TEST_LOG=true cargo t newsletters_are_not_delivered | bunyan
[2021-08-29T20:14:50.367Z] ERROR: [HTTP REQUEST - EVENT] 
  Error encountered while processing the incoming HTTP request: 
  Failed to parse hash in PHC string format.

  Caused by:
     password hash string invalid

Let's look at the password generation code for our test user:

//! tests/api/helpers.rs
// [...]

impl TestUser {
    // [...]
    async fn store(&self, pool: &PgPool) {
        let password_hash = sha3::Sha3_256::digest(self.password.as_bytes());
        let password_hash = format!("{:x}", password_hash);
        // [...]
    }
}

We are still using SHA-3!
Let's update it:

//! tests/api/helpers.rs
use argon2::password_hash::SaltString;
use argon2::{Argon2, PasswordHasher};
// [...]

impl TestUser {
    // [...]
    async fn store(&self, pool: &PgPool) {
        let salt = SaltString::generate(&mut rand::thread_rng());
        // We don't care about the exact Argon2 parameters here
        // given that it's for testing purposes!
        let password_hash = Argon2::default()
            .hash_password(self.password.as_bytes(), &salt)
            .unwrap()
            .to_string();
        // [...]
    }
}

The test suite should pass now.
We have removed all mentions of sha3 from our project - we can now remove it from the list of dependencies in Cargo.toml.

3.4. Do Not Block The Async Executor

How long is it taking to verify user credentials when running our integration tests?
We currently do not have a span around password hashing - let's fix it:

//! src/routes/newsletters.rs
// [...]

#[tracing::instrument(name = "Validate credentials", skip(credentials, pool))]
async fn validate_credentials(
    credentials: Credentials,
    pool: &PgPool,
) -> Result<uuid::Uuid, PublishError> {
    let (user_id, expected_password_hash) = get_stored_credentials(
            &credentials.username, 
            &pool
        )
        .await
        .map_err(PublishError::UnexpectedError)?
        .ok_or_else(|| PublishError::AuthError(anyhow::anyhow!("Unknown username.")))?;

    let expected_password_hash = PasswordHash::new(&expected_password_hash)
        .context("Failed to parse hash in PHC string format.")
        .map_err(PublishError::UnexpectedError)?;

    tracing::info_span!("Verify password hash")
        .in_scope(|| {
            Argon2::default()
                .verify_password(
                    credentials.password.as_bytes(), 
                    &expected_password_hash
                )
        })
        .context("Invalid password.")
        .map_err(PublishError::AuthError)?;

    Ok(user_id)
}

// We extracted the db-querying logic into its own function with its own span.
#[tracing::instrument(name = "Get stored credentials", skip(username, pool))]
async fn get_stored_credentials(
    username: &str,
    pool: &PgPool,
) -> Result<Option<(uuid::Uuid, String)>, anyhow::Error> {
    let row = sqlx::query!(
        r#"
        SELECT user_id, password_hash
        FROM users
        WHERE username = $1
        "#,
        username,
    )
    .fetch_optional(pool)
    .await
    .context("Failed to perform a query to retrieve stored credentials.")?
    .map(|row| (row.user_id, row.password_hash));
    Ok(row)
}

We can now look at the logs from one of our integration tests:

TEST_LOG=true cargo test --quiet --release \
  newsletters_are_delivered | grep "VERIFY PASSWORD" | bunyan
[...]  [VERIFY PASSWORD HASH - END] (elapsed_milliseconds=11, ...)

Roughly 10ms.
This is likely to cause issues under load - the infamous blocking problem.

async/await in Rust is built around a concept called cooperative scheduling.

How does it work?
Let's look at an example:

async fn my_fn() {
    a().await;
    b().await;
    c().await;
}

my_fn returns a Future.
When the future is awaited, our async runtime (tokio) enters into the picture: it starts polling it.

How is poll implemented for the Future returned by my_fn?
You can think of it as a state machine:

enum MyFnFuture {
    Initialized,
    CallingA,
    CallingB,
    CallingC,
    Complete
}

Every time poll is called, it tries to make progress by reaching the next state. E.g. if a().await has returned, we start awaiting b()[9].

We have a different state in MyFnFuture for each .await in our async function body.
This is why .await calls are often named yield points - our future progresses from the previous .await to the next one and then yields control back to the executor.
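
To make the state-machine picture concrete, here is a hand-written sketch built on the standard library alone. It is a simplification, not what the compiler actually generates: it assumes each awaited call is pending exactly once, so every call to poll advances by a single state. `NoopWake` and `count_polls` are hypothetical helpers standing in for a real executor such as tokio.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hand-written stand-in for the future returned by `my_fn`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MyFnFuture {
    Initialized,
    CallingA,
    CallingB,
    CallingC,
    Complete,
}

impl Future for MyFnFuture {
    type Output = ();

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        // Advance by exactly one state, then hand control back.
        let next = match *self {
            MyFnFuture::Initialized => MyFnFuture::CallingA,
            MyFnFuture::CallingA => MyFnFuture::CallingB,
            MyFnFuture::CallingB => MyFnFuture::CallingC,
            MyFnFuture::CallingC | MyFnFuture::Complete => MyFnFuture::Complete,
        };
        *self = next;
        if next == MyFnFuture::Complete {
            Poll::Ready(())
        } else {
            // Yield point: ask to be polled again and return to the executor.
            cx.waker().wake_by_ref();
            Poll::Pending
        }
    }
}

/// A waker that does nothing: our toy "executor" polls in a loop anyway.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Drive the future to completion, counting how many polls it takes.
pub fn count_polls(mut fut: MyFnFuture) -> u32 {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut polls = 0;
    loop {
        polls += 1;
        if Pin::new(&mut fut).poll(&mut cx).is_ready() {
            return polls;
        }
    }
}
```

Starting from `MyFnFuture::Initialized`, the loop needs four polls to reach `Complete`. Between any two of them, a real executor is free to poll a different task.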

The executor can then choose to poll the same future again or to prioritise making progress on another task. This is how async runtimes, like tokio, manage to make progress concurrently on multiple tasks - by continuously parking and resuming each of them.
In a way, you can think of async runtimes as great jugglers.

The underlying assumption is that most async tasks are performing some kind of input-output (IO) work - most of their execution time will be spent waiting on something else to happen (e.g. the operating system notifying us that there is data ready to be read on a socket). We can therefore perform many more tasks concurrently than we could by dedicating a parallel unit of execution (e.g. one thread per OS core) to each task.

This model works great assuming tasks cooperate by frequently yielding control back to the executor.
In other words, poll is expected to be fast - it should return in less than 10-100 microseconds[10]. If a call to poll takes longer (or, even worse, never returns), then the async executor cannot make progress on any other task - this is what people refer to when they say that "a task is blocking the executor/the async thread".

You should always be on the lookout for CPU-intensive workloads that are likely to take longer than 1ms - password hashing is a perfect example.
To play nicely with tokio, we must offload our CPU-intensive task to a separate threadpool using tokio::task::spawn_blocking. Those threads are reserved for blocking operations and do not interfere with the scheduling of async tasks.

actix_web provides a wrapper on top of tokio's spawn_blocking - that's what we will be relying on.

Let's get to work!

//! src/routes/newsletters.rs
// [...]

#[tracing::instrument(name = "Validate credentials", skip(credentials, pool))]
async fn validate_credentials(
    credentials: Credentials,
    pool: &PgPool,
) -> Result<uuid::Uuid, PublishError> {
    // [...]
    actix_web::rt::task::spawn_blocking(move || {
        tracing::info_span!("Verify password hash").in_scope(|| {
            Argon2::default()
                .verify_password(
                    credentials.password.as_bytes(), 
                    &expected_password_hash)
        })
    })
    .await
    // spawn_blocking is fallible - we have a nested Result here!
    .context("Failed to spawn blocking task.")
    .map_err(PublishError::UnexpectedError)?
    .context("Invalid password.")
    .map_err(PublishError::AuthError)?;
    // [...]
}

The borrow checker is not happy:

error[E0597]: `expected_password_hash` does not live long enough
   --> src/routes/newsletters.rs
    |
117 |     PasswordHash::new(&expected_password_hash)
    |     ------------------^^^^^^^^^^^^^^^^^^^^^^^-
    |     |                 |
    |     |                 borrowed value does not live long enough
    |     argument requires that `expected_password_hash` is borrowed for `'static`
...
134 | }
    | - `expected_password_hash` dropped here while still borrowed

We are launching a computation on a separate thread - the thread itself might outlive the async task we are spawning it from. To avoid the issue, spawn_blocking requires its argument to have a 'static lifetime - which is preventing us from passing references to the current function context into the closure.

You might argue - "We are using move || {}, the closure should be taking ownership of expected_password_hash!".
You would be right! But that is not enough.
Let's look again at how PasswordHash is defined:

pub struct PasswordHash<'a> {
    pub algorithm: Ident<'a>,
    pub salt: Option<Salt<'a>>,
    // [...]
}

It holds a reference to the string it was parsed from.
We need to move ownership of the original string into our closure, moving the parsing logic into it as well.
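
The same pattern can be reproduced with the standard library alone. In this sketch `std::thread::spawn` stands in for spawn_blocking (it has a similar `'static` requirement on its closure), and `Parsed` is a hypothetical stand-in for PasswordHash: a struct that borrows from the string it was parsed from.

```rust
use std::thread;

/// Borrowing view over a string, standing in for `PasswordHash<'a>`.
struct Parsed<'a> {
    algorithm: &'a str,
}

/// Extract the first `$`-delimited field, e.g. `argon2id`.
fn parse(raw: &str) -> Option<Parsed<'_>> {
    raw.split('$')
        .nth(1)
        .filter(|field| !field.is_empty())
        .map(|algorithm| Parsed { algorithm })
}

/// Parse `raw` on another thread and return the algorithm name.
pub fn algorithm_on_worker_thread(raw: String) -> Option<String> {
    let handle = thread::spawn(move || {
        // The closure owns `raw`. `Parsed` borrows from it, but that
        // borrow is created and dropped inside the closure, so nothing
        // with a non-'static lifetime crosses the thread boundary.
        parse(&raw).map(|parsed| parsed.algorithm.to_owned())
    });
    // `join` fails only if the worker thread panicked.
    handle.join().ok().flatten()
}
```

Parsing before spawning and moving the `Parsed` value into the closure would be rejected by the compiler for the same reason as our original attempt.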

Let's create a separate function, verify_password_hash, for clarity:

//! src/routes/newsletters.rs
// [...]

#[tracing::instrument(name = "Validate credentials", skip(credentials, pool))]
async fn validate_credentials(
    credentials: Credentials,
    pool: &PgPool,
) -> Result<uuid::Uuid, PublishError> {
    // [...]

    actix_web::rt::task::spawn_blocking(move || {
        verify_password_hash(
            expected_password_hash, 
            credentials.password
        )
    })
    .await
    .context("Failed to spawn blocking task.")
    .map_err(PublishError::UnexpectedError)??;

    Ok(user_id)
}

#[tracing::instrument(
    name = "Verify password hash", 
    skip(expected_password_hash, password_candidate)
)]
fn verify_password_hash(
    expected_password_hash: String,
    password_candidate: String,
) -> Result<(), PublishError> {
    let expected_password_hash = PasswordHash::new(&expected_password_hash)
        .context("Failed to parse hash in PHC string format.")
        .map_err(PublishError::UnexpectedError)?;

    Argon2::default()
        .verify_password(password_candidate.as_bytes(), &expected_password_hash)
        .context("Invalid password.")
        .map_err(PublishError::AuthError)
}

It compiles!

3.4.1. Tracing Context Is Thread-Local

Let's look again at the logs for the verify password hash span:

TEST_LOG=true cargo test --quiet --release \
  newsletters_are_delivered | grep "VERIFY PASSWORD" | bunyan
[2021-08-30T10:03:07.613Z]  [VERIFY PASSWORD HASH - START] 
  (file="...", line="...", target="...")
[2021-08-30T10:03:07.624Z]  [VERIFY PASSWORD HASH - END]
  (file="...", line="...", target="...")

We are missing all the properties that are inherited from the root span of the corresponding request - e.g. request_id, http.method, http.route, etc. Why?

Let's look at tracing's documentation:

Spans form a tree structure — unless it is a root span, all spans have a parent, and may have one or more children. When a new span is created, the current span becomes the new span's parent.

The current span is the one returned by tracing::Span::current() - let's check its documentation:

Returns a handle to the span considered by the Collector to be the current span.

If the collector indicates that it does not track the current span, or that the thread from which this function is called is not currently inside a span, the returned span will be disabled.

"Current span" actually means "active span for the current thread".
That is why we are not inheriting any property: we are spawning our computation on a separate thread and tracing::info_span! does not find any active Span associated with it when it executes.
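
One way to picture this behaviour is with a plain thread-local variable. The sketch below is a toy model, not how tracing is implemented: `CURRENT_SPAN`, `enter` and `current` are hypothetical names, and a span is reduced to a string.

```rust
use std::cell::RefCell;
use std::thread;

thread_local! {
    // Each thread gets its own, initially empty, "current span".
    static CURRENT_SPAN: RefCell<Option<String>> = RefCell::new(None);
}

fn enter(name: &str) {
    CURRENT_SPAN.with(|span| *span.borrow_mut() = Some(name.to_owned()));
}

fn current() -> Option<String> {
    CURRENT_SPAN.with(|span| span.borrow().clone())
}

/// Enter a span on the calling thread, then report which span a freshly
/// spawned worker thread considers to be the current one.
pub fn span_seen_by_worker(propagate: bool) -> Option<String> {
    enter("request");
    // Captured on the parent thread, before spawning.
    let parent = current();
    thread::spawn(move || {
        if propagate {
            // Explicitly re-attach the parent's span on the new thread.
            if let Some(name) = &parent {
                enter(name);
            }
        }
        current()
    })
    .join()
    .expect("worker thread panicked")
}
```

Without propagation the worker thread sees no current span at all; it only sees `request` when the parent's span is captured beforehand and handed over explicitly.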

We can work around the issue by explicitly attaching the current span to the newly spawned thread:

//! src/routes/newsletters.rs
// [...]

#[tracing::instrument(name = "Validate credentials", skip(credentials, pool))]
async fn validate_credentials(
    credentials: Credentials,
    pool: &PgPool,
) -> Result<uuid::Uuid, PublishError> {
    // [...]
    // This executes before spawning the new thread
    let current_span = tracing::Span::current();
    actix_web::rt::task::spawn_blocking(move || {
        // We then pass ownership of it into the closure
        // and explicitly execute all our computation
        // within its scope.
        current_span.in_scope(|| {
            verify_password_hash(/* */)
        })
    })
    // [...]
}

You can verify that it works - we are now getting all the properties we care about.
It is a bit verbose though - let's write a helper function:

//! src/telemetry.rs
use actix_web::rt::task::JoinHandle;
// [...]

// Just copied trait bounds and signature from `spawn_blocking`
pub fn spawn_blocking_with_tracing<F, R>(f: F) -> JoinHandle<R>
where
    F: FnOnce() -> R + Send + 'static,
    R: Send + 'static,
{
    let current_span = tracing::Span::current();
    actix_web::rt::task::spawn_blocking(move || current_span.in_scope(f))
}

//! src/routes/newsletters.rs
use crate::telemetry::spawn_blocking_with_tracing;
// [...]

#[tracing::instrument(name = "Validate credentials", skip(credentials, pool))]
async fn validate_credentials(
    credentials: Credentials,
    pool: &PgPool,
) -> Result<uuid::Uuid, PublishError> {
    // [...]
    spawn_blocking_with_tracing(move || {
        verify_password_hash(/* */)
    })
    // [...]
}

We can now easily reach for it every time we need to offload some CPU-intensive computation to a dedicated threadpool.

3.5. User Enumeration

Let's add a new test case:

//! tests/api/newsletters.rs
// [...]

#[actix_rt::test]
async fn non_existing_user_is_rejected() {
    // Arrange
    let app = spawn_app().await;
    // Random credentials
    let username = Uuid::new_v4().to_string();
    let password = Uuid::new_v4().to_string();

    let response = reqwest::Client::new()
        .post(&format!("{}/newsletters", &app.address))
        .basic_auth(username, Some(password))
        .json(&serde_json::json!({
            "title": "Newsletter title",
            "content": {
                "text": "Newsletter body as plain text",
                "html": "<p>Newsletter body as HTML</p>",
            }
        }))
        .send()
        .await
        .expect("Failed to execute request.");

    // Assert
    assert_eq!(401, response.status().as_u16());
    assert_eq!(
        r#"Basic realm="publish""#,
        response.headers()["WWW-Authenticate"]
    );
}

The test should pass straight away.
How long does it take though?

Let's look at the logs!

TEST_LOG=true cargo test --quiet --release \
  non_existing_user_is_rejected | grep "HTTP REQUEST" | bunyan
# [...] Omitting setup requests
[...] [HTTP REQUEST - END]
  (http.route = "/newsletters", elapsed_milliseconds=1, ...)

Roughly 1ms.

Let's add another test: this time we pass a valid username with an incorrect password.

//! tests/api/newsletters.rs
// [...]

#[actix_rt::test]
async fn invalid_password_is_rejected() {
    // Arrange
    let app = spawn_app().await;
    let username = &app.test_user.username;
    // Random password
    let password = Uuid::new_v4().to_string();
    assert_ne!(app.test_user.password, password);

    let response = reqwest::Client::new()
        .post(&format!("{}/newsletters", &app.address))
        .basic_auth(username, Some(password))
        .json(&serde_json::json!({
            "title": "Newsletter title",
            "content": {
                "text": "Newsletter body as plain text",
                "html": "<p>Newsletter body as HTML</p>",
            }
        }))
        .send()
        .await
        .expect("Failed to execute request.");

    // Assert
    assert_eq!(401, response.status().as_u16());
    assert_eq!(
        r#"Basic realm="publish""#,
        response.headers()["WWW-Authenticate"]
    );
}

//! tests/api/helpers.rs
// [...]

pub struct TestUser {
    user_id: Uuid,
    // Marking both fields as pub!
    pub username: String,
    pub password: String,
}

This one should pass as well. How long does the request take to fail?

TEST_LOG=true cargo test --quiet --release \
  invalid_password_is_rejected | grep "HTTP REQUEST" | bunyan
# [...] Omitting setup requests
[...] [HTTP REQUEST - END]
  (http.route = "/newsletters", elapsed_milliseconds=11, ...)

Roughly 10ms - one order of magnitude larger than the ~1ms it took to reject a non-existing user!
We can use this difference to our advantage to perform a timing attack, a member of the broader class of side-channel attacks.

If an attacker knows at least one valid username, they can inspect the server response times[11] to confirm if another username exists or not - we are looking at a potential user enumeration vulnerability.
Is this an issue?

It depends.
If you are running Gmail, there are plenty of other ways to find out if a @gmail.com email address exists or not. The validity of an email address is not a secret!

If you are running a SaaS product, the situation might be more nuanced.
Let's go for a fictional scenario: your SaaS product provides payroll services and uses email addresses as usernames. There are separate employee and admin login pages.
Our goal is to get access to payroll data - we need to compromise an employee with privileged access. We can scrape LinkedIn to get the names and surnames of all employees in the Finance department. Corporate emails follow a predictable structure (name.surname@payrollaces.com), so we have a list of candidates.
We can now perform a timing attack against the admin login page to narrow down the list to those who have access.

Even in our fictional example, user enumeration is not enough, on its own, to escalate our privileges.
But it can be used as a stepping stone to narrow down a set of targets for a more precise attack.

How do we prevent it?
Two strategies:

  1. Remove the timing difference between an auth failure due to an invalid password and an auth failure due to a non-existent username;
  2. Limit the number of failed auth attempts for a given IP/username.

The second is generally valuable as a protection against brute-force attacks, but it requires holding some state - we will leave it for later.

Let's focus on the first one.
To eliminate the timing difference, we need to perform the same amount of work in both cases.

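Right now, we follow this recipe:

  • Fetch stored credentials for the given username;
  • If they do not exist, return 401;
  • If they exist, hash the password candidate and compare it with the stored hash.

We need to remove that early exit - we should have a fallback expected password (with salt and load parameters) that can be compared to the hash of the password candidate.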

//! src/routes/newsletters.rs
// [...]

#[tracing::instrument(name = "Validate credentials", skip(credentials, pool))]
async fn validate_credentials(
    credentials: Credentials,
    pool: &PgPool,
) -> Result<uuid::Uuid, PublishError> {
    let mut user_id = None;
    let mut expected_password_hash = "$argon2id$v=19$m=15000,t=2,p=1$\
        gZiV/M1gPc22ElAH/Jh1Hw$\
        CWOrkoo7oJBQ/iyh7uJ0LO2aLEfrHwTWllSAxT0zRno"
        .to_string();

    if let Some((stored_user_id, stored_password_hash)) =
        get_stored_credentials(&credentials.username, &pool)
            .await
            .map_err(PublishError::UnexpectedError)?
    {
        user_id = Some(stored_user_id);
        expected_password_hash = stored_password_hash;
    }

    spawn_blocking_with_tracing(move || {
        verify_password_hash(expected_password_hash, credentials.password)
    })
    .await
    .context("Failed to spawn blocking task.")
    .map_err(PublishError::UnexpectedError)??;

    // This is only set to `Some` if we found credentials in the store
    // So, even if the default password ends up matching (somehow)
    // with the provided password, 
    // we never authenticate a non-existing user.
    // You can easily add a unit test for that precise scenario.
    user_id.ok_or_else(|| 
        PublishError::AuthError(anyhow::anyhow!("Unknown username."))
    )
}

//! tests/api/helpers.rs
use argon2::{Algorithm, Argon2, Params, PasswordHasher, Version};
// [...]

impl TestUser {
    async fn store(&self, pool: &PgPool) {
        let salt = SaltString::generate(&mut rand::thread_rng());
        // Match parameters of the default password
        let password_hash = Argon2::new(
            Algorithm::Argon2id,
            Version::V0x13,
            Params::new(15000, 2, 1, None).unwrap(),
        )
        .hash_password(self.password.as_bytes(), &salt)
        .unwrap()
        .to_string();
        // [...]
    }
    // [...]
}

There should not be any statistically significant timing difference now.
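
The code comments above mention a unit test for the case where the candidate happens to match the fallback password. Here is a self-contained sketch of that scenario. Everything in it is a stand-in chosen for illustration: the in-memory `HashMap` replaces the database, `verify` replaces Argon2 with a trivial string comparison, and `AuthError` replaces PublishError.

```rust
use std::collections::HashMap;

/// "Hash" of the fallback password under the toy scheme used below.
const FALLBACK_HASH: &str = "hashed:fallback";

#[derive(Debug, PartialEq, Eq)]
pub enum AuthError {
    InvalidPassword,
    UnknownUsername,
}

/// Toy stand-in for Argon2 verification. NOT a real hash function.
fn verify(expected_hash: &str, candidate: &str) -> bool {
    expected_hash == format!("hashed:{}", candidate)
}

/// Same control flow as `validate_credentials`: no early exit.
pub fn validate_credentials(
    users: &HashMap<String, (u32, String)>,
    username: &str,
    password: &str,
) -> Result<u32, AuthError> {
    let mut user_id = None;
    let mut expected_hash = FALLBACK_HASH.to_owned();
    if let Some((stored_id, stored_hash)) = users.get(username) {
        user_id = Some(*stored_id);
        expected_hash = stored_hash.clone();
    }
    // Verification runs whether or not the user exists.
    if !verify(&expected_hash, password) {
        return Err(AuthError::InvalidPassword);
    }
    // Only `Some` if the username was found in the store.
    user_id.ok_or(AuthError::UnknownUsername)
}
```

Calling it with an unknown username and the password `fallback` passes the verification step, yet it still returns `Err(AuthError::UnknownUsername)`.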

4. Is it safe?

We went to great lengths to follow all the most common best practices while building our password-based authentication flow.
Time to ask ourselves: is it safe?

4.1. Transport Layer Security (TLS)

We are using the 'Basic' Authentication Scheme to pass credentials between the client and the server - username and password are encoded, but not encrypted.
We must use Transport Layer Security (TLS) to ensure that nobody can eavesdrop on the traffic between client and server to compromise the user credentials (a man-in-the-middle attack - MITM)[12].
Our API is already served via HTTPS, so nothing to do here.
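
To see why "encoded" offers no protection, here is a minimal standard-library sketch that recovers username and password from an Authorization header value. `base64_decode` and `decode_basic_credentials` are hypothetical helpers written for this illustration; the decoder is lenient (it simply skips padding) and is not meant for production use.

```rust
const ALPHABET: &[u8] =
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/// Minimal base64 decoder: returns `None` on characters outside the alphabet.
pub fn base64_decode(input: &str) -> Option<Vec<u8>> {
    let mut out = Vec::new();
    let mut buffer: u32 = 0;
    let mut bits: u32 = 0;
    for c in input.bytes().filter(|&c| c != b'=') {
        // Each character carries 6 bits.
        let value = ALPHABET.iter().position(|&a| a == c)? as u32;
        buffer = (buffer << 6) | value;
        bits += 6;
        if bits >= 8 {
            // A full byte is available: emit it and keep the remainder.
            bits -= 8;
            out.push((buffer >> bits) as u8);
            buffer &= (1u32 << bits) - 1;
        }
    }
    Some(out)
}

/// Recover `(username, password)` from a `Basic <base64>` header value.
pub fn decode_basic_credentials(header_value: &str) -> Option<(String, String)> {
    let encoded = header_value.strip_prefix("Basic ")?;
    let decoded = String::from_utf8(base64_decode(encoded)?).ok()?;
    // The username cannot contain `:`, so split on the first one.
    let mut parts = decoded.splitn(2, ':');
    let username = parts.next()?.to_owned();
    let password = parts.next()?.to_owned();
    Some((username, password))
}
```

No key or secret is involved: anyone who can read the header value `Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==` can recover the credentials `Aladdin` and `open sesame`.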

4.2. Password Reset

What happens if an attacker manages to steal a set of valid user credentials?
Passwords do not expire - they are long-lived secrets.

Right now, there is no way for a user to reset their password. This is definitely a gap we'd need to fill.

4.3. Interaction Types

So far we have been fairly vague about who is calling our API.

The type of interaction we need to support is a key decision factor when it comes to authentication.

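We will look at three categories of callers:

  • Other APIs (machine-to-machine);
  • A person, via a browser;
  • Another API, on behalf of a person.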

4.4. Machine To Machine

The consumer of your API might be a machine (e.g. another API).
This is often the case in a microservice architecture - your functionality emerges from a variety of services interacting over the network.

To significantly raise our security profile we'd have to throw in something they have (e.g. request signing) or something they are (e.g. IP range restrictions).
A popular option, when all services are owned by the same organization, is mutual TLS (mTLS).

Both signing and mTLS rely on public key cryptography - keys must be provisioned, rotated, managed. The overhead is only justified once your system reaches a certain size.

4.4.1. Client Credentials via OAuth2

Another option is using the OAuth2 client credentials flow. We will speak more about OAuth2 later, but let's spend a few words on its tactical pros and cons.

APIs no longer have to manage passwords (client secrets, in OAuth2 terminology) - the concern is delegated to a centralised authorization server. There are multiple turn-key implementations of an authorization server out there - both OSS and commercial. You can lean on them instead of rolling your own.

The caller authenticates with the authorization server - if successful, the auth server grants them a set of temporary credentials (a JWT access token) which can be used to call our API.
Our API can verify the validity of the access token using public key cryptography, without having to keep any state. Our API never sees the actual password, the client secret.

JWT validation is not without its risks - the specification is riddled with dangerous edge cases. We will speak more about it later.
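
To make the flow slightly more concrete, here is a minimal sketch of the checks our API might run on an access token once its signature has been verified. Signature verification and JSON parsing are assumed to be handled by a JWT library and are not shown. The `Header` and `Claims` structs, the `TokenError` enum and the `validate` function are all hypothetical, and `aud` is simplified to a single string.

```rust
/// The part of the decoded token header we look at in this sketch.
struct Header {
    /// Signing algorithm declared by the token.
    alg: String,
}

/// The claims we look at in this sketch.
struct Claims {
    /// Expiration time, as seconds since the Unix epoch.
    exp: u64,
    /// Intended audience of the token (simplified to a single value).
    aud: String,
}

#[derive(Debug, PartialEq)]
enum TokenError {
    UnexpectedAlgorithm,
    Expired,
    WrongAudience,
}

/// Checks to run *after* the signature has been verified.
fn validate(
    header: &Header,
    claims: &Claims,
    expected_alg: &str,
    expected_aud: &str,
    now: u64,
) -> Result<(), TokenError> {
    // Do not let the token choose its own algorithm: only accept the one
    // we configured. This rejects, for example, `alg: none`.
    if header.alg != expected_alg {
        return Err(TokenError::UnexpectedAlgorithm);
    }
    // Access tokens are temporary credentials: reject them once expired.
    if now >= claims.exp {
        return Err(TokenError::Expired);
    }
    // A token issued for another API must not be accepted by ours.
    if claims.aud != expected_aud {
        return Err(TokenError::WrongAudience);
    }
    Ok(())
}
```

None of these checks requires a database lookup, which is what allows the API to stay stateless.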

4.5. Person Via Browser

What if we are dealing with a person, using a web browser?

'Basic' Authentication requires the client to present their credentials on every single request.
We now have a single protected endpoint, but you can easily picture a situation with five or ten pages providing restricted functionality.

'Basic' Authentication would force the user to submit their credentials on every single page.
Not acceptable.

Session-based authentication is a common approach to solve this problem.
A user is asked to authenticate once, via a login form13: if successful, the server generates a one-time secret - an authenticated session token.
The token is stored in the browser as a secure cookie.
Sessions, unlike passwords, are designed to expire - this reduces the likelihood that a valid session token is compromised (especially if inactive users are automatically logged out). It also prevents the user from having to reset their password if there is a suspicion that their session has been hijacked - a forced log out is much more acceptable than an automated password reset.
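
The expiry and forced log out mechanics can be sketched with an in-memory store. This is a simplified model under stated assumptions, not the implementation we will build: `SessionStore` and its methods are hypothetical names, the token is assumed to come from a cryptographically secure random generator, and the current time is passed in explicitly to keep the example easy to test.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

struct Session {
    user_id: u64,
    expires_at: Instant,
}

/// In-memory session store (hypothetical, for illustration only).
struct SessionStore {
    /// How long a session survives without activity.
    ttl: Duration,
    sessions: HashMap<String, Session>,
}

impl SessionStore {
    fn new(ttl: Duration) -> Self {
        Self { ttl, sessions: HashMap::new() }
    }

    /// Registers a session after a successful login.
    /// `token` is assumed to be generated by a secure random generator.
    fn insert(&mut self, token: String, user_id: u64, now: Instant) {
        let expires_at = now + self.ttl;
        self.sessions.insert(token, Session { user_id, expires_at });
    }

    /// Returns the user behind `token` if the session is still valid.
    /// Each successful lookup pushes the expiry forward, so only
    /// inactive users are logged out.
    fn authenticate(&mut self, token: &str, now: Instant) -> Option<u64> {
        let session = self.sessions.get_mut(token)?;
        if session.expires_at <= now {
            // Expired: forget the session entirely.
            self.sessions.remove(token);
            return None;
        }
        session.expires_at = now + self.ttl;
        Some(session.user_id)
    }

    /// Forced log out, e.g. when a session is suspected to be hijacked.
    fn revoke(&mut self, token: &str) {
        self.sessions.remove(token);
    }
}
```

Note that `revoke` invalidates a single session without touching the user's password.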

OWASP provides extensive guidance on session management.

4.5.1. Federated Identity

With session-based authentication we still have an authentication step to take care of - the login form.
We can keep rolling our own - everything we learned about passwords is still relevant, even if we ditch the 'Basic' Authentication scheme.

Many websites choose to offer their users an additional option: log in via a social profile - e.g. "Log in with Google". This removes friction from the sign up flow (no need to create yet another password!), increasing conversion - a desirable outcome.

Social logins rely on identity federation - we delegate the authentication step to a third-party identity provider, which in turn shares with us the pieces of information we asked for (e.g. email address, full name and date of birth).

A common implementation of identity federation relies on OpenID Connect, an identity layer on top of the OAuth2 standard.

4.6. Machine to machine, on behalf of a person

There is one more scenario: a person authorising a machine (e.g. a third-party service) to perform actions against our API on their behalf.
E.g. a mobile app that provides an alternative UI for Twitter.

It is important to stress how this differs from the first scenario we reviewed, pure machine-to-machine authentication.
In this case, the third-party service is not authorised, on its own, to perform any action against our API. The third-party service can only perform actions against our API if a user grants them access, scoped to their set of permissions.
I can install a mobile app to write tweets on my behalf, but I can't authorise it to tweet on behalf of David Guetta.

'Basic' authentication would be a very poor fit here: we do not want to share our password with a third-party app. The more parties get to see our password, the more likely it is to be compromised.

Furthermore, keeping an audit trail with shared credentials is a nightmare. When something goes wrong, it is impossible to determine who did what: was it actually me? Was it one of the twenty apps I shared credentials with? Who takes responsibility?

This is the textbook scenario for OAuth2 - the third-party never gets to see our username and password. They receive an opaque access token from the authorization server which our API knows how to inspect to grant (or deny) access.
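
A minimal sketch of what "inspect to grant (or deny) access" could look like once the token has been resolved into a set of delegated permissions. Everything here is hypothetical: the `AccessGrant` shape, the scope names and the two helper functions are made up for illustration, and token inspection itself is not shown.

```rust
use std::collections::HashSet;

/// What our API learns after inspecting an access token.
/// Hypothetical shape, for illustration only.
struct AccessGrant {
    /// The user who authorised the third-party client.
    user_id: u64,
    /// The third-party client acting on their behalf.
    client_id: String,
    /// The permissions the user agreed to delegate.
    scopes: HashSet<String>,
}

/// Can the client holding `grant` perform an action that requires
/// `required_scope` on a resource owned by `resource_owner`?
fn is_allowed(grant: &AccessGrant, resource_owner: u64, required_scope: &str) -> bool {
    // The client can only act on behalf of the user who authorised it,
    // and only within the permissions that user delegated.
    grant.user_id == resource_owner && grant.scopes.contains(required_scope)
}

/// Each client has its own grant, so the audit trail can record
/// both the user and the client that acted on their behalf.
fn audit_entry(grant: &AccessGrant, action: &str) -> String {
    format!(
        "client={} user={} action={}",
        grant.client_id, grant.user_id, action
    )
}
```

With a grant issued by user `1` for the `tweet:write` scope, `is_allowed(&grant, 1, "tweet:write")` passes, while the same call for another user's resources, or for a scope that was never delegated, is denied.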

5. What Should We Do Next

We established that 'Basic' Authentication is only suitable for machine-to-machine authentication.
If we want to expose our API to a person via a browser, we need a login form and session-based authentication for further interactions.
We can definitely reuse most of the password machinery we built so far.

There is a clear benefit in bringing delegation into the picture - we need to understand better what OAuth2 or OAuth2-powered schemes are about (e.g. OpenID Connect).

That is pretty much the roadmap for the next episode!
We will first convert our 'Basic' Authentication flow into a login form with session-based auth.
We will then look into OpenID Connect to offer a federated login.

See you soon!


Zero To Production In Rust is a hands-on introduction to backend development in Rust.

6. Footnotes

1

base64-encoding ensures that all the characters in the output are ASCII, but it does not provide any kind of protection: decoding requires no secrets. In other words, encoding is not encryption!

2

Assuming that the input space is finite (i.e. password length is capped), it is theoretically possible to find a perfect hash function - f(x) == f(y) implies x == y.

3

When looking into brute-force attacks you will often see mentions of rainbow tables - an efficient data structure to pre-compute and look up hashes.

4

This back-of-the-envelope calculation should make it clear that using a randomly-generated password provides you, as a user, with a significant level of protection against brute-force attacks even if the server is using fast hashing algorithms for password storage. Consistent usage of a password manager is indeed one of the easiest ways to boost your security profile.

5

OWASP is, generally speaking, a treasure trove of great educational material about security for web applications. You should get as familiar as possible with OWASP's material, especially if you do not have an application security specialist in your team/organization to support you. On top of the cheatsheet we linked, make sure to browse their Application Security Verification Standard.

6

This is why OWASP recommends an additional layer of defence - peppering. All hashes stored in the database are encrypted using a shared secret, only known to the application. Encryption, though, brings its own set of challenges: where are we going to store the key? How do we rotate it? The answer usually involves a Hardware Security Module (HSM) or a secret vault, such as AWS CloudHSM, AWS KMS or HashiCorp Vault. A thorough overview of key management is beyond the scope of this book.

7

I have not delved too deep into the source code of the different hash algorithms that implement PasswordVerifier, but I do wonder why verify_password needs to take &self as a parameter. Argon2 has absolutely no use for it, but it forces us to go through an Argon2::default in order to call verify_password.

8

PasswordVerifier::verify_password does one more thing - it leans on Output to compare the two hashes, instead of working with raw bytes. Output's implementations of PartialEq and Eq are designed to be evaluated in constant-time - no matter how different or similar the inputs are, function execution will take the same amount of time. Assuming an attacker had perfect knowledge of the hashing algorithm configuration the server is using, they could analyze the response time for each authentication attempt to infer the first bytes of the password hash - combined with a dictionary, this could help them to crack the password. The feasibility of such an attack is debatable, even more so when salting is in place. Nonetheless, it costs us nothing - so better safe than sorry.
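
To illustrate the idea (this is not how `Output` is implemented), here is a minimal sketch of a comparison that does not stop at the first mismatching byte. `constant_time_eq` is a hypothetical helper. The compiler is free to optimise code like this, so real applications should rely on a vetted implementation rather than a hand-rolled one.

```rust
/// Compares two byte slices without returning early at the first mismatch.
/// Illustrative sketch only, not a vetted constant-time implementation.
fn constant_time_eq(a: &[u8], b: &[u8]) -> bool {
    // The length of a hash is not a secret, so bailing out early
    // on a length mismatch does not leak anything useful.
    if a.len() != b.len() {
        return false;
    }
    // Accumulate the differences of *all* byte pairs: the loop always
    // runs to the end, no matter where the first mismatch is.
    let mut diff = 0u8;
    for (x, y) in a.iter().zip(b.iter()) {
        diff |= x ^ y;
    }
    diff == 0
}
```

A naive `a == b` on slices may return as soon as it finds a difference, which is the behaviour a timing attack would try to exploit.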

9

Our example is oversimplified, on purpose. In reality, each of those states will have sub-states in turn - one for each .await in the body of the function we are calling. A future can turn into a deeply nested state machine!

10

This heuristic is reported in "Async: What is blocking?" by Alice Ryhl, one of tokio's maintainers. I'd strongly suggest reading it to better understand the underlying mechanics of tokio and async/await in general!

11

In a real life scenario, there is a network between an attacker and your server. Load and network variance are likely to mask the speed difference on a limited set of attempts, but if you collect enough data points it should be possible to notice a statistically significant difference in latency.

12

Which is why you should never enter your password into a website that is not using HTTPS - i.e. HTTP + TLS.

13

Implementing a secure login form is its own challenge - hello CSRF! We will be taking a much closer look at it later in this chapter.

Book - Table Of Contents

The Table of Contents is provisional and might change over time. The draft below is the most accurate picture at this point in time.

  1. Getting Started
    • Installing The Rust Toolchain
    • Project Setup
    • IDEs
    • Continuous Integration
  2. Our Driving Example
    • What Should Our Newsletter Do?
    • Working In Iterations
  3. Sign Up A New Subscriber
  4. Telemetry
    • Unknown Unknowns
    • Observability
    • Logging
    • Instrumenting POST /subscriptions
    • Structured Logging
  5. Go Live
    • We Must Talk About Deployments
    • Choosing Our Tools
    • A Dockerfile For Our Application
    • Deploy To DigitalOcean Apps Platform
  6. Rejecting Invalid Subscribers #1
    • Requirements
    • First Implementation
    • Validation Is A Leaky Cauldron
    • Type-Driven Development
    • Ownership Meets Invariants
    • Panics
    • Error As Values - Result
  7. Reject Invalid Subscribers #2
  8. Error Handling
    • What Is The Purpose Of Errors?
    • Error Reporting For Operators
    • Errors For Control Flow
    • Avoid "Ball Of Mud" Error Enums
    • Who Should Log Errors?
  9. Naive Newsletter Delivery
    • User Stories Are Not Set In Stone
    • Do Not Spam Unconfirmed Subscribers
    • All Confirmed Subscribers Receive New Issues
    • Implementation Strategy
    • Body Schema
    • Fetch Confirmed Subscribers List
    • Send Newsletter Emails
    • Validation Of Stored Data
    • Limitations Of The Naive Approach
  10. Securing Our API
  11. Fault-tolerant Newsletter Delivery