How To Bootstrap A Rust Web API From Scratch

August 09, 2020

This article is a sample from Zero To Production In Rust, a hands-on introduction to backend development in Rust.
You can get a copy of the book at zero2prod.com.

Chapter #3

Discuss the article on HackerNews or r/rust.

We spent all of Chapter 2 defining what we will be building (an email newsletter!) and narrowing down a precise set of requirements. It is now time to roll up our sleeves and get started.

This chapter will take a first stab at implementing this user story:

As a blog visitor,
I want to subscribe to the newsletter,
So that I can receive email updates when new content is published on the blog.

We expect our blog visitors to input their email address in a form embedded on a web page.
The form will trigger an API call to a backend server that will actually process the information, store it and send back a response.
This chapter will focus on that backend server - we will implement the /subscribe POST endpoint.

1. Our Strategy

We are starting a new project from scratch - there is a fair amount of upfront heavy-lifting we need to take care of:

choose a web framework and get familiar with it;
define our testing strategy;
choose a crate to interact with our database (we will have to save those emails somewhere!);
define how we want to manage changes to our database schemas over time (a.k.a. migrations);
actually write some queries.

That is a lot and jumping in head-first might be overwhelming.
We will add a stepping stone to make the journey more approachable: before tackling /subscribe we will implement a /health_check endpoint. No business logic, but a good opportunity to become friends with our web framework and get an understanding of all its different moving parts.

We will be relying on our Continuous Integration pipeline to keep us in check throughout the process - if you have not set it up yet, have a quick look at Chapter 1 (or grab one of the ready-made templates).

2. Choosing A Web Framework

What web framework should we use to write our Rust API?
This was supposed to be a section on the pros and cons of the Rust web frameworks currently available. It eventually grew to be so long that it did not make sense to embed it here and I published it as a spin-off article: check out Choosing a Rust web framework, 2020 edition for a deep-dive on actix-web, rocket, tide and warp.

TL;DR: as of March 2022, actix-web should be your go-to web framework when it comes to Rust APIs aimed at production usage - it has seen extensive usage in the past couple of years, it has a large and healthy community behind it and it runs on tokio, therefore minimising the likelihood of having to deal with incompatibilities/interop between different async runtimes.
It will thus be our choice for Zero To Production.

Nonetheless tide, rocket and warp have huge potential and we might end up making a different decision later in 2022 - if you are following along with Zero To Production using a different framework I'd be delighted to have a look at your code! Please shoot me an email at contact@lpalmieri.com.

Throughout this chapter and beyond I suggest you keep a couple of extra browser tabs open: actix-web's website, actix-web's documentation and actix-web's examples collection.

3. Our First Endpoint: A Basic Health Check

Let's try to get off the ground by implementing a health-check endpoint: when we receive a GET request for /health_check we want to return a 200 OK response with no body.

We can use /health_check to verify that the application is up and ready to accept incoming requests. Combine it with a SaaS service like pingdom.com and you can be alerted when your API goes dark - quite a good baseline for an email newsletter that you are running on the side.

A health-check endpoint can also be handy if you are using a container orchestrator to juggle your application (e.g. Kubernetes or Nomad): the orchestrator can call /health_check to detect if the API has become unresponsive and trigger a restart.

3.1. Wiring Up actix-web

Our starting point will be a Hello World! application built with actix-web:

use actix_web::{web, App, HttpRequest, HttpServer, Responder};

async fn greet(req: HttpRequest) -> impl Responder {
    let name = req.match_info().get("name").unwrap_or("World");
    format!("Hello {}!", &name)
}

#[tokio::main]
async fn main() -> std::io::Result<()> {
    HttpServer::new(|| {
        App::new()
            .route("/", web::get().to(greet))
            .route("/{name}", web::get().to(greet))
    })
    .bind("127.0.0.1:8000")?
    .run()
    .await
}

Let's paste it in our main.rs file.
A quick cargo check¹:

error[E0432]: unresolved import `actix_web`
 --> src/main.rs:1:5
  |
1 | use actix_web::{web, App, HttpRequest, HttpServer, Responder};
  |     ^^^^^^^^^ use of undeclared type or module `actix_web`

error[E0433]: failed to resolve: 
    use of undeclared type or module `tokio`
 --> src/main.rs:8:3
  |
8 | #[tokio::main]
  |   ^^^^^^^^ use of undeclared type or module `tokio`

error: aborting due to 2 previous errors

We have not added actix-web and tokio to our list of dependencies, therefore the compiler cannot resolve what we imported.
We can either fix the situation manually, by adding

#! Cargo.toml
# [...]

[dependencies]
actix-web = "4"
tokio = { version = "1", features = ["macros", "rt-multi-thread"] }

under [dependencies] in our Cargo.toml or we can use cargo add to quickly add the latest version of both crates as a dependency of our project:

cargo add actix-web@4
cargo add tokio@1 --features macros,rt-multi-thread

cargo add has been bundled with cargo since Rust 1.62. On older toolchains it is not a default cargo command: it is provided by cargo-edit, a community-maintained² cargo extension. You can install it with:

cargo install cargo-edit

If you run cargo check again there should be no errors.
You can now launch the application with cargo run and perform a quick manual test:

curl http://127.0.0.1:8000

Hello World!

Cool, it's alive!
You can gracefully shut down the web server with Ctrl+C if you want to.

3.2. Anatomy of an `actix-web` application

Let's go back now to have a closer look at what we have just copy-pasted in our main.rs file.

//! src/main.rs
// [...]

#[tokio::main]
async fn main() -> std::io::Result<()> {
    HttpServer::new(|| {
        App::new()
            .route("/", web::get().to(greet))
            .route("/{name}", web::get().to(greet))
    })
    .bind("127.0.0.1:8000")?
    .run()
    .await
}

3.2.1. Server - `HttpServer`

HttpServer is the backbone supporting our application. It takes care of things like:

where should the application be listening for incoming requests? A TCP socket (e.g. 127.0.0.1:8000)? A Unix domain socket?
what is the maximum number of concurrent connections that we should allow? How many new connections per unit of time?
should we enable transport level security (TLS)?
etc.

HttpServer, in other words, handles all transport level concerns.
What happens afterwards? What does HttpServer do when it has established a new connection with a client of our API and we need to start handling their requests?
That is where App comes into play!

3.2.2. Application - `App`

App is where all your application logic lives: routing, middlewares, request handlers, etc.
App is the component whose job is to take an incoming request as input and spit out a response.
Let's zoom in on our code snippet:

App::new()
    .route("/", web::get().to(greet))
    .route("/{name}", web::get().to(greet))

App is a practical example of the builder pattern: new() gives us a clean slate to which we can add, one bit at a time, new behaviour using a fluent API (i.e. chaining method calls one after the other).
We will cover the majority of App's API surface on a need-to-know basis over the course of the whole book: by the end of our journey you should have touched most of its methods at least once.

3.2.3. Endpoint - `Route`

How do we add a new endpoint to our App?
The route method is probably the simplest way to go about doing it - it is used in a Hello World! example after all!

route takes two parameters:

path, a string, possibly templated (e.g. "/{name}") to accommodate dynamic path segments;
route, an instance of the Route struct.

Route combines a handler with a set of guards.
Guards specify conditions that a request must satisfy in order to "match" and be passed over to the handler. From an implementation standpoint guards are implementors of the Guard trait: Guard::check is where the magic happens.

In our snippet we have

.route("/", web::get().to(greet))

"/" will match all requests without any segment following the base path - i.e. http://localhost:8000/.
web::get() is a short-cut for Route::new().guard(guard::Get()) a.k.a. the request should be passed to the handler if and only if its HTTP method is GET.

You can start to picture what happens when a new request comes in: App iterates over all registered endpoints until it finds a matching one (both path template and guards are satisfied) and passes over the request object to the handler.
This is not 100% accurate but it is a good enough mental model for the time being.
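To make that mental model concrete, here is a minimal sketch of a toy router written with the standard library only. It is not how actix-web is implemented: ToyApp, Route, Method, Request and is_get are hypothetical names made up for this illustration, paths are matched by plain string equality (no templates) and handlers are synchronous.

```rust
/// Hypothetical HTTP method type - not actix-web's.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Method {
    Get,
    Post,
}

/// Hypothetical stand-in for an incoming request.
pub struct Request {
    pub method: Method,
    pub path: String,
}

/// A guard decides if a request is allowed to reach the handler.
type Guard = fn(&Request) -> bool;
/// A handler turns a request into a response (a plain string here).
type Handler = fn(&Request) -> String;

/// A route combines a path, a guard and a handler.
pub struct Route {
    path: &'static str,
    guard: Guard,
    handler: Handler,
}

pub struct ToyApp {
    routes: Vec<Route>,
}

impl ToyApp {
    /// Builder pattern: start from a clean slate...
    pub fn new() -> Self {
        ToyApp { routes: Vec::new() }
    }

    /// ...and add one route at a time, returning `self` to allow chaining.
    pub fn route(mut self, path: &'static str, guard: Guard, handler: Handler) -> Self {
        self.routes.push(Route { path, guard, handler });
        self
    }

    /// Walk the registered routes in order and call the first handler
    /// whose path and guard both match. `None` means "no match".
    pub fn handle(&self, request: &Request) -> Option<String> {
        self.routes
            .iter()
            .find(|route| route.path == request.path && (route.guard)(request))
            .map(|route| (route.handler)(request))
    }
}

/// Toy counterpart of a "GET only" guard.
pub fn is_get(request: &Request) -> bool {
    request.method == Method::Get
}

pub fn health_check(_request: &Request) -> String {
    "200 OK".to_string()
}
```

Registering `health_check` with `ToyApp::new().route("/health_check", is_get, health_check)` and calling `handle` returns `Some("200 OK")` for a GET request on /health_check and `None` for a POST request on the same path, because the guard rejects it.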

What does a handler look like instead? What is its function signature?
We only have one example at the moment, greet:

async fn greet(req: HttpRequest) -> impl Responder {
    [...]
}

greet is an asynchronous function that takes an HttpRequest as input and returns something that implements the Responder trait³. A type implements the Responder trait if it can be converted into an HttpResponse - it is implemented off the shelf for a variety of common types (e.g. strings, status codes, bytes, HttpResponse, etc.) and we can roll our own implementations if needed.
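The idea of a "conversion trait" can be sketched in a few lines of plain Rust. What follows is a simplified illustration and not actix-web's actual Responder definition: ToyResponse and ToyResponder are hypothetical names, and the real trait also has access to the incoming request.

```rust
/// Hypothetical response type: a status code and a body.
#[derive(Debug, PartialEq)]
pub struct ToyResponse {
    pub status: u16,
    pub body: String,
}

/// Anything that knows how to turn itself into a `ToyResponse`.
pub trait ToyResponder {
    fn respond(self) -> ToyResponse;
}

/// A response is trivially convertible into itself.
impl ToyResponder for ToyResponse {
    fn respond(self) -> ToyResponse {
        self
    }
}

/// An owned string becomes a 200 response with that string as body.
impl ToyResponder for String {
    fn respond(self) -> ToyResponse {
        ToyResponse { status: 200, body: self }
    }
}

/// Same for string literals.
impl ToyResponder for &'static str {
    fn respond(self) -> ToyResponse {
        ToyResponse { status: 200, body: self.to_string() }
    }
}

/// Handlers can hide the concrete type they return behind `impl ToyResponder`.
pub fn greet() -> impl ToyResponder {
    format!("Hello {}!", "World")
}

pub fn empty_ok() -> impl ToyResponder {
    ToyResponse { status: 200, body: String::new() }
}
```

The caller (the framework, in the real world) only needs to call `respond` on whatever the handler returned: it does not care whether it was a string or a fully-formed response.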

Do all our handlers need to have the same function signature as greet?
No! actix-web, channelling some forbidden trait black magic, allows a wide range of different function signatures for handlers, especially when it comes to input arguments. We will get back to it soon enough.

3.2.4. Runtime - `tokio`

We drilled down from the whole HttpServer to a Route. Let's look again at the whole main function:

//! src/main.rs
// [...]

#[tokio::main]
async fn main() -> std::io::Result<()> {
    HttpServer::new(|| {
        App::new()
            .route("/", web::get().to(greet))
            .route("/{name}", web::get().to(greet))
    })
    .bind("127.0.0.1:8000")?
    .run()
    .await
}

What is #[tokio::main] doing here? Well, let's remove it and see what happens! cargo check screams at us with these errors:

error[E0277]: `main` has invalid return type `impl std::future::Future`
 --> src/main.rs:8:20
  |
8 | async fn main() -> std::io::Result<()> {
  |                    ^^^^^^^^^^^^^^^^^^^ 
  | `main` can only return types that implement `std::process::Termination`
  |
  = help: consider using `()`, or a `Result`

error[E0752]: `main` function is not allowed to be `async`
 --> src/main.rs:8:1
  |
8 | async fn main() -> std::io::Result<()> {
  | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 
  | `main` function is not allowed to be `async`

error: aborting due to 2 previous errors

We need main to be asynchronous because we want to .await the Server returned by HttpServer::run (it is a future), but main, the entrypoint of our binary, cannot be an asynchronous function. Why is that?

Asynchronous programming in Rust is built on top of the Future trait: a future stands for a value that may not be there yet. All futures expose a poll method which has to be called to allow the future to make progress and eventually resolve to its final value. You can think of Rust’s futures as lazy: unless polled, there is no guarantee that they will execute to completion. This has often been described as a pull model compared to the push model adopted by other languages⁴.
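A hand-written future makes the role of poll easier to see. The sketch below uses only the standard library; Countdown, NoopWaker and polls_until_ready are hypothetical names introduced for this example. It is deliberately naive: a well-behaved future would arrange to be woken up before returning Pending, while here we simply poll in a loop.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// A future that returns `Pending` for the first `remaining` polls
/// and resolves on the following one.
pub struct Countdown {
    pub remaining: u32,
}

impl Future for Countdown {
    type Output = &'static str;

    fn poll(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<Self::Output> {
        if self.remaining == 0 {
            Poll::Ready("done")
        } else {
            // Progress only happens when somebody calls `poll`.
            self.remaining -= 1;
            Poll::Pending
        }
    }
}

/// A waker that does nothing: good enough when we poll in a loop.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Poll the future by hand until it resolves and
/// report how many calls to `poll` were needed.
pub fn polls_until_ready(mut future: Countdown) -> u32 {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut context = Context::from_waker(&waker);
    let mut polls = 0;
    loop {
        polls += 1;
        if Pin::new(&mut future).poll(&mut context).is_ready() {
            return polls;
        }
    }
}
```

Building a `Countdown { remaining: 3 }` does nothing on its own: it only resolves because `polls_until_ready` keeps calling poll, four times in total. That polling loop is, in a nutshell, the job of an asynchronous runtime.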

Rust's standard library, by design, does not include an asynchronous runtime: you are supposed to bring one into your project as a dependency, one more crate under [dependencies] in your Cargo.toml. This approach is extremely versatile: you are free to implement your own runtime, optimised to cater for the specific requirements of your use case (see the Fuchsia project or bastion’s actor framework).

This explains why main cannot be an asynchronous function: who is in charge of calling poll on it?
There is no special configuration syntax that tells the Rust compiler that one of your dependencies is an asynchronous runtime (e.g. as we do for allocators) and, to be fair, there is not even a standardised definition of what a runtime is (e.g. an Executor trait).
You are therefore expected to launch your asynchronous runtime at the top of your main function and then use it to drive your futures to completion.
You might have guessed by now what the purpose of #[tokio::main] is, but guesses are not enough to satisfy us: we want to see it.

How?
tokio::main is a procedural macro and this is a great opportunity to introduce cargo expand, an awesome addition to our Swiss army knife for Rust development:

cargo install cargo-expand

Rust macros operate at the token level: they take in a stream of symbols (e.g. in our case, the whole main function) and output a stream of new symbols which then gets passed to the compiler. In other words, the main purpose of Rust macros is code generation.
How do we debug or inspect what is happening with a particular macro? You inspect the tokens it outputs!
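If you have never written a macro, a tiny example of code generation might help. Keep in mind that tokio::main is a procedural macro, while the sketch below uses a declarative macro (macro_rules!), which is simpler to write but follows the same principle: tokens in, new code out. make_greeter, hello and ciao are hypothetical names made up for this illustration.

```rust
// Each invocation of `make_greeter!` generates a brand new public function.
macro_rules! make_greeter {
    ($fn_name:ident, $greeting:expr) => {
        pub fn $fn_name(name: &str) -> String {
            format!("{} {}!", $greeting, name)
        }
    };
}

// After expansion the compiler sees two ordinary function definitions,
// `hello` and `ciao`, as if we had typed them by hand.
make_greeter!(hello, "Hello");
make_greeter!(ciao, "Ciao");
```

Neither `hello` nor `ciao` appears as a function definition in the source code, yet `hello("World")` compiles and returns "Hello World!". To see the generated functions we need to look at the output of the macro.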

That is exactly where cargo expand shines: it expands all macros in your code without passing the output to the compiler, allowing you to step through it and understand what is going on.
Let's use cargo expand to demystify #[tokio::main]:

cargo expand

Unfortunately, it fails:

error: the option `Z` is only accepted on the nightly compiler
error: could not compile `zero2prod`

We are using the stable compiler to build, test and run our code. cargo-expand, instead, relies on the nightly compiler to expand our macros.
You can install the nightly compiler by running

rustup toolchain install nightly --allow-downgrade

Some components of the bundle installed by rustup might be broken/missing on the latest nightly release: --allow-downgrade tells rustup to find and install the latest nightly where all the needed components are available.

You can use rustup default to change the default toolchain used by cargo and the other tools managed by rustup. In our case, we do not want to switch over to nightly - we just need it for cargo-expand.
Luckily enough, cargo allows us to specify the toolchain on a per-command basis:

# Use the nightly toolchain just for this command invocation
cargo +nightly expand

/// [...]

fn main() -> std::io::Result<()> {
    let body = async move {
        HttpServer::new(|| {
            App::new()
                .route("/", web::get().to(greet))
                .route("/{name}", web::get().to(greet))
        })
        .bind("127.0.0.1:8000")?
        .run()
        .await
    };
    tokio::runtime::Builder::new_multi_thread()
        .enable_all()
        .build()
        .expect("Failed building the Runtime")
        .block_on(body)
}

We can finally look at the code after macro expansion!
The main function that gets passed to the Rust compiler after #[tokio::main] has been expanded is indeed synchronous, which explains why it compiles without any issue.
The key line is this:

tokio::runtime::Builder::new_multi_thread()
    .enable_all()
    .build()
    .expect("Failed building the Runtime")
    .block_on(body)

We are starting tokio's async runtime and we are using it to drive the future returned by HttpServer::run to completion.
In other words, the job of #[tokio::main] is to give us the illusion of being able to define an asynchronous main while, under the hood, it just takes our main asynchronous body and writes the necessary boilerplate to make it run on top of tokio's runtime.
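The same transformation can be reproduced by hand without tokio. The following is a rough sketch under heavy simplifying assumptions: toy_block_on is a hypothetical, busy-polling stand-in for a real runtime (tokio parks the thread and relies on wakers instead of spinning), and toy_main and answer are made-up names.

```rust
use std::future::Future;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// A waker that does nothing: our toy executor polls in a loop anyway.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Toy executor: keep polling the future until it resolves.
pub fn toy_block_on<F: Future>(future: F) -> F::Output {
    // Pin the future on the heap so that it can be polled safely.
    let mut future = Box::pin(future);
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut context = Context::from_waker(&waker);
    loop {
        match future.as_mut().poll(&mut context) {
            Poll::Ready(output) => return output,
            // Nothing better to do: give other threads a chance and retry.
            Poll::Pending => std::thread::yield_now(),
        }
    }
}

async fn answer() -> u32 {
    42
}

/// A synchronous function with the same shape as the expanded `main`:
/// the asynchronous body is wrapped in an `async` block
/// and then driven to completion by the executor.
pub fn toy_main() -> std::io::Result<u32> {
    let body = async {
        let value = answer().await;
        Ok::<u32, std::io::Error>(value)
    };
    toy_block_on(body)
}
```

Calling `toy_main()` from ordinary synchronous code returns `Ok(42)`: the caller never sees a future, just like the operating system never sees one when it starts our binary.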

3.3. Implementing The Health Check Handler

We have reviewed all the moving pieces in actix_web's Hello World! example: HttpServer, App, route and tokio::main.
We definitely know enough to modify the example to get our health check working as we expect: return a 200 OK response with no body when we receive a GET request at /health_check.

Let's look again at our starting point:

//! src/main.rs
use actix_web::{web, App, HttpRequest, HttpServer, Responder};

async fn greet(req: HttpRequest) -> impl Responder {
    let name = req.match_info().get("name").unwrap_or("World");
    format!("Hello {}!", &name)
}

#[tokio::main]
async fn main() -> std::io::Result<()> {
    HttpServer::new(|| {
        App::new()
            .route("/", web::get().to(greet))
            .route("/{name}", web::get().to(greet))
    })
    .bind("127.0.0.1:8000")?
    .run()
    .await
}

First of all we need a request handler. Mimicking greet we can start with this signature:

async fn health_check(req: HttpRequest) -> impl Responder {
    todo!()
}

We said that Responder is nothing more than a conversion trait into an HttpResponse. Returning an instance of HttpResponse directly should work then!
Looking at its documentation we can use HttpResponse::Ok to get an HttpResponseBuilder primed with a 200 status code. HttpResponseBuilder exposes a rich fluent API to progressively build out an HttpResponse, but we do not need it here: we can get an HttpResponse with an empty body by calling finish on the builder.
Gluing everything together:

async fn health_check(req: HttpRequest) -> impl Responder {
    HttpResponse::Ok().finish()
}

A quick cargo check confirms that our handler is not doing anything weird. A closer look at HttpResponseBuilder unveils that it implements Responder as well - we can therefore omit our call to finish and shorten our handler to:

async fn health_check(req: HttpRequest) -> impl Responder {
    HttpResponse::Ok()
}

The next step is handler registration - we need to add it to our App via route:

App::new()
    .route("/health_check", web::get().to(health_check))

Let's look at the full picture:

//! src/main.rs

use actix_web::{web, App, HttpRequest, HttpResponse, HttpServer, Responder};

async fn health_check(req: HttpRequest) -> impl Responder {
    HttpResponse::Ok()
}

#[tokio::main]
async fn main() -> std::io::Result<()> {
    HttpServer::new(|| App::new().route("/health_check", web::get().to(health_check)))
        .bind("127.0.0.1:8000")?
        .run()
        .await
}

cargo check runs smoothly although it raises one warning:

warning: unused variable: `req`
 --> src/main.rs:3:23
  |
3 | async fn health_check(req: HttpRequest) -> impl Responder {
  |                       ^^^ help: if this is intentional, prefix it with an underscore: `_req`
  |
  = note: `#[warn(unused_variables)]` on by default

Our health check response is indeed static and does not use any of the data bundled with the incoming HTTP request (routing aside). We could follow the compiler's advice and prefix req with an underscore... or we could remove that input argument entirely from health_check:

async fn health_check() -> impl Responder {
    HttpResponse::Ok()
}

Surprise surprise, it compiles! actix-web has some pretty advanced type magic going on behind the scenes and it accepts a broad range of signatures as request handlers - more on that later.
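If you are curious about the flavour of that type magic, the core trick can be sketched with the standard library alone: a trait with a generic parameter that is implemented once per function arity. This is a simplified illustration and not actix-web's real code. Request, FromRequest, Path, Handler and dispatch are toy definitions written for this example, handlers are synchronous and responses are plain strings.

```rust
/// Hypothetical stand-in for an incoming request.
pub struct Request {
    pub path: String,
}

/// Types that can be built out of a request.
pub trait FromRequest {
    fn from_request(request: &Request) -> Self;
}

/// Toy extractor: it grabs the request path.
pub struct Path(pub String);

impl FromRequest for Path {
    fn from_request(request: &Request) -> Self {
        Path(request.path.clone())
    }
}

/// `Args` is a marker: it lets us implement the trait for functions
/// with different numbers of arguments without the impls overlapping.
pub trait Handler<Args> {
    fn call(&self, request: &Request) -> String;
}

/// Functions that take no arguments.
impl<F> Handler<()> for F
where
    F: Fn() -> String,
{
    fn call(&self, _request: &Request) -> String {
        self()
    }
}

/// Functions that take one argument, as long as
/// that argument can be extracted from the request.
impl<F, A> Handler<(A,)> for F
where
    F: Fn(A) -> String,
    A: FromRequest,
{
    fn call(&self, request: &Request) -> String {
        self(A::from_request(request))
    }
}

/// The "framework" side: it accepts any handler, whatever its signature.
pub fn dispatch<Args, H: Handler<Args>>(handler: H, request: &Request) -> String {
    handler.call(request)
}

pub fn health_check() -> String {
    "OK".to_string()
}

pub fn echo_path(Path(path): Path) -> String {
    format!("You asked for {}", path)
}
```

Both `dispatch(health_check, &request)` and `dispatch(echo_path, &request)` compile, even though the two functions have different signatures: the compiler picks the matching implementation based on the number and types of the arguments.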

What is left to do?
Well, a little test!

# Launch the application first in another terminal with `cargo run`
curl -v http://127.0.0.1:8000/health_check

*   Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to localhost (127.0.0.1) port 8000 (#0)
> GET /health_check HTTP/1.1
> Host: localhost:8000
> User-Agent: curl/7.61.0
> Accept: */*
>
< HTTP/1.1 200 OK
< content-length: 0
< date: Wed, 05 Aug 2020 22:11:52 GMT

Congrats, you have just implemented your first working actix_web endpoint!

4. Our First Integration Test

/health_check was our first endpoint and we verified everything was working as expected by launching the application and testing it manually via curl.

Manual testing is time-consuming, though: as our application gets bigger, it gets more and more expensive to manually check that all our assumptions on its behaviour are still valid every time we perform some changes.
We'd like to automate as much as possible: those checks should be run in our CI pipeline every time we are committing a change in order to prevent regressions.

While the behaviour of our health check might not evolve much over the course of our journey, it is a good starting point to get our testing scaffolding properly set up.

4.1. How Do You Test An Endpoint?

An API is a means to an end: a tool exposed to the outside world to perform some kind of task (e.g. store a document, publish an email, etc.).
The endpoints we expose in our API define the contract between us and our clients: a shared agreement about the inputs and the outputs of the system, its interface.

The contract might evolve over time and we can roughly picture two scenarios:

backwards-compatible changes (e.g. adding a new endpoint);
breaking changes (e.g. removing an endpoint or dropping a field from the schema of its output).

In the first case, existing API clients will keep working as they are.
In the second case, existing integrations are likely to break if they relied on the violated portion of the contract.

While we might intentionally deploy breaking changes to our API contract, it is critical that we do not break it accidentally.

What is the most reliable way to check that we have not introduced a user-visible regression?
Testing the API by interacting with it in the same exact way a user would: performing HTTP requests against it and verifying our assumptions on the responses we receive.

This is often referred to as black box testing: we verify the behaviour of a system by examining its output given a set of inputs without having access to the details of its internal implementation.

Following this principle, we won't be satisfied by tests that call into handler functions directly - for example:

#[cfg(test)]
mod tests {
    use crate::health_check;

    #[tokio::test]
    async fn health_check_succeeds() {
        let response = health_check().await;
        // This requires changing the return type of `health_check`
        // from `impl Responder` to `HttpResponse` to compile
        assert!(response.status().is_success())
    }
}

We have not checked that the handler is invoked on GET requests.
We have not checked that the handler is invoked with /health_check as path.

Changing either of these two properties would break our API contract, but our test would still pass - not good enough.

actix-web provides some conveniences to interact with an App without skipping the routing logic, but there are severe shortcomings to its approach:

migrating to another web framework would force us to rewrite our whole integration test suite. As much as possible, we'd like our integration tests to be highly decoupled from the technology underpinning our API implementation (e.g. having framework-agnostic integration tests is life-saving when you are going through a large rewrite or refactoring!);
due to some of actix-web's limitations⁵, we wouldn't be able to share our App startup logic between our production code and our testing code, therefore undermining our trust in the guarantees provided by our test suite due to the risk of divergence over time.

We will opt for a fully black-box solution: we will launch our application at the beginning of each test and interact with it using an off-the-shelf HTTP client (e.g. reqwest).
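Before involving actix-web and reqwest, the shape of a black-box test can be sketched with the standard library alone. Everything below is a toy written for illustration: spawn_toy_app starts a hypothetical, hand-rolled server on a background thread and get_status_line plays the role of the HTTP client. It understands just enough HTTP for this example and it is in no way a substitute for a real server or client.

```rust
use std::io::{Read, Write};
use std::net::{TcpListener, TcpStream};
use std::thread;

/// Read from the socket until the blank line that closes the request head.
fn read_request_head(stream: &mut TcpStream) -> std::io::Result<String> {
    let mut head = Vec::new();
    let mut chunk = [0u8; 512];
    while !head.windows(4).any(|window| window == b"\r\n\r\n") {
        let read = stream.read(&mut chunk)?;
        if read == 0 {
            break;
        }
        head.extend_from_slice(&chunk[..read]);
    }
    Ok(String::from_utf8_lossy(&head).into_owned())
}

/// Reply 200 to `GET /health_check`, 404 to anything else.
fn handle_connection(mut stream: TcpStream) -> std::io::Result<()> {
    let head = read_request_head(&mut stream)?;
    let status = if head.starts_with("GET /health_check ") {
        "200 OK"
    } else {
        "404 Not Found"
    };
    let response = format!(
        "HTTP/1.1 {}\r\ncontent-length: 0\r\nconnection: close\r\n\r\n",
        status
    );
    stream.write_all(response.as_bytes())
}

/// Launch the toy application in the background
/// and return the address it is listening on.
pub fn spawn_toy_app() -> std::io::Result<String> {
    // Port 0 asks the operating system for any available port.
    let listener = TcpListener::bind("127.0.0.1:0")?;
    let address = listener.local_addr()?.to_string();
    thread::spawn(move || {
        for stream in listener.incoming() {
            if let Ok(stream) = stream {
                let _ = handle_connection(stream);
            }
        }
    });
    Ok(address)
}

/// The "client" side: send a GET request over the network
/// and return the first line of the response.
pub fn get_status_line(address: &str, path: &str) -> std::io::Result<String> {
    let mut stream = TcpStream::connect(address)?;
    let request = format!(
        "GET {} HTTP/1.1\r\nhost: {}\r\nconnection: close\r\n\r\n",
        path, address
    );
    stream.write_all(request.as_bytes())?;
    let mut response = String::new();
    stream.read_to_string(&mut response)?;
    Ok(response.lines().next().unwrap_or("").to_string())
}
```

A test built on top of these two functions only knows about an address, a path and a response: it has no access to handle_connection, which is exactly the kind of decoupling we are after.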

4.2. Where Should I Put My Tests?

Rust gives you three options when it comes to writing tests:

next to your code in an embedded test module, e.g.

// Some code I want to test

#[cfg(test)]
mod tests {
    // Import the code I want to test
    use super::*;
    
    // My tests
}

in an external tests folder, i.e.

> ls

src/
tests/
Cargo.toml
Cargo.lock

as part of your public documentation (doc tests), e.g.

/// Check if a number is even.
/// ```rust
/// use zero2prod::is_even;
/// 
/// assert!(is_even(2));
/// assert!(!is_even(1));
/// ```
pub fn is_even(x: u64) -> bool {
    x % 2 == 0
}

What is the difference?
An embedded test module is part of your project, just hidden behind a configuration conditional check, #[cfg(test)]. Anything under the tests folder and your documentation tests, instead, are compiled in their own separate binaries.
This has consequences when it comes to visibility rules.

An embedded test module has privileged access to the code living next to it: it can interact with structs, methods, fields and functions that have not been marked as public and would normally not be available to a user of our code if they were to import it as a dependency of their own project.
Embedded test modules are quite useful for what I call iceberg projects, i.e. the exposed surface is very limited (e.g. a couple of public functions), but the underlying machinery is much larger and fairly complicated (e.g. tens of routines). It might not be straightforward to exercise all the possible edge cases via the exposed functions - you can then leverage embedded test modules to write unit tests for private sub-components to increase your overall confidence in the correctness of the whole project.

Tests in the external tests folder and doc tests, instead, have exactly the same level of access to your code that you would get if you were to add your crate as a dependency in another project. They are therefore used mostly for integration testing, i.e. testing your code by calling it in the same exact way a user would.
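The difference in visibility is easier to appreciate with a small example. The snippet below is a sketch: normalise_email and trim_and_lowercase are hypothetical functions invented for this illustration, they are not part of our newsletter project.

```rust
/// Public API: this is all a user of the crate (or a test under `tests/`) can see.
pub fn normalise_email(raw: &str) -> String {
    trim_and_lowercase(raw)
}

/// Private helper: invisible from outside the crate.
fn trim_and_lowercase(raw: &str) -> String {
    raw.trim().to_lowercase()
}

#[cfg(test)]
mod embedded_tests {
    // The embedded module lives inside the crate,
    // therefore it can reach private items as well.
    use super::*;

    #[test]
    fn private_helper_can_be_tested_directly() {
        assert_eq!(trim_and_lowercase("  A@B.Com "), "a@b.com");
    }
}
```

A test file under the tests folder could call `normalise_email`, but a `use` statement pointing at `trim_and_lowercase` would be rejected by the compiler because the function is private.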

Our email newsletter is not a library, therefore the line is a bit blurry - we are not exposing it to the world as a Rust crate, we are putting it out there as an API accessible over the network.
Nonetheless we are going to use the tests folder for our API integration tests - it is more clearly separated and it is easier to manage test helpers as sub-modules of an external test binary.

4.3. Changing Our Project Structure For Easier Testing

We have a bit of housekeeping to do before we can actually write our first test under /tests.
As we said, anything under tests ends up being compiled in its own binary - all our code under test is imported as a crate. But our project, at the moment, is a binary: it is meant to be executed, not to be shared. Therefore we can't import our main function in our tests as it is right now.

If you won't take my word for it, we can run a quick experiment:

# Create the tests folder
mkdir -p tests

Create a new tests/health_check.rs file with

//! tests/health_check.rs

use zero2prod::main;

#[test]
fn dummy_test() {
    main()
}

cargo test should fail with something similar to

error[E0432]: unresolved import `zero2prod`
 --> tests/health_check.rs:1:5
  |
1 | use zero2prod::main;
  |     ^^^^^^^^^ use of undeclared type or module `zero2prod`

error: aborting due to previous error

For more information about this error, try `rustc --explain E0432`.
error: could not compile `zero2prod`.

We need to refactor our project into a library and a binary: all our logic will live in the library crate while the binary itself will be just an entrypoint with a very slim main function.
First step: we need to change our Cargo.toml.
It currently looks something like this:

[package]
name = "zero2prod"
version = "0.1.0"
authors = ["Luca Palmieri <contact@lpalmieri.com>"]
edition = "2021"

[dependencies]
# [...]

We are relying on cargo's default behaviour: unless something is spelled out, it will look for a src/main.rs file as the binary entrypoint and use the package.name field as the binary name.
Looking at the manifest target specification, we need to add a lib section to add a library to our project:

[package]
name = "zero2prod"
version = "0.1.0"
authors = ["Luca Palmieri <contact@lpalmieri.com>"]
edition = "2021"

[lib]
# We could use any path here, but we are following the community convention
# We could specify a library name using the `name` field. If unspecified,
# cargo will default to `package.name`, which is what we want.
path = "src/lib.rs"

[dependencies]
# [...]

The lib.rs file does not exist yet and cargo won't create it for us:

cargo check

error: couldn't read src/lib.rs: No such file or directory (os error 2)

error: aborting due to previous error

error: could not compile `zero2prod`

Let's add it then - it can be empty for now.

touch src/lib.rs

Everything should be working now: cargo check passes and cargo run still launches our application.
Although it is working, our Cargo.toml file no longer gives you the full picture at a glance: you see a library, but you don't see our binary there. Even if not strictly necessary, I prefer to have everything spelled out as soon as we move out of the auto-generated vanilla configuration:

[package]
name = "zero2prod"
version = "0.1.0"
authors = ["Luca Palmieri <contact@lpalmieri.com>"]
edition = "2021"

[lib]
path = "src/lib.rs"

# Notice the double square brackets: it's an array in TOML's syntax.
# We can only have one library in a project, but we can have multiple binaries!
# If you want to manage multiple libraries in the same repository
# have a look at the workspace feature - we'll cover it later on.
[[bin]]
path = "src/main.rs"
name = "zero2prod"

[dependencies]
# [...]

Feeling nice and clean, let's move forward.
For the time being we can move our main function, as it is, to our library (named run to avoid clashes):

//! src/main.rs

use zero2prod::run;

#[tokio::main]
async fn main() -> std::io::Result<()> {
    run().await
}

//! src/lib.rs

use actix_web::{web, App, HttpResponse, HttpServer};

async fn health_check() -> HttpResponse {
    HttpResponse::Ok().finish()
}

// We need to mark `run` as public.
// It is no longer a binary entrypoint, therefore we can mark it as async
// without having to use any proc-macro incantation.
pub async fn run() -> std::io::Result<()> {
    HttpServer::new(|| App::new().route("/health_check", web::get().to(health_check)))
        .bind("127.0.0.1:8000")?
        .run()
        .await
}

Alright, we are ready to write some juicy integration tests!

4.4. Implementing Our First Integration Test

Our spec for the health check endpoint was:

When we receive a GET request for /health_check we return a 200 OK response with no body.

Let's translate that into a test, filling in as much of it as we can:

//! tests/health_check.rs

// `tokio::test` is the testing equivalent of `tokio::main`.
// It also spares you from having to specify the `#[test]` attribute.
//
// You can inspect what code gets generated using 
// `cargo expand --test health_check` (<- name of the test file)
#[tokio::test]
async fn health_check_works() {
    // Arrange
    spawn_app().await.expect("Failed to spawn our app.");
    // We need to bring in `reqwest` 
    // to perform HTTP requests against our application.
    let client = reqwest::Client::new();

    // Act
    let response = client
        .get("http://127.0.0.1:8000/health_check")
        .send()
        .await
        .expect("Failed to execute request.");

    // Assert
    assert!(response.status().is_success());
    assert_eq!(Some(0), response.content_length());
}

// Launch our application in the background ~somehow~
async fn spawn_app() -> std::io::Result<()> {
    todo!()
}

#! Cargo.toml
# [...]
# Dev dependencies are used exclusively when running tests or examples
# They do not get included in the final application binary!
[dev-dependencies]
reqwest = "0.11"
# [...]

Take a second to really look at this test case.
spawn_app is the only piece that will, reasonably, depend on our application code.
Everything else is entirely decoupled from the underlying implementation details - if tomorrow we decide to ditch Rust and rewrite our application in Ruby on Rails we can still use the same test suite to check for regressions in our new stack as long as spawn_app gets replaced with the appropriate trigger (e.g. a bash command to launch the Rails app).

The test also covers the full range of properties we are interested to check:

the health check is exposed at /health_check;
the health check is behind a GET method;
the health check always returns a 200;
the health check's response has no body.

If this passes we are done.

The test as it is crashes before doing anything useful: we are missing spawn_app, the last piece of the integration testing puzzle.
Why don't we just call run in there? I.e.

//! tests/health_check.rs
// [...]

async fn spawn_app() -> std::io::Result<()> {
    zero2prod::run().await
}

Let's try it out!

cargo test

     Running target/debug/deps/health_check-fc74836458377166

running 1 test
test health_check_works ... test health_check_works has been running for over 60 seconds

No matter how long you wait, test execution will never terminate. What is going on?

In zero2prod::run we invoke (and await) HttpServer::run. HttpServer::run returns an instance of Server - when we call .await it starts listening on the address we specified indefinitely: it will handle incoming requests as they arrive, but it will never shut down or "complete" on its own.
This implies that spawn_app never returns and our test logic never gets executed.

We need to run our application as a background task.
tokio::spawn comes in quite handy here: tokio::spawn takes a future and hands it over to the runtime for polling, without waiting for its completion; it therefore runs concurrently with downstream futures and tasks (e.g. our test logic).
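If "hand it over and keep going" feels abstract, here is a minimal sketch of the same shape using only the standard library. Take it as an analogy rather than a description of how tokio works: it uses an OS thread via std::thread::spawn instead of a task on the tokio runtime, and serve and spawn_in_background are hypothetical names made up for this example.

```rust
use std::sync::mpsc::{self, Receiver, Sender};
use std::thread;
use std::time::Duration;

// Hypothetical stand-in for a server: it keeps working until
// nobody is listening on the other end of the channel.
fn serve(heartbeat: Sender<u32>) {
    let mut tick = 0;
    // `send` only fails once the receiving side has been dropped
    while heartbeat.send(tick).is_ok() {
        tick += 1;
        thread::sleep(Duration::from_millis(10));
    }
}

// Start `serve` in the background and return right away.
fn spawn_in_background() -> Receiver<u32> {
    let (sender, receiver) = mpsc::channel();
    // Just like in `spawn_app`, we have no use for the returned handle
    let _ = thread::spawn(move || serve(sender));
    receiver
}

#[test]
fn caller_is_not_blocked() {
    // This call returns even though `serve` is still looping...
    let heartbeat = spawn_in_background();
    // ...so we get to run our own logic while it works in the background
    let first = heartbeat
        .recv_timeout(Duration::from_secs(1))
        .expect("The background worker never reported in.");
    assert_eq!(first, 0);
}
```

Calling serve directly would keep the caller stuck in the loop for as long as the receiver is alive, which is the situation our first spawn_app attempt ended up in.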

Let's refactor zero2prod::run to return a Server without awaiting it:

//! src/lib.rs

use actix_web::{web, App, HttpResponse, HttpServer};
use actix_web::dev::Server;

async fn health_check() -> HttpResponse {
    HttpResponse::Ok().finish()
}

// Notice the different signature!
// We return `Server` on the happy path and we dropped the `async` keyword
// We have no .await call, so it is not needed anymore.
pub fn run() -> Result<Server, std::io::Error> {
    let server = HttpServer::new(|| App::new().route("/health_check", web::get().to(health_check)))
        .bind("127.0.0.1:8000")?
        .run();
    // No .await here!
    Ok(server)
}

We need to amend our main.rs accordingly:

//! src/main.rs

use zero2prod::run;

#[tokio::main]
async fn main() -> std::io::Result<()> {
    // Bubble up the io::Error if we failed to bind the address
    // Otherwise call .await on our Server
    run()?.await
}

A quick cargo check should reassure us that everything is in order.
We can now write spawn_app:

//! tests/health_check.rs
// [...]

// No .await call, therefore no need for `spawn_app` to be async now.
// We are also running tests, so it is not worth it to propagate errors:
// if we fail to perform the required setup we can just panic and crash
// all the things.
fn spawn_app() {
    let server = zero2prod::run().expect("Failed to bind address");
    // Launch the server as a background task
    // tokio::spawn returns a handle to the spawned future,
    // but we have no use for it here, hence the non-binding let
    let _ = tokio::spawn(server);
}

Quick adjustment to our test to accommodate the changes in spawn_app's signature:

//! tests/health_check.rs
// [...]

#[tokio::test]
async fn health_check_works() {
    // No .await, no .expect
    spawn_app();
    // [...]
}

It's time, let's run that cargo test command!

cargo test

     Running target/debug/deps/health_check-a1d027e9ac92cd64

running 1 test
test health_check_works ... ok

test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out

Yay! Our first integration test is green!
Give yourself a pat on the back on my behalf for the second major milestone in the span of a single chapter.

4.5. Polishing

We got it working, now we need to have a second look and improve it, if needed or possible.

4.5.1. Clean Up

What happens to our app running in the background when the test run ends? Does it shut down? Does it linger as a zombie somewhere?

Well, running cargo test multiple times in a row always succeeds - a strong hint that port 8000 is getting released at the end of each run, therefore implying that the application is correctly shut down.
A second look at tokio::spawn's documentation supports our hypothesis: when a tokio runtime is shut down all tasks spawned on it are dropped. tokio::test spins up a new runtime at the beginning of each test case and shuts it down at the end of each test case.
In other words, good news - no need to implement any clean up logic to avoid leaking resources between test runs.

4.5.2. Choosing A Random Port

spawn_app will always try to run our application on port 8000 - not ideal:

if port 8000 is being used by another program on our machine (e.g. our own application!), tests will fail;
if we try to run two or more tests in parallel only one of them will manage to bind the port, all others will fail.
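Both failure modes boil down to the same rule: only one listener at a time can hold a given address and port. The sketch below tries it out with the standard library alone. port_is_free is a hypothetical helper written for this example, and the test asks the OS for a free port instead of hard-coding 8000 so that it does not depend on what else is running on your machine.

```rust
use std::net::TcpListener;

// Hypothetical helper: `true` if we manage to bind `127.0.0.1:<port>`.
// The probing listener is dropped (and the port released) before returning.
fn port_is_free(port: u16) -> bool {
    TcpListener::bind(("127.0.0.1", port)).is_ok()
}

#[test]
fn only_one_listener_per_port() {
    // Grab some port and hold on to it
    let first = TcpListener::bind("127.0.0.1:0").expect("Failed to bind a port");
    let port = first.local_addr().unwrap().port();

    // A second bind on the same port is refused while `first` is alive
    assert!(!port_is_free(port));

    // Dropping the listener releases the port
    drop(first);
    assert!(port_is_free(port));
}
```

The last assertion mirrors what we observed in the clean up section: once the listener is gone, the port can be bound again.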

We can do better: tests should run their background application on a random available port.
First of all we need to change our run function - it should take the application address as an argument instead of relying on a hard-coded value:

//! src/lib.rs
// [...]

pub fn run(address: &str) -> Result<Server, std::io::Error> {
    let server = HttpServer::new(|| App::new().route("/health_check", web::get().to(health_check)))
        .bind(address)?
        .run();
    Ok(server)
}

All zero2prod::run() invocations must then be changed to zero2prod::run("127.0.0.1:8000") to preserve the same behaviour and get the project to compile again.

How do we find a random available port for our tests?
The operating system comes to the rescue: we will be using port 0.
Port 0 is special-cased at the OS level: trying to bind port 0 will trigger an OS scan for an available port which will then be bound to the application.

It is therefore enough to change spawn_app to

//! tests/health_check.rs
// [...]

fn spawn_app() {
    let server = zero2prod::run("127.0.0.1:0").expect("Failed to bind address");
    let _ = tokio::spawn(server);
}

Done - the background app now runs on a random port every time we launch cargo test!
There is only a small issue... our test is failing⁶!

running 1 test
test health_check_works ... FAILED

failures:

---- health_check_works stdout ----
thread 'health_check_works' panicked at 'Failed to execute request.: reqwest::Error { kind: Request, url: "http://localhost:8000/health_check", source: hyper::Error(Connect, ConnectError("tcp connect error", Os { code: 111, kind: ConnectionRefused, message: "Connection refused" })) }', tests/health_check.rs:10:20
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
Panic in Arbiter thread.


failures:
    health_check_works

test result: FAILED. 0 passed; 1 failed; 0 ignored; 0 measured; 0 filtered out

Our HTTP client is still calling 127.0.0.1:8000 and we really don't know what to put there now: the application port is determined at runtime, we cannot hard code it there.
We need, somehow, to find out what port the OS has gifted our application and return it from spawn_app.

There are a few ways to go about it - we will use a std::net::TcpListener.
Our HttpServer right now is doing double duty: given an address, it will bind it and then start the application. We can take over the first step: we will bind the port on our own with TcpListener and then hand that over to the HttpServer using listen.

What is the upside?
TcpListener::local_addr returns a SocketAddr which exposes the actual port we bound via .port().
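Both pieces, port 0 and local_addr, can be tried in isolation before touching our application. This is a small sketch relying on the standard library only; bind_random_port is a hypothetical helper name, not something we will add to the project.

```rust
use std::net::TcpListener;

// Hypothetical helper: bind port 0 on localhost and report
// which port the OS actually assigned to us.
fn bind_random_port() -> std::io::Result<(TcpListener, u16)> {
    let listener = TcpListener::bind("127.0.0.1:0")?;
    let port = listener.local_addr()?.port();
    Ok((listener, port))
}

#[test]
fn the_os_hands_out_distinct_ports() {
    // We keep both listeners alive so that both ports stay taken
    let (_first, port_a) = bind_random_port().expect("Failed to bind random port");
    let (_second, port_b) = bind_random_port().expect("Failed to bind random port");

    // We asked for port 0, we got a real port back...
    assert_ne!(port_a, 0);
    // ...and a different one each time
    assert_ne!(port_a, port_b);
}
```

Returning the listener together with the port matters: the port is only reserved for as long as the listener is alive.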

Let's begin with our run function:

//! src/lib.rs

use actix_web::dev::Server;
use actix_web::{web, App, HttpResponse, HttpServer};
use std::net::TcpListener;

// [...]

pub fn run(listener: TcpListener) -> Result<Server, std::io::Error> {
    let server = HttpServer::new(|| App::new().route("/health_check", web::get().to(health_check)))
        .listen(listener)?
        .run();
    Ok(server)
}

The change broke both our main and our spawn_app function. I'll leave main to you, let's focus on spawn_app:

//! tests/health_check.rs
use std::net::TcpListener;

// [...]

fn spawn_app() -> String {
    let listener = TcpListener::bind("127.0.0.1:0").expect("Failed to bind random port");
    // We retrieve the port assigned to us by the OS
    let port = listener.local_addr().unwrap().port();
    let server = zero2prod::run(listener).expect("Failed to bind address");
    let _ = tokio::spawn(server);
    // We return the application address to the caller!
    format!("http://127.0.0.1:{}", port)
}

We can now leverage the application address in our test to point our reqwest::Client:

//! tests/health_check.rs
// [...]

#[tokio::test]
async fn health_check_works() {
    // Arrange
    let address = spawn_app();
    let client = reqwest::Client::new();

    // Act
    let response = client
        // Use the returned application address
        .get(&format!("{}/health_check", &address))
        .send()
        .await
        .expect("Failed to execute request.");

    // Assert
    assert!(response.status().is_success());
    assert_eq!(Some(0), response.content_length());
}

All is good - cargo test comes out green. Our setup is much more robust now!
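To convince ourselves that the pattern (bind port 0, serve in the background, return the address) is what makes parallel tests safe, here is a rough, dependency-free sketch of it. Everything in it is a stand-in: spawn_stub_app, handle and get are hypothetical names, the "application" is a toy that answers every request with an empty 200 OK, and it runs on an OS thread rather than on a tokio task.

```rust
use std::io::{BufRead, BufReader, Read, Write};
use std::net::{TcpListener, TcpStream};
use std::thread;

// Toy request handler: replies with an empty 200 OK,
// regardless of method or path.
fn handle(mut stream: TcpStream) -> std::io::Result<()> {
    let mut reader = BufReader::new(stream.try_clone()?);
    let mut line = String::new();
    // Skip the request head: it ends with an empty line
    loop {
        line.clear();
        if reader.read_line(&mut line)? == 0 || line == "\r\n" {
            break;
        }
    }
    stream.write_all(b"HTTP/1.1 200 OK\r\nContent-Length: 0\r\nConnection: close\r\n\r\n")
}

// Same shape as `spawn_app`: bind port 0, serve in the background,
// return the address to the caller.
fn spawn_stub_app() -> String {
    let listener = TcpListener::bind("127.0.0.1:0").expect("Failed to bind random port");
    let port = listener.local_addr().unwrap().port();
    let _ = thread::spawn(move || {
        for stream in listener.incoming() {
            if let Ok(stream) = stream {
                let _ = handle(stream);
            }
        }
    });
    format!("127.0.0.1:{}", port)
}

// Bare-bones HTTP client: send a GET and return the raw response text.
fn get(address: &str, path: &str) -> String {
    let mut stream = TcpStream::connect(address).expect("Failed to connect");
    let request = format!(
        "GET {} HTTP/1.1\r\nHost: {}\r\nConnection: close\r\n\r\n",
        path, address
    );
    stream
        .write_all(request.as_bytes())
        .expect("Failed to send request");
    let mut response = String::new();
    stream
        .read_to_string(&mut response)
        .expect("Failed to read response");
    response
}

#[test]
fn stub_apps_do_not_share_a_port() {
    // Two "applications" running side by side, each on its own port
    let first = spawn_stub_app();
    let second = spawn_stub_app();
    assert_ne!(first, second);

    for address in [&first, &second] {
        let response = get(address, "/health_check");
        assert!(response.starts_with("HTTP/1.1 200 OK"));
    }
}
```

With a hard-coded port the second spawn_stub_app call would panic on bind, which is the parallel-test failure we set out to avoid.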

5. Next Up

We covered a fair amount of ground: we have a project skeleton, our integration tests are wired up and we have developed a solid understanding of actix-web's fundamentals.

According to our strategy we should now jump straight into the implementation of /subscribe.
We will instead call it a day - I'll follow the same philosophy I am advocating for: freeze the budget, not the feature set.
In other words, instead of being adamant on scope (e.g. Chapter 3 has to cover all these topics), I am being firm on timelines (e.g. a new article of Zero To Production should come out every two weeks, like clockwork).

In the second half of Chapter 3 you can expect:

an overview of available libraries to interact with PostgreSQL in the Rust ecosystem;
a strategy (and tooling) to manage database migrations;
how to check side-effects in integration tests;
a demo of newsletter sign-ups using our new /subscribe endpoint.

See you in two weeks!

Thanks to james2509, ThePesta, mighty_soft_tofu, vertexclique and Federica Via for taking the time to review the draft of this article.

Footnotes

During our development process we are not always interested in producing a runnable binary: we often just want to know if our code compiles or not. cargo check was born to serve exactly this use case: it runs the same checks that are run by cargo build, but it does not bother to perform any machine code generation. It is therefore much faster and provides us with a tighter feedback loop. See link for more details.

cargo follows the same philosophy as Rust's standard library: where possible, the addition of new functionality is explored via third-party crates and then upstreamed where it makes sense to do so (e.g. cargo-vendor).

impl Responder is using the impl Trait syntax introduced in Rust 1.26 - you can find more details here.

⁴

Check out the release notes of async/await for more details. The talk by withoutboats at Rust LATAM 2019 is another excellent reference on the topic. If you prefer books to talks, check out Futures Explained in 200 Lines of Rust.

⁵

App is a generic struct and some of the types used to parametrise it are private to the actix_web project. It is therefore impossible (or, at least, so cumbersome that I have never succeeded at it) to write a function that returns an instance of App.

⁶

There is a remote chance that the OS ended up picking 8000 as random port and everything worked out smoothly. Cheers to you lucky reader!
