Pavex DevLog #6: designing safe and ergonomic middlewares

September 01, 2023

3388 words

17 min

👋 Hi!
It's Luca here, the author of "Zero to production in Rust".
This is progress report about Pavex, a new Rust web framework that I have been working on. It is currently in the early stages of development, working towards its first alpha release.

Check out the announcement post to learn more about the vision!

Overview

It's time for another progress report on Pavex, covering what has been done in July and August!

I've been hard at work with one objective: adding middleware support to Pavex.
It's far from polished at this point, but it (finally) works 🚀

I'll use this report as a chance to deep-dive on a few topics:

Why middleware support is a key requirement for production-readiness
The challenges of designing a middleware system for a Rust web framework
Pavex's middleware design
The limitations of our current implementation

But if you're short on time, here's a simple timeout middleware to showcase what a middleware looks like in Pavex:

use pavex::{middleware::Next, response::Response};
use std::future::IntoFuture;
use tokio::time::{timeout, error::Elapsed};

pub async fn timeout_middleware<C>(
    // A handle on the rest of the processing pipeline for the incoming
    // request. 
    //
    // Middlewares can choose to short-circuit the execution (i.e. 
    // return an error and abort the processing) or perform some 
    // computation and then delegate to `next` by awaiting it.
    //
    // All middlewares in Pavex *must* take `Next<_>` as an input.
    next: Next<C>,
    // Middlewares can take advantage of dependency injection! 
    // You just list what inputs you need and the framework will provide them
    // (if possible, or return a nice error at *compile-time* if not).
    //
    // `TimeoutConfig` could be defined at start-up time, sharing the same 
    // value for all routes, or it could be customised at runtime on a 
    // per-request basis (e.g. to provide a configurable quality of service
    // depending on the pricing plan of the client issueing the request).
    // `timeout_middleware` doesn't care how or when `TimeoutConfig`
    // gets computed. 
    // Happy decoupling!
    config: TimeoutConfig
) -> Result<Response, Elapsed>
where
    C: IntoFuture<Output = Response>
{
    timeout(config.request_timeout, next.into_future()).await
}

You can then add this middleware to your request chain by calling Blueprint::wrap:

use pavex::{blueprint::Blueprint, f};

pub fn api() -> Blueprint {
    let mut bp = Blueprint::new();
    // [...]
    bp.wrap(f!(crate::timeout_middleware));
    bp
}

Pretty straight-forward, isn't it?

You can discuss this update on r/rust.

Middlewares

Why do we need middlewares?

Every successful project grows in complexity over time.
What started out as a couple of API endpoints often evolves into a large application, with tens if not hundreds of request handlers.

While the business logic may vary wildly from one request handler to the next, some concerns are usually shared. To mention a few common examples:

Enforcing a timeout on the processing of all incoming requests.
We don't want to waste server resources on requests that the caller has already given up on. It also protects us from a certain type of denial of service attacks.
Ensuring that the caller is authenticated.
Authorization logic might be deeply entertwined with the business logic, but we usually deploy the same kind of authentication (e.g. API tokens, request signing) across all sensitive endpoints exposed by an API.
Emitting telemetry data.
There should be at least one log record for each incoming request, capturing some basic HTTP-level information (e.g. path, method, response status code, etc.) on top of some business-level attributes you might care about. You might also have metrics that need to be updated for every request (e.g. to track error rate!).

The challenge is twofold:

We don't want to duplicate this logic in each of our request handlers.
We want to make sure that we are using the same logic for all routes.

You could satisfy 1. by extracting the common logic in a function and invoking it in all your request handlers. But that won't cut it for 2.: every time you're adding a new request handler, you need to actively remember to add those invocations.

That's the challenge that the middleware pattern tries to solve. We no longer deal with those concerns inside the bodies of our request handlers. We invoke the relevant logic either before or after the request handlers kick in.
All middlewares are declared once, centrally, and that's the only place we need to look at if we want to make changes. Every time a new route is added, it is automatically covered by the existing middlewares for that set of paths.

The shape of a middleware

What do middlewares look like, in pratice?
The details vary depending on the programming language and the specific framework you're looking at, but middleware interfaces broadly fall into two categories:

Lifecycle hooks
Pipeline wrappers

When it comes to Rust, pipeline wrappers are fundamentally more capable than lifecycle hooks. I'll quickly show you the two interfaces and explain why.

Lifecycle hooks

You register a function or a method with the framework, asking for it to be invoked at a specific point in the request lifecycle—e.g. before the request handler is invoked, or after. These are often referred to as "callbacks" in some ecosystems.

The details may vary depending on the framework, but a reference interface would look somewhat like this in Rust:

trait IncomingRequestHook {
    async fn invoke(&self, request: &mut Request) -> Outcome;
}

pub enum Outcome {
    ContinueProcessing,
    EarlyReturn(Response),
}

I've ignored the fallibility angle for simplicity.

The key detail: the invoke method has no way to directly manipulate (or invoke) the rest of the processing pipeline (i.e. later middlewares or the request handler itself). It makes the decision, but defers the execution to the framework machinery.

Pipeline wrappers

Pipeline wrappers are the exact opposite: they wrap around the remaining part of the processing pipeline, and they are fully in control of its invocation (or lack thereof).
In pseudo-code, ignoring fallibility and other details again:

trait WrappingMiddleware {
    async fn wrap(&self, request: Request, next: Next) -> Response;
}

Hooks have no `Future`s

How do you go about implementing a timeout middleware? Or a logging middleware that keeps track of the actual processing time spent executing your business logic?

You necessarily need access to the Future that represents the computation that you want to decorate.
If you want to invoke tokio::time::timeout on your processing pipeline, you need to pass it as input the Future that's going to drive that processing pipeline to completion:

timeout(timeout_duration, next_future).await

You can't implement the same logic using a lifecycle hook, since it doesn't get access to that Future type.
The same applies to logging: to attach a tracing Span to your pipeline, you need to be able to call instrument on its Future:

next_future.instrument(my_span).await

Lifecycle hooks fall short once again.

As a framework author, it leaves you with two options:

Lift all the functionality that requires access to that Future into the framework
Empower users to write their own solutions, using wrapping-style middlewares

If you look at the ecosystem, you find wrapping-style middlewares in all major Rust frameworks:

Transform and Service in actix-web
Layer and Service in axum
Middleware and Endpoint in poem

rocket is the only exception, using the callback-style with its Fairings. As a consequence, both logging and timeouts have to be provided as framework built-ins. As far as I could see, there is no way to customize them or bring your own.

What about Pavex?
The goal is to build a batteries-included framework, providing most of the functionality you need to build a production-ready application as first-party code. At the same time, you shouldn't be forced to use our solutions. Depending on your requirements, they might be inadequate.
Wrapping-style is the way forward for Pavex: you'll be able to use our solutions, but chip in with your own if needed.

Interface design

We have settled on an overall approach, but the devil is in the details. What does the middleware interface actually look like?

You have to make three consequential choices:

Overhead. Does the interface need to be a zero-cost abstraction?
Generics. How many generic parameters are too many?
State. How does the middleware access shared state?

Overhead

Rust's async story is still maturing. As of today, we lack first-class support for async functions in traits.
You need to fully spell it out to the compiler: you have write to a synchronous function that returns a type that implements the Future trait.

You can see what this looks like in tower's Service trait definition:

pub trait Service<Request> {
    type Response;
    type Error;
    type Future: Future
    where
        <Self::Future as Future>::Output == Result<Self::Response, Self::Error>;

    fn call(&mut self, req: Request) -> Self::Future;

    // [...]
}

A long-winded way to write async fn call (&mut self, req: Request) -> Result<Self::Response, Self::Error>, the best you can do today in stable Rust if you don't want to compromise on performance.
As an alternative, you could use the async_trait crate: its #[async_trait] lets you write "async" function in traits today:

#[async_trait::async_trait]
pub trait Service<Request> {
    type Response;
    type Error;

    async fn call(&mut self, req: Request) -> Result<Self::Response, Self::Error>;

    // [...]
}

☝️ works in stable Rust, today. But it comes at a cost—it automatically adds a layer of indirection, since it automatically Boxes the async body of your call implementation.
That indirection has a performance impact—negligible in most cases, but nonetheless there. The most popular Rust web frameworks strive to have no overhead whatsoever, therefore they shy away from solutions that involve auto-boxing. They might offer it as a "simplified" interface, alongside the zero-overhead one. See axum's from_fn or actix-web's from_fn.

There's another constraint though: the Future returned by your method must be nameable. What's the name of the future returned by:

async fn handler(request: Request) -> Response {
    // [...]
}

It isn't Response. We can say, at best, impl Future<Output=Response>, but you can't use that as an associated type in a trait implementation! This restriction will eventually be lifted as well (see this RFC), but it often forces you to box anyway even if you're implementing the zero-overhead version of an async trait.

Generics

Let's assume that async functions in traits have been shipped and we can enjoy a "slimmer" version of tower's Service trait:

pub trait Service<Request> {
    type Response;
    type Error;

    async fn call(&mut self, req: Request) -> Result<Self::Request, Self::Error>;

    // [...]
}

For each middleware implementation, we need to make three choice:

What bounds should we enforce on the generic Request parameter?
What should our Response type be?
What should our Error type be?

Let's compare it with the Middleware trait from tide:

pub trait Middleware<State>: Send + Sync + 'static {
    async fn handle(&self, request: Request<State>, next: Next<'_, State>) -> Result<Response, Error>;
    // [...]
}

Ignore the State parameter, we'll get to state management in the next section.
Apart from State, there are no generics: you take a concrete Request type as input and are expected to return a concrete Response type or a concrete Error type—all defined by the framework.

It's definitely more intuitive: you can just focus on the actual middleware logic rather than fidgeting around with trait bounds. Ergonomicity comes at a cost though: in order to use a concrete Response type, tide forces you to Box the body of your response.

Tradeoffs, tradeoffs everywhere.

State

Last but not least: state!
There are two types of state in a request processing pipeline:

Application state.
It's usually built before starting the application and it's long-lived: it stays around until the application shuts down.
Example: a database connection pool.
Request-scoped state.
It is computed on a per-request basis and it's used by multiple steps of the request processing pipeline.
Example: the session state for a logged-in user, extracted in a middleware and later used in the request handler itself.

axum and tide follow the same approach when it comes to application state: it becomes a generic parameter on the overall application type. They lean on the compiler to make sure that middlewares and request handlers expect the "right type." It comes with a downside: you need to have a global type to represent the entirety of the state. Not too bad, if you ask my opinion.

Request-scoped state is another story though.
Most (all?) major Rust web frameworks throw compile-time checks out of the window and rely on a request-scoped type map (often called "request extensions") for managing request-scoped state. At a glance, it looks like this: HashMap<TypeId, Box<dyn Any + [...]>, [...]>.
When you build a piece of request-local state that you want to pass down (or up) the processing chain, you insert it into the map.
When you need a piece of request-local state that you haven't built yourself, you get it from the map—and hope really hard that it's there.

Request extensions are the bane of my existence when working on backend applications in Rust: they are the most common (and annoying) source of bugs when trying to add a new middleware or updating your existing stack. The coupling they create is hidden to the compiler—you only find out at runtime if you misconfigured something.

You need to carefully read the docs of each middleware you are planning to use in order to make sure that you have installed (and applied in the correct order) any other middleware they might depend on.

I'm strongly determined to have no untyped side-channel for state in Pavex.

Pavex's design

At this stage, I hope I convinced you that there are no easy choices when it comes to designing a middleware interface.
But enough with the excuses—I'll walk you through the design I landed on for Pavex.

Look ma, no traits!

We side-step the trait problem entirely: Pavex relies on its own compile-time reflection engine to determine what fits or doesn't fit its interfaces.
These are the requirements:

The middleware is an asynchronous function (or method).
It takes as input Next<C>, a handle on the request pipeline it's wrapping.
It either returns a pavex::Response or a Result<pavex::Response, E>, where E is an error type.
It can take additional input parameters, as long as you register a constructor for them.

Going back to our timeout example, this is what a timeout middleware looks like in Pavex:

use pavex::{middleware::Next, response::Response};
use std::future::IntoFuture;
use tokio::time::{timeout, error::Elapsed};

pub async fn timeout_middleware<C>(next: Next<C>, config: TimeoutConfig) -> Result<Response, Elapsed>
where
    C: IntoFuture<Output = Response>
{
    timeout(config.request_timeout, next.into_future()).await
}

Explicit state dependencies

Let's zoom in on TimeoutConfig, our second input parameter: is it request-scoped? Or is it part of the application state?
It doesn't matter for Pavex: all dependencies on a state (no matter the lifecycle) must be encoded in the signature of either middlewares, request handlers or other constructors.
If you forget to register a constructor for a piece of state, Pavex will remind you about it at compile-time:

ERROR:
  × I can't invoke your wrapping middleware, `timeout`, because it needs an instance of
  │ `TimeoutConfig` as input, but I can't find a constructor for that type.
  │
  │     ╭─[src/blueprint.rs:18:1]
  │  18 │
  │  19 │     bp.wrap(f!(crate::timeout));
  │     ·             ────────┬────────
  │     ·                     ╰── The wrapping middleware was registered here
  │  20 │
  │     ╰────
  │    ╭─[src/load_shedding.rs:5:1]
  │  5 │
  │  6 │ pub async fn timeout<T>(next: Next<T>, timeout_config: TimeoutConfig) -> Response
  │    ·                                                        ──────┬──────
  │    ·                I don't know how to construct an instance of this input parameter
  │    ╰────
  │   help: Register a constructor for `TimeoutConfig`

Fixing the error is easy enough:

use pavex::{
    blueprint::{Blueprint, constructor::Lifecycle}, 
    f
};

pub fn api() -> Blueprint {
    let mut bp = Blueprint::new();
    // [...]
    // Register a constructor for `TimeoutConfig` and declare
    // it is going to live as long as the application itself,
    // a singleton!
    bp.constructor(
        f!(crate::TimeoutConfig::from_config),
        Lifecycle::Singleton,
    );
    bp
}

If you prefer to determine the timeout on a per-request basis, you can register a constructor with a RequestScoped lifecycle instead:

use pavex::{
    blueprint::{Blueprint, constructor::Lifecycle}, 
    f
};

pub fn api() -> Blueprint {
    let mut bp = Blueprint::new();
    // [...]
    // Register a constructor for `TimeoutConfig` and declare
    // it is going to be scoped to a request lifecycle.
    bp.constructor(
        f!(crate::TimeoutConfig::from_request),
        Lifecycle::RequestScoped,
    );
    bp
}

This design isn't without downsides: you can't "create" state from inside a middleware and then share it with a downstream step of the processing pipeline.
You are forced to extract the creation of that state into a separate constructor with a RequestScoped lifecycle and let the framework inject it as an input parameter in both locations. I believe it's going to prove a useful forcing function towards isolated components with well-defined responsibilities, but we'll only know when Pavex is taken out for a spin by a few users.

Concrete types over generics

The eagle-eyed among you will have noticed our last design choice along the axes we examined: Pavex uses a concrete response type, pavex::response::Response. As discussed previously, this introduces a level of indirection—the response body is always boxed.
The type of applications I'm designing for in Pavex (backend-for-the-frontend, enterprise microservices, monoliths) rarely suffer from performance issues due to an extra level of pointer indirection: you need to look at data modelling, database access patterns, caching, etc.
On this topic, I lean towards getting a productivity boost rather than raising the performance ceiling by a millimeter.

What's next

The current middleware implementation is definitely an MVP. In particular:

We currently box the Future that represents the rest of the request processing pipeline in Next<_>.
The concrete type we generate for C in Next<C> (one for each route+middleware pair) can't take any lifetime parameter at the moment. This prevents you from working with state that borrows from the incoming request or the long-lived application state.

Both of these are limitations of the current implementation: they made it easier to get to a working prototype, but they aren't fundamental design defects. I'll be improving the code generator to remove these issues over the coming month.

Once 2. is solved though, we'll hit another problem: you can't borrow non-static data across an await point if you're using hyper in its "default" configuration (i.e. work-stealing multi-threaded HTTP server). I'll be moving Pavex towards a thread-per-core design to lift this restriction—I'll explain the rationale further in the future.

That's all for August, see you next month!