ndarray::Array1

April 07, 2019

5433 words

28 min

I can illustrate the [...] approach with the [...] image of a nut to be opened. The first analogy that came to my mind is of immersing the nut in some softening liquid, and why not simply water? From time to time you rub so the liquid penetrates better, and otherwise you let time pass. The shell becomes more flexible through weeks and months — when the time is ripe, hand pressure is enough, the shell opens like a perfectly ripened avocado! A different image came to me a few weeks ago. The unknown thing to be known appeared to me as some stretch of earth or hard marl, resisting penetration [...] the sea advances insensibly in silence, nothing seems to happen, nothing moves, the water is so far off you hardly hear it [...] yet finally it surrounds the resistant substance. Alexander Grothendieck

Scientific Computing: A Rust adventure (TOC)

Part 0: Vectors
Part 1: Zero-cost abstractions
Part 2: Array1 (this post)

Checkpoint

In the previous episode we introduced several key concepts concerning Rust's type system: generics, traits, operators, associated types, Copy. Empowered by this new knowledge, we were able to write a generic vector product, suitable to manipulate several numerical types under very mild constraints.

In this episode we will finally start using ndarray: we will introduce Array1, its one-dimensional array, and we will spend some time on closures and function traits. Our driving example is going to be numerical integration: we will write a very simple routine to compute the integral of well-behaved real-valued functions over closed 1-dimensional intervals.

Numerical integration

In hindsight, nothing gave me more pleasure as a mathematician than reframing problems. You have this thorny issue you are working on - all techniques you are familiar with lead you nowhere, you are not moving forward. Then you start looking at the structures you are working with diagonally, from a weird angle. You rotate them in your mental eye, until it seems you are hallucinating: your objects looks like something else, something you know, something you have met somewhere else, something familiar, but different. You can reframe your problem using new words, new structures, and those new words carry with them other tools, other theorems, other strategies you can use to finally get to the bottom of your problem. It's thrilling, you feel elated.¹

This blog post will probably fail by a long shot to deliver such a powerful impact, but I promise you we will spending most of our time dealing with a problem that is crucial to a lot of wildly different applications (inside and outside of math): numerical integration.

Two noticeable examples:

ordinary differential equations - they are used to describe a wild variety of physical phenomena and their resolution can be framed as the computation of an integral;
Bayesian inference - (nasty) integrals come in big time when trying to evaluate posterior probability distributions.

We won't be touching any state-of-art algorithm for numerical integration². We will implement an elementary routine - a good occasion to introduce the core focus of this whole series: the ndarray crate. We will also introduce some new Rust language features to take full advantage of what ndarray provides us.

The rectangle rule

Our goal is to compute

$$ \int_a^b{f(x),\text{d}x} $$ where $a$ and $b$ are finite real numbers, $a < b$, and $f$ is a function of one real variable, i.e. $f: \mathbb{R}\to\mathbb{R}$. The simplest approximation method is called rectangle rule (or midpoint rule):

pick an integer $n$;
divide the $[a, b]$ interval into $n$ sub-intervals of equal length; $$ a < a+\frac{1}{n}(b-a) < a+\frac{2}{n}(b-a) < \dots < a+\frac{n-1}{n}(b-a) < a+\frac{n}{n}(b-a)=b $$
for each sub-interval $I_i$, build a rectangle $R_i$ with $I_i$ as basis and the function value in the midpoint of the $I_i$ sub-interval as height, $f(\frac{x_i + x_{i+1}}{2})$ ;
estimate the integral $\int_a^b{f(x)\,\text{d}x}$ using the sum of the areas of all rectangles.

The smaller $n$, the better the estimate. Visually, it looks like this (for increasingly bigger values of $n$):

We can get the formula out with a bit of algebra. Let's call $x_i$ and $x_{i+1}$ the extremal points of the $i$ -th sub-interval, $I_i$ . Then, the area of the corresponding rectangle is $$ Area(R_i) = (x_{i+1}-x_{i})f\left(\frac{x_i + x_{i+1}}{2}\right) $$ We built our sub-intervals by partioning $[a, b]$ in equal parts, hence $$ x_i = a+\frac{i}{n}(b-a) $$ Substituting it in the previous formula, we get: $$ \begin{align} Area(R_i)&=(x_{i+1}-x_{i})f\left(\frac{x_i + x_{i+1}}{2}\right) \ &= \left(a+\frac{i+1}{n}(b-a)-\left(a+\frac{i}{n}(b-a)\right)\right)f\left(\frac{a+\frac{i+1}{n}(b-a)+a+\frac{i}{n}(b-a)}{2}\right) \ &= \frac{(b-a)}{n}f\left(a+\frac{2i+1}{2n}(b-a)\right) \end{align} $$ We can now sum all these areas together to get our integral estimate: $$ \begin{align} \int_a^b{f,\text{d}x}&=\sum_{i=0}^{n-1} Area(R_i) \ &= \sum_{i=0}^{n-1} \frac{(b-a)}{n}f\left(a+\frac{2i+1}{2n}(b-a)\right) \ &= \frac{(b-a)}{n} \sum_{i=0}^{n-1} f\left(a+\frac{2i+1}{2}\frac{(b-a)}{n}\right) \ &= L_n \sum_{i=0}^{n-1} f\left(a+\frac{2i+1}{2}L_n\right) \end{align} $$ where $L_n=\frac{(b-a)}{n}$ is just the length of each sub-interval.

It's now time to turn the midpoint rule into a Rust routine.

Array1

Array1<T> is going to be our foothold into ndarray. Array1<T> is roughly equivalent to Vec<T>: it's a one-dimensional array which owns its elements.

We will point out their differences along the way.

Constructors

We can create an instance of Array1<T> using the array macro:

// We import what we need from the `ndarray` crate
use ndarray::{Array1, array};

// The syntax is exactly the same we have already seen
// for the `vec` macro:
// 
// array![<1st element>, <2nd element>, ..., <last element>]
let a: Array1<i32> = array![0, -1, 2, 3];

Alternatively, we can first build an instance of Vec<T> and then convert it into an instance of Array1<T>:

use ndarray::{Array1, array};


let a: Array1<i32> = array![0, -1, 2, 3];
// We start with a vector
let vector = vec![0, -1, 2, 3];
// `from` takes `vector` by value: `vector` is consumed due to move semantics
// We can't reuse `vector` after having performed this conversion
let b: Array1<i32> = Array1::from(vector);
// It does not matter how we built them - the resulting arrays are identical
assert_eq!(a, b);

The first step in computing $$ \int_a^b{f,\text{d}x} = L_n \sum_{i=0}^{n-1} f\left(a+\frac{2i+1}{2}L_n\right) $$ is generating an Array1<f32> containing the sequence of midpoints, i.e. $$ \left{ a+\frac{2i+1}{2}L_n \right}_{i=0}^{n-1} $$ We can do it using a Vec<f32> as intermediate step:

use ndarray::Array1;

// We take as inputs the variables of our problem:
// - n, the number of sub-intervals,
// - a, the left end of the interval
// - b, the right end of the interval
fn midpoints(n: u32, a: f32, b: f32) -> Array1<f32> {
    // Quick check to ensure our inputs are valid
    assert!(a < b);
    assert!(n > 0);
    // Initialise an empty vec
    let mut v = vec![];
    let length = (b - a) / n;
    for i in 0..n {
        // For each index, we compute the midpoint...
        let midpoint = a + (i + 0.5) * length;
        // ..and accumulate it in `v`
        v.push(midpoint); 
    }
    // We return the conversion result
    Array1::from(v)
}

It looks pretty straight-forward, but the compiler is not fully onboard:

error[E0277]: cannot divide `f32` by `u32`
  --> src/main.rs:14:21
   |
14 |     let length = (b - a) / n;
   |                     ^ 
   |    no implementation for `f32 / u32`
   |
   = help: the trait `std::ops::Div<u32>` is 
           not implemented for `f32`

error[E0277]: cannot add `{float}` to `u32`
  --> src/main.rs:17:31
   |
17 |         let midpoint = a + (i + 0.5) * length;
   |                               ^ 
   |     no implementation for `u32 + {float}`
   |
   = help: the trait `std::ops::Add<{float}>` is
           not implemented for `u32`

Rust does not perform implicit conversions (type casting) for numerical types. If the types involved in the computation are not compatible (i.e. an implementation of the corresponding trait is missing, e.g. std::ops::Add<f32> for u32) the compiler will raise an error. Type casting looks fairly innocent at a first glance, but several subtle bugs are lurking behind the corner if it's approached carelessly - I suggest you to have a look at the relevant section in Rust By Example to get an idea of what scenarios are possible.

In our case, it is sufficient to perform an explicit conversion, using the as keyword. The middle section of our midpoints function becomes:

let length = (b - a) / (n as f32);
for i in 0..n {
    // For each index, we compute the midpoint...
    let midpoint = a + ((i as f32) + 0.5) * length;
    // ..and accumulate it in `v`
    v.push(midpoint); 
}

It compiles just fine now and we have our point sequence in an Array1, as desired. It took a little bit of fiddling though, for what looks like a very common use case: generating a sequence of values from a starting point (a + 0.5 * length) using a fixed step (length) until the endpoint is reached (b - 0.5 * length). We are also allocating memory more often than necessary - we know from the very beginning how many elements we are going to store. Array1<T> has a dedicated method to take care of this very common pattern: it's called range (equivalent to arange in NumPy). The previous snippet becomes:

use ndarray::Array1;

// We take as inputs the variables of our problem:
// - n, the number of sub-intervals,
// - a, the left end of the interval
// - b, the right end of the interval
fn midpoints(n: u32, a: f32, b: f32) -> Array1<f32> {
    // Quick check to ensure our inputs are valid
    assert!(a < b);
    assert!(n > 0);
    // Step size
    let length = (b - a) / (n as f32);
    // Starting point (included in the generated array)
    let start = a + 0.5 * length;
    // End point (excluded from the generated array).
    // The last point in the generated sequence is going to be:
    // end - length = b - 0.5 * length
    let end = b + 0.5 * length;
    // The desired sequence of midpoints
    Array1::range(start, end, length)
}

Transformations

We can now generate the sequence of midpoints for an arbitrary choice of a, b and n - the next step to evaluate $$ \int_a^b{f,\text{d}x} = L_n \sum_{i=0}^{n-1} f\left(a+\frac{2i+1}{2}L_n\right) $$ is computing the value of $f$ in each of those points.

Let's try to do it for $f(x) = sin(x)$. This can be done using the map method and a closure:

// Generate the point sequence using our `midpoints` routine
let points = midpoints(129, 0., 1.);
let f_values = points.map(
    |x| x.sin()
);

map borrows an immutable reference to each element in points (type: &f32), it applies the specified closure to it (|x| x.sin()) and returns a new Array1 instance containing the results of this operation. A closure is an anonymous function, Rust's version of Python's lambdas: the function argument is specified between pipes (i.e. |x|) and the output of the function body (x.sin()) is automatically returned. It is also possible to specify more complex function bodies using a block:

// Generate the point sequence using our `midpoints` routine
let points = midpoints(129, 0., 1.);
let f_values = points.map(
    // The beginning of the block is marked by an open curly brace, `{`
    |x| {
        // This works just like a normal function body, taking `x` as input
        let sin = x.sin();
        let cos = x.cos();
        // The last expression, given that there is no trailing `;`,
        // is automatically returned
        cos + sin
    // The end of the block is marked by a closed curly brace, `}`
    }
);

Summing

We are almost there: we just need to sum our f_values and multiply the result by $L_n=(b-a) / n$. It's as easy as it sounds:

let a = 0.;
let b = 1.;
let n = 129;
let points = midpoints(n, a, b);
let f_values = points.map(
    |x| x.sin()
);
let length = (b - a) / (n as f32);
// `Array1` provides a `sum` method
let integral_estimate = f_values.sum() * length;

Ez.

Have we done a good job?

Done... or not? I'd say it would be wise to:

test it on some easy integrals (for which we can compute the answer analytically);
benchmark its performance.

To move forward with both objectives we need a little bit more flexibility. Our routine should become a function: a, b n and the function we want to integrate (|x| x.sin() so far) have to become parameters.

The first three should be fairly easy at this point:

use ndarray::Array1;

fn midpoints(n: u32, a: f32, b: f32) -> Array1<f32> {
    assert!(a < b);
    assert!(n > 0);
    let length = (b - a) / (n as f32);
    let start = a + 0.5 * length;
    let end = b + 0.5 * length;
    Array1::range(start, end, length)
}

// We take as inputs the variables of our problem:
// - n, the number of sub-intervals,
// - a, the left end of the interval
// - b, the right end of the interval
fn rectangle_rule(n: u32, a: f32, b: f32) -> f32 {
    let points = midpoints(n, a, b);
    let f_values = points.map(
        |x| x.sin()
    );
    let length = (b - a) / (n as f32);
    let integral_estimate = f_values.sum() * length;
    integral_estimate
}

What about the integrand function? What is its type?

Yeah, that's a thorny one. If you have been following closely so far, you know what's coming... a bunch of new traits! 🌟

Function traits

If you want to start passing closures around like a pro you have to get familiar with the Fn traits provided by the standard library. But, first of all, let's give a little bit more context around closures.

So far we have been describing closures as anonymous functions. A syntactic convenience, if you want, that allows you to define a function inline without having to go through the declaration boilerplate. Closures can do way more than that!

Closures can capture variables from the environment they are being defined into. Let's see it in action with an example (borrowed from the Book):

fn main() {
    // A variable defined in the `main` scope
    let x = 4;

    // A closure that takes an input parameter (named `z`)
    // and compares it with `x`... but `x` is not an input
    // parameter of `equal_to_x`!
    // `x` comes from the definition environment oof `equal_to_x`:
    // we are capturing its value inside the closure!
    let equal_to_x = |z| z == x;

    let y = 4;

    assert!(equal_to_x(y));
}

It compiles and the assertion is satisfied. If we try to do the same using a proper function...

fn main() {
    // A variable defined in the `main` scope
    let x = 4;

    // We try to define `equal_to_x` as a function
    // using `x` from the `main` scope in its body
    fn equal_to_x(z: usize) -> bool {
        z == x
    }

    let y = 4;

    assert!(equal_to_x(y));
}

...we get a compiler error:

error[E0434]: can't capture dynamic environment in a fn item
 --> src/main.rs:8:14
  |
8 |         z == x
  |              ^
  |
  = help: use the `|| { ... }` closure form instead

Unsurprisingly, the compiler has seen what we were trying to do and suggested us to use a closure!

How does this fit into Rust's ownership system? The capturing mechanism follows the same rules. A variable in the environment can be captured:

as immutable reference;
as mutable reference;
by value, using move semantics.

For each of these cases, there exists a corresponding trait in the standard library, as we anticipated:

Fn is implemented by closures that can be called multiple times without changing the captured variables (they don't consume the captured variables with move semantics and they don't require them to be mutable);
FnMut is implemented by closures that can be called multiple times with the possibility of mutating the captured variables (they don't consume the captured variables with move semantics but they might require them to be mutable);
FnOnce is implemented by closures that might take some of their captured variables by value, thus consuming them due to move semantics (that's why it's called FnOnce: it there is a move involved, you can only call them once!).

We don't have to implement these traits manually for out closures: Rust infers them automatically based on the way we are manipulating our captured variables.

Let's see a couple of examples:

Fn:

fn main() {
    // A variable defined in the `main` scope
    let x = vec![1, 2, 3];
    // Let's create an exact copy
    let y = x.clone();
    
    // `equal_to_x` implements `Fn`:
    // we are only capturing an immutable reference
    // to a variable from the environment, `x`
    let equal_to_x = |z| z == &x;
    
    // We can call it multiple times...
    equal_to_x(vec![0, 2]);
    equal_to_x(vec![0, 1, 3]);
    equal_to_x(vec![0, 100]);

    // ...and the captured variables is unchanged
    assert_eq!(x, y);
}

FnMut:

fn main() {
    // A mutable variable defined in the `main` scope
    let mut x = vec![1, 2, 3];
    // Let's create an exact copy
    let y = x.clone();
    
    // `push_to_x` implements `FnMut`:
    // we are capturing a mutable reference
    // to a variable from the environment, `x`
    let mut push_to_x = |z| x.push(z);
    
    // We can call it multiple times...
    push_to_x(4);
    push_to_x(5);
    push_to_x(6);

    // ...and the captured variables is changed accordingly
    assert_ne!(x, y);
    assert_eq!(x, vec![1, 2, 3, 4, 5, 6]);
}

FnOnce:

fn main() {
    // A variable defined in the `main` scope
    let x = vec![1, 2, 3];
    
    // `push_to_x` implements `FnOnce`:
    // we are taking ownership of a variable 
    // from the environment, `x`, using the `move`
    // keyword
    let move_x = move || x;
    
    // We can only call it once
    move_x();
}

If we try to call move_x twice we get a compiler error:

error[E0382]: use of moved value: `move_x`
  --> src/main.rs:13:5
   |
12 |     move_x();
   |     ------ value moved here
13 |     move_x();
   |     ^^^^^^ value used here after move
   |
note: closure cannot be invoked more than once because 
it moves the variable `x` out of its environment
  --> src/main.rs:9:26
   |
9  |     let move_x = move || x;
   |                          ^

One important detail before moving on - there is a clear hierarchy between these traits:

all closures that implement Fn automatically implement FnMut and FnOnce;
all closures that implement FnMut automatically implement FnOnce.

All closures implement FnOnce - if you can't call them at least once, why would you even define them? ¯\_(ツ)_/¯

More details on closures can be found in the relevant section in the book.

The `map` method

If you think that was a deluge on information on function traits... well, you are right, it was (and we are not finished yet... 👀). As it always goes, everything will become clearer with practice, if put in the proper context.

Let's see how all of these applies to our integration problem. This is the line where we call our integrand function as a closure to compute its values on our point sequence:

let f_values = points.map(
    |x| x.sin()
);

The map method calls the provided closure on each element of the array (points). It sounds sensible to say that we can't use FnOnce as trait bound (unless we restrict ourselves to work with arrays containing a single element - a curious constraint). With respect to captured variables, I would say that we don't care too much if they are being taken as mutable or immutable references: as long as we can call this closure multiple times, we are good. FnMut seems to be the most reasonable choice at this point.³

How do we actually use FnMut in a function signature? Well, it turns out that you can't write down the type of a closure for reasons (if you are curious, take a look at the deep dive in this article or at Rust's language reference): you have to use a generic type parameter and constrain it using a trait bound (Fn, FnMut and FnOnce are out friends here!)

Let's give it a go with our rectangle_rule:

// We take as inputs the variables of our problem:
// - n, the number of sub-intervals,
// - a, the left end of the interval
// - b, the right end of the interval
// - f, our integrand function, as a closure
// To express `f` as an input parameter we introduce a type parameter, `F`,
// and we constrain `F` to implement the `FnMut` trait.
fn rectangle_rule<F>(n: u32, a: f32, b: f32, f: F) -> f32 
where
    F: FnMut
{
    let points = midpoints(n, a, b);
    let f_values = points.map(f);
    let length = (b - a) / (n as f32);
    let integral_estimate = f_values.sum() * length;
    integral_estimate
}

Will it work?

Nope:

error[E0658]: the precise format of `Fn`-family traits' 
type parameters is subject to change. 
Use parenthetical notation (Fn(Foo, Bar) -> Baz) instead (see issue #29625)
  --> src/main.rs:28:8
   |
28 |     F: FnMut
   |        ^^^^^

error[E0107]: wrong number of type arguments: expected 1, found 0
  --> src/main.rs:28:8
   |
28 |     F: FnMut
   |        ^^^^^ expected 1 type argument

error: aborting due to 2 previous errors

If we put ourselves in the compiler's shoes, its complaint makes perfect sense. Yes, we are passing in a closure. Yes, it can be called multiple times because it implements FnMut. But don't you feel we are missing some key information? What parameters does it take as input? What does it return as output? The usual stuff, the basics, the stuff we forgot about while we were lost in our trait sidestory.

We can fix it real quick: we can use the parenthetical notation, as suggested by the compiler. In other words:

fn rectangle_rule<F>(n: u32, a: f32, b: f32, f: F) -> f32 
where
    // We just have to spell out inputs and outputs,
    // with the same notation used for functions
    F: FnMut(&f32) -> f32

It compiles 🎉

The full code of our example looks like this now:

use ndarray::Array1;

fn midpoints(n: u32, a: f32, b: f32) -> Array1<f32> {
    // Quick check to ensure our inputs are valid
    assert!(a < b);
    assert!(n > 0);
    let length = (b - a) / (n as f32);
    let start = a + 0.5 * length;
    let end = b + 0.5 * length;
    Array1::range(start, end, length)
}

// We take as inputs the variables of our problem:
// - n, the number of sub-intervals,
// - a, the left end of the interval
// - b, the right end of the interval
// - f, our integrand function, as a closure
fn rectangle_rule<F>(n: u32, a: f32, b: f32, f: F) -> f32
where
    F: FnMut(&f32) -> f32
{
    let points = midpoints(n, a, b);
    let f_values = points.map(f);
    let length = (b - a) / (n as f32);
    let integral_estimate = f_values.sum() * length;
    integral_estimate
}

fn main() {
    let integral_estimate = rectangle_rule(129, 0., 1., |x| x.sin());
    println!("Estimate: {:?}.", integral_estimate);
}

It took a bit of study to get here, but the output looks quite readable - good job!

Correctness

It's time to answer the first question about our implementation: is it correct?

The rectangle rule returns us an estimate of the true integral, but it can be proved that if the integrand $f$ is a $\mathcal{C}^2([a, b])$ function (twice differentiable and all its derivatives up to the second order are continuous) then $$ \left| \text{Err}\left(\int_a^b{f}\right) \right| \leq \frac{b-a}{24}L_n^2 \max_{x\in[a,b]}{f''(x)} $$ A rigorous test suite should check that this estimate holds for a variety of functions, taking into account numerical errors due to floating point arithmetics. This, unfortunately, goes beyong the scope of this article.

For our purposes, we'll run rectangle_rule on a couple of smooth functions and check that the result is reasonably close to the analytical integral, for the chosen n. Given that we are dealing with floating numbers, we will take advantage of the approx crate: we can't assert a strict equality between the estimate and the expected result - we need to state how close we require them to be using one of its useful macro, assert_abs_diff_eq.

Let's see a couple of example:

// Import the PI constant from the standard library
use std::f32::consts::PI;
// Machine precision constant for f32
use std::f32::EPSILON;
use approx::assert_abs_diff_eq;

[...]

fn main() {
    let n = 513;

    // The integral of the sine function over its period should be zero
    let expected_sin = 0.;
    let estimate_sin = rectangle_rule(n, -PI, PI, |x| x.sin());
    // We require the estimated integral and the expected integral 
    // to agree up to the sixth decimal digit
    assert_abs_diff_eq!(expected_sin, estimate_sin, epsilon = 1e-6);

    // The integral of a constant, 1, is x. 
    // Given the interval, the result should be 1.
    let expected_linear = 1.;
    let estimate_linear = rectangle_rule(n, 0., 1., |x| 1.);
    // The rectangle rule should be exact for constants.
    // Thus we require the estimated integral and the expected integral
    // to agree up to machine precision
    assert_abs_diff_eq!(expected_linear, estimate_linear, epsilon = EPSILON);
}

Performance

We are now reasonably convinced of the correctness of our implementation. How fast is it?

Let's run a microbenchmark. We want to see how long it takes to compute an estimate of

$$ \int_0^1{sin(x),\text{d}x} $$

using the rectangle rule, with $n=513$.

For Rust, we use criterion⁴. For Python, we use %timeit rectangle_rule(513, 0, 1, np.sin) in a Jupyter notebook - the code is:

import numpy as np

# We assume f is a vectorized function
# Almost all the execution time is spent in NumPy's routines
def rectangle_rule(n, a, b, f):
    length = (b-a)/n
    points = np.arange(a+length/2, b+length/2, length, dtype=np.float32)
    return f(points).sum() * length

The Python version runs in ~9.3 microseconds. The Rust version takes ~2.2 microseconds - roughly 4 times faster.

Well, it feels good. What happens if we increase the number of sub-intervals? Let's say $n=50000$.

The Python version runs in 224 ± 7 microseconds, while the Rust version takes 231 ± 6 microseconds: we lost most of our edge - it seems that the overhead of calling NumPy's C code was significant for small arrays, but not so much for bigger ones.

Is there anything we can do to make our code run faster?

Let's have a second look at our function:

fn rectangle_rule<F>(n: u32, a: f32, b: f32, f: F) -> f32
where
    F: FnMut(&f32) -> f32
{
    let points = midpoints(n, a, b);
    // Map applies `f` to all elements in `points`
    // Then allocates a new array to hold the results
    let f_values = points.map(f);
    let length = (b - a) / (n as f32);
    let integral_estimate = f_values.sum() * length;
    integral_estimate
}

map is very cool, but it's creating a whole new array to hold the results of f applied to points' elements. Let's have a look at a couple more methods provided by ndarray:

map_mut takes a mutable reference to each element in the array (type: &mut f32) and it creates a new array to hold the results;
mapv takes each element in the array by value (type: f32) and it creates a new array to hold the results;
mapv_into takes each element in the array by value (type: f32) and it uses the original array to store the results, overriding the inputs. No additional memory is allocated.

mapv_into comes with an additional restriction: the output type of the closure has to be same of its input type, in order to be sure that we can reuse the input memory slot to store the result of the computation (e.g. an f64 does not fit into the 32 bit of an f32, as you can imagine).

We satisfy both requirements of mapv_into:

our closure takes an f32 value as input and it returns an f32 value;
computing f_values is all we need points for.

We can thus use mapv_into instead of map: points is a temporary variable which is never returned or seen by the caller of rectangle_rule - we can override its elements and avoid performing any additional memory allocation. The new code looks like this:

fn rectangle_rule<F>(n: u32, a: f32, b: f32, f: F) -> f32
where
    // The input is now `f32`, not `&f32`
    F: FnMut(f32) -> f32
{
    let points = midpoints(n, a, b);
    // points is consumed and its memory is reused to hold `f_values`
    let f_values = points.mapv_into(f);
    let length = (b - a) / (n as f32);
    let integral_estimate = f_values.sum() * length;
    integral_estimate
}

How does it affect our performance? The difference for $n=513$ is almost negligible, but for $n=50000$ we see a substantial speed-up:

The plot is a courtesy of criterion - it shows the estimated probability distribution (pdf, in the legend) of running times of our previous version (in red, using map) compared to the new version (in blue, using mapv_into). The average running time is now 205.71 ± 5.5 microseconds, slightly better than NumPy's 224 ± 7 microseconds 🚀

NumPy and SciPy are extremely valuable: they provide a high quality point of reference that can be used to check both correctness and performance of alternative (Rust) implementations. They push us to do better - that's precious.

Wrapping up

We started with a very simple task (the rectangle rule for integrals) and we managed to learn something new at every road block:

we started to touch ndarray with Array1, using a couple of constructors and a few methods;
map has been our gateway drug into the world of closures and function traits (definitely a tricky corner of the language, you should feel a bit of pride at this point!);
benchmarking against NumPy's has forced us to take a closer look at our code to identify opportunities where we could have been smarter in managing our resources - good training for our optimization eye!

It feels as if we are constantly breaking new ground, but we are slowly cementing the road behind us. In this episode we took advantage of everything we introduced in the previous ones: generic functions, trait bounds, ownership - you name it. Practice makes perfect: the more Rust we write, the more familiar these concepts will become.

In the next episode we will get serious with ndarray: we'll uncover the true nature of Array1, its connection to ArrayBase. We will thus discover how dimensions and ownership are expressed in ndarray. Exciting times ahead!

Notes, suggestions and feedbacks to improve this introduction are more than welcome: you can reach me on Twitter at @algo_luca.

Scientific Computing: A Rust adventure (TOC)

Part 0: Vectors
Part 1: Zero-cost abstractions
Part 2: Array1 (this post)

The last time this happened was when I was reading Neural Ordinary Differential Equations. If you have even a mild interest in neural networks, check it out - it's one of the most peculiar paper I happened to stumble upon in the area in the last couple of years. ²: If you are itching to flex your Rust skills, try to port SciPy's integration routines to Rust! ndarray-odeint focuses on solving ordinary differential equations, but a standalone crate dedicated to numerical integration is still missing. ³: You will not be surprised to find out that map uses exactly the same constraint on its input closure - you can check it out in ndarray's documentation, here. ⁴: You can look at the code used for benchmarking here.

ndarray::Array1

Scientific Computing: A Rust adventure (TOC)

Checkpoint

Numerical integration

The rectangle rule

Array1

Constructors

Transformations

Summing

Have we done a good job?

Function traits

The map method

Correctness

Performance

Wrapping up

Scientific Computing: A Rust adventure (TOC)

The `map` method