# Scientific computing: a Rust adventure [Part 2 - Array1]

April 2019 · 26 minute read

> I can illustrate the […] approach with the […] image of a nut to be opened.
>
> The first analogy that came to my mind is of immersing the nut in some softening liquid, and why not simply water? From time to time you rub so the liquid penetrates better, and otherwise you let time pass. The shell becomes more flexible through weeks and months — when the time is ripe, hand pressure is enough, the shell opens like a perfectly ripened avocado!
>
> A different image came to me a few weeks ago. The unknown thing to be known appeared to me as some stretch of earth or hard marl, resisting penetration […] the sea advances insensibly in silence, nothing seems to happen, nothing moves, the water is so far off you hardly hear it […] yet finally it surrounds the resistant substance.
>
> — Alexander Grothendieck

## Checkpoint

In the previous episode we introduced several key concepts concerning Rust’s type system: generics, traits, operators, associated types, Copy. Empowered by this new knowledge, we were able to write a generic vector product, suitable to manipulate several numerical types under very mild constraints.

In this episode we will finally start using ndarray: we will introduce Array1, its one-dimensional array type, and we will spend some time on closures and function traits. Our driving example is going to be numerical integration: we will write a very simple routine to compute the integral of well-behaved real-valued functions over closed 1-dimensional intervals.

## Numerical integration

In hindsight, nothing gave me more pleasure as a mathematician than reframing problems.
You have this thorny issue you are working on - all techniques you are familiar with lead you nowhere, you are not moving forward.
Then you start looking at the structures you are working with diagonally, from a weird angle. You rotate them in your mental eye, until it seems you are hallucinating: your objects look like something else, something you know, something you have met somewhere else, something familiar, but different.
You can reframe your problem using new words, new structures, and those new words carry with them other tools, other theorems, other strategies you can use to finally get to the bottom of your problem.
It’s thrilling, you feel elated.[1]

This blog post will probably fail by a long shot to deliver such a powerful impact, but I promise you we will be spending most of our time dealing with a problem that is crucial to a lot of wildly different applications (inside and outside of math): numerical integration.


We won’t be touching any state-of-the-art algorithm for numerical integration.[2] We will implement an elementary routine - a good occasion to introduce the core focus of this whole series: the ndarray crate. We will also introduce some new Rust language features to take full advantage of what ndarray provides us.

## The rectangle rule

Our goal is to compute

$$\int_a^b{f(x)\,\text{d}x}$$
where $a$ and $b$ are finite real numbers, $a < b$, and $f$ is a function of one real variable, i.e. $f: \mathbb{R}\to\mathbb{R}$.
The simplest approximation method is called the rectangle rule (or midpoint rule):

• pick an integer $n$;
• divide the $[a, b]$ interval into $n$ sub-intervals of equal length;
$$a < a+\frac{1}{n}(b-a) < a+\frac{2}{n}(b-a) < \dots < a+\frac{n-1}{n}(b-a) < a+\frac{n}{n}(b-a)=b$$
• for each sub-interval $I_i$, build a rectangle $R_i$ with $I_i$ as basis and the function value at the midpoint of $I_i$ as height, $f(\frac{x_i + x_{i+1}}{2})$;
• estimate the integral $\int_a^b{f(x)\,\text{d}x}$ using the sum of the areas of all rectangles.

The bigger $n$, the better the estimate.
Visually, it looks like this (for increasingly bigger values of $n$):

We can get the formula out with a bit of algebra. Let’s call $x_i$ and $x_{i+1}$ the extremal points of the $i$-th sub-interval, $I_i$.
Then, the area of the corresponding rectangle is
$$Area(R_i) = (x_{i+1}-x_{i})f\left(\frac{x_i + x_{i+1}}{2}\right)$$
We built our sub-intervals by partitioning $[a, b]$ in equal parts, hence
$$x_i = a+\frac{i}{n}(b-a)$$
Substituting it in the previous formula, we get:
\begin{align} Area(R_i)&=(x_{i+1}-x_{i})f\left(\frac{x_i + x_{i+1}}{2}\right) \\ &= \left(a+\frac{i+1}{n}(b-a)-\left(a+\frac{i}{n}(b-a)\right)\right)f\left(\frac{a+\frac{i+1}{n}(b-a)+a+\frac{i}{n}(b-a)}{2}\right) \\ &= \frac{(b-a)}{n}f\left(a+\frac{2i+1}{2n}(b-a)\right) \end{align}
We can now sum all these areas together to get our integral estimate:
\begin{align} \int_a^b{f\,\text{d}x}&\approx\sum_{i=0}^{n-1} Area(R_i) \\ &= \sum_{i=0}^{n-1} \frac{(b-a)}{n}f\left(a+\frac{2i+1}{2n}(b-a)\right) \\ &= \frac{(b-a)}{n} \sum_{i=0}^{n-1} f\left(a+\frac{2i+1}{2}\frac{(b-a)}{n}\right) \\ &= L_n \sum_{i=0}^{n-1} f\left(a+\frac{2i+1}{2}L_n\right) \end{align}
where $L_n=\frac{(b-a)}{n}$ is just the length of each sub-interval.
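As a quick sanity check (numbers mine, not from the derivation above), take $f(x) = x^2$ over $[0, 1]$ with $n = 2$: the midpoints are $0.25$ and $0.75$, and $L_2 = 0.5$, so

\begin{align} \int_0^1{x^2\,\text{d}x} &\approx L_2\left(f(0.25)+f(0.75)\right) \\ &= 0.5\,(0.0625 + 0.5625) = 0.3125 \end{align}

against the exact value $\frac{1}{3} \approx 0.3333$ - already in the right ballpark with just two rectangles, and the estimate improves as $n$ grows.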

It’s now time to turn the midpoint rule into a Rust routine.

## Array1

Array1<T> is going to be our foothold into ndarray.
Array1<T> is roughly equivalent to Vec<T>: it’s a one-dimensional array which owns its elements.

We will point out their differences along the way.

### Constructors

We can create an instance of Array1<T> using the array macro:

```rust
// We import what we need from the ndarray crate
use ndarray::{Array1, array};

// The syntax is exactly the same we have already seen
// for the vec macro:
//
// array![<1st element>, <2nd element>, ..., <last element>]
let a: Array1<i32> = array![0, -1, 2, 3];
```

Alternatively, we can first build an instance of Vec<T> and then convert it into an instance of Array1<T>:

```rust
use ndarray::{Array1, array};

let a: Array1<i32> = array![0, -1, 2, 3];
let vector = vec![0, -1, 2, 3];
// from takes vector by value: vector is consumed due to move semantics.
// We can't reuse vector after having performed this conversion
let b: Array1<i32> = Array1::from(vector);
// It does not matter how we built them - the resulting arrays are identical
assert_eq!(a, b);
```

The first step in computing
$$\int_a^b{f\,\text{d}x} \approx L_n \sum_{i=0}^{n-1} f\left(a+\frac{2i+1}{2}L_n\right)$$
is generating an Array1<f32> containing the sequence of midpoints, i.e.
$$\left\{ a+\frac{2i+1}{2}L_n \right\}_{i=0}^{n-1}$$
We can do it using a Vec<f32> as intermediate step:

```rust
use ndarray::Array1;

// We take as inputs the variables of our problem:
// - n, the number of sub-intervals,
// - a, the left end of the interval
// - b, the right end of the interval
fn midpoints(n: u32, a: f32, b: f32) -> Array1<f32> {
    // Quick check to ensure our inputs are valid
    assert!(a < b);
    assert!(n > 0);
    // Initialise an empty vec
    let mut v = vec![];
    let length = (b - a) / n;
    for i in 0..n {
        // For each index, we compute the midpoint...
        let midpoint = a + (i + 0.5) * length;
        // ..and accumulate it in v
        v.push(midpoint);
    }
    // We return the conversion result
    Array1::from(v)
}
```

It looks pretty straightforward, but the compiler is not fully on board:

```
error[E0277]: cannot divide f32 by u32
  --> src/main.rs:14:21
   |
14 |     let length = (b - a) / n;
   |                     ^
   |    no implementation for f32 / u32
   |
   = help: the trait std::ops::Div<u32> is
           not implemented for f32

error[E0277]: cannot add {float} to u32
  --> src/main.rs:17:31
   |
17 |         let midpoint = a + (i + 0.5) * length;
   |                               ^
   |     no implementation for u32 + {float}
   |
   = help: the trait std::ops::Add<{float}> is
           not implemented for u32
```

Rust does not perform implicit conversions (type casting) for numerical types.
If the types involved in the computation are not compatible (i.e. an implementation of the corresponding trait is missing, e.g. std::ops::Add<f32> for u32) the compiler will raise an error.
Type casting looks fairly innocent at first glance, but several subtle bugs are lurking around the corner if it’s approached carelessly - I suggest you have a look at the relevant section in Rust By Example to get an idea of what scenarios are possible.
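To get a feel for those pitfalls, here is a small sketch (examples mine, not taken from Rust By Example) of conversions where `as` silently loses information:

```rust
fn main() {
    // `as` never fails at runtime, but it can silently lose information.
    // An out-of-range integer cast wraps around:
    assert_eq!(300_i32 as u8, 44); // 300 mod 256
    // A negative value reinterpreted as unsigned becomes huge:
    assert_eq!(-1_i32 as u32, u32::MAX);
    // A float-to-integer cast truncates towards zero, it does not round:
    assert_eq!(2.9_f32 as i32, 2);
}
```

None of these casts raises an error or a warning - which is exactly why they deserve a second look in numerical code.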

In our case, it is sufficient to perform an explicit conversion, using the as keyword. The middle section of our midpoints function becomes:

```rust
let length = (b - a) / (n as f32);
for i in 0..n {
    // For each index, we compute the midpoint...
    let midpoint = a + ((i as f32) + 0.5) * length;
    // ..and accumulate it in v
    v.push(midpoint);
}
```

It compiles just fine now and we have our point sequence in an Array1, as desired.
It took a little bit of fiddling though, for what looks like a very common use case: generating a sequence of values from a starting point (a + 0.5 * length) using a fixed step (length) until the endpoint is reached (b - 0.5 * length).
We are also allocating memory more often than necessary - we know from the very beginning how many elements we are going to store.
Array1<T> has a dedicated method to take care of this very common pattern: it’s called range (equivalent to arange in NumPy). The previous snippet becomes:

```rust
use ndarray::Array1;

// We take as inputs the variables of our problem:
// - n, the number of sub-intervals,
// - a, the left end of the interval
// - b, the right end of the interval
fn midpoints(n: u32, a: f32, b: f32) -> Array1<f32> {
    // Quick check to ensure our inputs are valid
    assert!(a < b);
    assert!(n > 0);
    // Step size
    let length = (b - a) / (n as f32);
    // Starting point (included in the generated array)
    let start = a + 0.5 * length;
    // End point (excluded from the generated array).
    // The last point in the generated sequence is going to be:
    // end - length = b - 0.5 * length
    let end = b + 0.5 * length;
    // The desired sequence of midpoints
    Array1::range(start, end, length)
}
```

### Transformations

We can now generate the sequence of midpoints for an arbitrary choice of a, b and n - the next step to evaluate
$$\int_a^b{f\,\text{d}x} \approx L_n \sum_{i=0}^{n-1} f\left(a+\frac{2i+1}{2}L_n\right)$$
is computing the value of $f$ in each of those points.

Let’s try to do it for $f(x) = sin(x)$.
This can be done using the map method and a closure:

```rust
// Generate the point sequence using our midpoints routine
let points = midpoints(129, 0., 1.);
let f_values = points.map(
    |x| x.sin()
);
```

map borrows an immutable reference to each element in points (type: &f32), applies the specified closure to it (|x| x.sin()) and returns a new Array1 instance containing the results of this operation.
A closure is an anonymous function, Rust’s version of Python’s lambdas: the function argument is specified between pipes (i.e. |x|) and the output of the function body (x.sin()) is automatically returned.
It is also possible to specify more complex function bodies using a block:

```rust
// Generate the point sequence using our midpoints routine
let points = midpoints(129, 0., 1.);
let f_values = points.map(
    // The beginning of the block is marked by an open curly brace, {
    |x| {
        // This works just like a normal function body, taking x as input
        let sin = x.sin();
        let cos = x.cos();
        // The last expression, given that there is no trailing ;,
        // is automatically returned
        cos + sin
    // The end of the block is marked by a closed curly brace, }
    }
);
```

### Summing

We are almost there: we just need to sum our f_values and multiply the result by $L_n=(b-a) / n$.
It’s as easy as it sounds:

```rust
let a = 0.;
let b = 1.;
let n = 129;
let points = midpoints(n, a, b);
let f_values = points.map(
    |x| x.sin()
);
let length = (b - a) / (n as f32);
// Array1 provides a sum method
let integral_estimate = f_values.sum() * length;
```

Ez.

## Have we done a good job?

Done… or not?
I’d say it would be wise to:

• test it on some easy integrals (for which we can compute the answer analytically);
• benchmark its performance.

To move forward with both objectives we need a little bit more flexibility. Our routine should become a function: a, b, n and the function we want to integrate (|x| x.sin() so far) have to become parameters.

The first three should be fairly easy at this point:

```rust
use ndarray::Array1;

fn midpoints(n: u32, a: f32, b: f32) -> Array1<f32> {
    assert!(a < b);
    assert!(n > 0);
    let length = (b - a) / (n as f32);
    let start = a + 0.5 * length;
    let end = b + 0.5 * length;
    Array1::range(start, end, length)
}

// We take as inputs the variables of our problem:
// - n, the number of sub-intervals,
// - a, the left end of the interval
// - b, the right end of the interval
fn rectangle_rule(n: u32, a: f32, b: f32) -> f32 {
    let points = midpoints(n, a, b);
    let f_values = points.map(
        |x| x.sin()
    );
    let length = (b - a) / (n as f32);
    let integral_estimate = f_values.sum() * length;
    integral_estimate
}
```

The fourth parameter, our integrand function, is another story. What is its type?

Yeah, that’s a thorny one.
If you have been following closely so far, you know what’s coming… a bunch of new traits! 🌟

### Function traits

If you want to start passing closures around like a pro you have to get familiar with the Fn traits provided by the standard library.
But, first of all, let’s give a little bit more context around closures.

So far we have been describing closures as anonymous functions. A syntactic convenience, if you want, that allows you to define a function inline without having to go through the declaration boilerplate.
Closures can do way more than that!

Closures can capture variables from the environment they are being defined into.
Let’s see it in action with an example (borrowed from the Book):

```rust
fn main() {
    // A variable defined in the main scope
    let x = 4;

    // A closure that takes an input parameter (named z)
    // and compares it with x... but x is not an input
    // parameter of equal_to_x!
    // x comes from the definition environment of equal_to_x:
    // we are capturing its value inside the closure!
    let equal_to_x = |z| z == x;

    let y = 4;

    assert!(equal_to_x(y));
}
```

It compiles and the assertion is satisfied.
If we try to do the same using a proper function…

```rust
fn main() {
    // A variable defined in the main scope
    let x = 4;

    // We try to define equal_to_x as a function
    // using x from the main scope in its body
    fn equal_to_x(z: usize) -> bool {
        z == x
    }

    let y = 4;

    assert!(equal_to_x(y));
}
```

…we get a compiler error:

```
error[E0434]: can't capture dynamic environment in a fn item
 --> src/main.rs:8:14
  |
8 |         z == x
  |              ^
  |
  = help: use the || { ... } closure form instead
```

Unsurprisingly, the compiler has seen what we were trying to do and suggested that we use a closure!

How does this fit into Rust’s ownership system?
The capturing mechanism follows the same rules. A variable in the environment can be captured:

• as an immutable reference;
• as a mutable reference;
• by value, using move semantics.

For each of these cases, there exists a corresponding trait in the standard library, as we anticipated:

• Fn is implemented by closures that can be called multiple times without changing the captured variables (they don’t consume the captured variables with move semantics and they don’t require them to be mutable);
• FnMut is implemented by closures that can be called multiple times with the possibility of mutating the captured variables (they don’t consume the captured variables with move semantics but they might require them to be mutable);
• FnOnce is implemented by closures that might take some of their captured variables by value, thus consuming them due to move semantics (that’s why it’s called FnOnce: if there is a move involved, you can only call them once!).

We don’t have to implement these traits manually for our closures: Rust infers them automatically based on the way we are manipulating our captured variables.

Let’s see a couple of examples:

• Fn:

```rust
fn main() {
    // A variable defined in the main scope
    let x = vec![1, 2, 3];
    // Let's create an exact copy
    let y = x.clone();

    // equal_to_x implements Fn:
    // we are only capturing an immutable reference
    // to a variable from the environment, x
    let equal_to_x = |z| z == x;

    // We can call it multiple times...
    equal_to_x(vec![0, 2]);
    equal_to_x(vec![0, 1, 3]);
    equal_to_x(vec![0, 100]);

    // ...and the captured variable is unchanged
    assert_eq!(x, y);
}
```
• FnMut:

```rust
fn main() {
    // A mutable variable defined in the main scope
    let mut x = vec![1, 2, 3];
    // Let's create an exact copy
    let y = x.clone();

    // push_to_x implements FnMut:
    // we are capturing a mutable reference
    // to a variable from the environment, x
    let mut push_to_x = |z| x.push(z);

    // We can call it multiple times...
    push_to_x(4);
    push_to_x(5);
    push_to_x(6);

    // ...and the captured variable is changed accordingly
    assert_ne!(x, y);
    assert_eq!(x, vec![1, 2, 3, 4, 5, 6]);
}
```
• FnOnce:

```rust
fn main() {
    // A variable defined in the main scope
    let x = vec![1, 2, 3];

    // move_x implements FnOnce:
    // we are taking ownership of a variable
    // from the environment, x, using the move
    // keyword
    let move_x = move || x;

    // We can only call it once
    move_x();
}
```

If we try to call move_x twice we get a compiler error:

```
error[E0382]: use of moved value: move_x
  --> src/main.rs:13:5
   |
12 |     move_x();
   |     ------ value moved here
13 |     move_x();
   |     ^^^^^^ value used here after move
   |
note: closure cannot be invoked more than once because
it moves the variable x out of its environment
  --> src/main.rs:9:26
   |
9  |     let move_x = move || x;
   |                          ^
```

One important detail before moving on - there is a clear hierarchy between these traits:

• all closures that implement Fn automatically implement FnMut and FnOnce;
• all closures that implement FnMut automatically implement FnOnce.

All closures implement FnOnce - if you can’t call them at least once, why would you even define them? ¯\_(ツ)_/¯
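A minimal sketch of this hierarchy in action (the helper name call_once is mine, not from the standard library): a function that only asks for FnOnce happily accepts a closure that implements Fn, because every Fn is also an FnMut and an FnOnce:

```rust
// call_once only requires the weakest bound, FnOnce
fn call_once<F: FnOnce() -> i32>(f: F) -> i32 {
    f()
}

fn main() {
    let x = 1;
    // This closure only reads x, so it implements Fn...
    let add_one = || x + 1;
    // ...and can therefore be passed wherever FnOnce is expected
    assert_eq!(call_once(add_one), 2);
}
```

The reverse does not hold: a function requiring Fn would reject a closure that only implements FnOnce.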

More details on closures can be found in the relevant section in the book.

### The map method

If you think that was a deluge of information on function traits… well, you are right, it was (and we are not finished yet… 👀). As it always goes, everything will become clearer with practice, once put in the proper context.

Let’s see how all of this applies to our integration problem.
This is the line where we call our integrand function as a closure to compute its values on our point sequence:

```rust
let f_values = points.map(
    |x| x.sin()
);
```

The map method calls the provided closure on each element of the array (points). It sounds sensible to say that we can’t use FnOnce as trait bound (unless we restrict ourselves to work with arrays containing a single element - a curious constraint).
With respect to captured variables, I would say that we don’t care too much whether they are being taken as mutable or immutable references: as long as we can call this closure multiple times, we are good. FnMut seems to be the most reasonable choice at this point.[3]

How do we actually use FnMut in a function signature?
Well, it turns out that you can’t write down the type of a closure for reasons (if you are curious, take a look at the deep dive in this article or at Rust’s language reference): you have to use a generic type parameter and constrain it using a trait bound (Fn, FnMut and FnOnce are our friends here!)

Let’s give it a go with our rectangle_rule:

```rust
// We take as inputs the variables of our problem:
// - n, the number of sub-intervals,
// - a, the left end of the interval
// - b, the right end of the interval
// - f, our integrand function, as a closure
// To express f as an input parameter we introduce a type parameter, F,
// and we constrain F to implement the FnMut trait.
fn rectangle_rule<F>(n: u32, a: f32, b: f32, f: F) -> f32
where
    F: FnMut
{
    let points = midpoints(n, a, b);
    let f_values = points.map(f);
    let length = (b - a) / (n as f32);
    let integral_estimate = f_values.sum() * length;
    integral_estimate
}
```

Will it work?

Nope:

```
error[E0658]: the precise format of Fn-family traits'
type parameters is subject to change.
Use parenthetical notation (Fn(Foo, Bar) -> Baz) instead (see issue #29625)
  --> src/main.rs:28:8
   |
28 |     F: FnMut
   |        ^^^^^

error[E0107]: wrong number of type arguments: expected 1, found 0
  --> src/main.rs:28:8
   |
28 |     F: FnMut
   |        ^^^^^ expected 1 type argument

error: aborting due to 2 previous errors
```

If we put ourselves in the compiler’s shoes, its complaint makes perfect sense.
Yes, we are passing in a closure.
Yes, it can be called multiple times because it implements FnMut.
But don’t you feel we are missing some key information? What parameters does it take as input? What does it return as output? The usual stuff, the basics, the stuff we forgot about while we were lost in our trait side story.

We can fix it real quick: we can use the parenthetical notation, as suggested by the compiler. In other words:

```rust
fn rectangle_rule<F>(n: u32, a: f32, b: f32, f: F) -> f32
where
    // We just have to spell out inputs and outputs,
    // with the same notation used for functions
    F: FnMut(&f32) -> f32
```

It compiles 🎉

The full code of our example looks like this now:

```rust
use ndarray::Array1;

fn midpoints(n: u32, a: f32, b: f32) -> Array1<f32> {
    // Quick check to ensure our inputs are valid
    assert!(a < b);
    assert!(n > 0);
    let length = (b - a) / (n as f32);
    let start = a + 0.5 * length;
    let end = b + 0.5 * length;
    Array1::range(start, end, length)
}

// We take as inputs the variables of our problem:
// - n, the number of sub-intervals,
// - a, the left end of the interval
// - b, the right end of the interval
// - f, our integrand function, as a closure
fn rectangle_rule<F>(n: u32, a: f32, b: f32, f: F) -> f32
where
    F: FnMut(&f32) -> f32
{
    let points = midpoints(n, a, b);
    let f_values = points.map(f);
    let length = (b - a) / (n as f32);
    let integral_estimate = f_values.sum() * length;
    integral_estimate
}

fn main() {
    let integral_estimate = rectangle_rule(129, 0., 1., |x| x.sin());
    println!("Estimate: {:?}.", integral_estimate);
}
```

It took a bit of study to get here, but the output looks quite readable - good job!

### Correctness

It’s time to answer the first question about our implementation: is it correct?

The rectangle rule gives us an estimate of the true integral, but it can be proved that if the integrand $f$ is a $\mathcal{C}^2([a, b])$ function (twice differentiable, with all its derivatives up to the second order continuous) then
$$\left| \text{Err}\left(\int_a^b{f}\right) \right| \leq \frac{b-a}{24}L_n^2 \max_{x\in[a,b]}{\left|f''(x)\right|}$$
A rigorous test suite should check that this estimate holds for a variety of functions, taking into account numerical errors due to floating point arithmetics.
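To make this bound concrete (back-of-the-envelope numbers of mine, not from the text): for $f(x) = \sin(x)$ over $[0, 1]$ we have $f''(x) = -\sin(x)$, hence $\max_{x\in[0,1]}|f''(x)| = \sin(1) \approx 0.84$. With $n = 513$, $L_n \approx 1.95 \cdot 10^{-3}$ and

$$\left| \text{Err} \right| \leq \frac{1}{24}\,\left(1.95\cdot 10^{-3}\right)^2 \cdot 0.84 \approx 1.3\cdot 10^{-7}$$

so asking for agreement up to the sixth decimal digit is a reasonable expectation at this $n$.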

For our purposes, we’ll run rectangle_rule on a couple of smooth functions and check that the result is reasonably close to the analytical integral, for the chosen n.
Given that we are dealing with floating-point numbers, we will take advantage of the approx crate: we can’t assert strict equality between the estimate and the expected result - we need to state how close we require them to be using one of its useful macros, assert_abs_diff_eq.

Let’s see a couple of examples:

```rust
// Import the PI constant from the standard library
use std::f32::consts::PI;
// Machine precision constant for f32
use std::f32::EPSILON;
use approx::assert_abs_diff_eq;

// [...]

fn main() {
    let n = 513;

    // The integral of the sine function over its period should be zero
    let expected_sin = 0.;
    let estimate_sin = rectangle_rule(n, -PI, PI, |x| x.sin());
    // We require the estimated integral and the expected integral
    // to agree up to the sixth decimal digit
    assert_abs_diff_eq!(expected_sin, estimate_sin, epsilon = 1e-6);

    // The antiderivative of the constant function 1 is x:
    // over [0, 1] the integral is thus 1.
    let expected_constant = 1.;
    let estimate_constant = rectangle_rule(n, 0., 1., |_| 1.);
    // The rectangle rule should be exact for constants.
    // Thus we require the estimated integral and the expected integral
    // to agree up to machine precision
    assert_abs_diff_eq!(expected_constant, estimate_constant, epsilon = EPSILON);
}
```

### Performance

We are now reasonably convinced of the correctness of our implementation. How fast is it?

Let’s run a microbenchmark.
We want to see how long it takes to compute an estimate of

$$\int_0^1{\sin(x)\,\text{d}x}$$

using the rectangle rule, with $n=513$.

For Rust, we use criterion.[4]
For Python, we use %timeit rectangle_rule(513, 0, 1, np.sin) in a Jupyter notebook - the code is:

```python
import numpy as np

# We assume f is a vectorized function
# Almost all the execution time is spent in NumPy's routines
def rectangle_rule(n, a, b, f):
    length = (b - a) / n
    points = np.arange(a + length/2, b + length/2, length, dtype=np.float32)
    return f(points).sum() * length
```

The Python version runs in ~9.3 microseconds.
The Rust version takes ~2.2 microseconds - roughly 4 times faster.

Well, it feels good.
What happens if we increase the number of sub-intervals?
Let’s say $n=50000$.

The Python version runs in 224 ± 7 microseconds, while the Rust version takes 231 ± 6 microseconds: we lost most of our edge - it seems that the overhead of calling NumPy’s C code was significant for small arrays, but not so much for bigger ones.

Is there anything we can do to make our code run faster?

Let’s have a second look at our function:

```rust
fn rectangle_rule<F>(n: u32, a: f32, b: f32, f: F) -> f32
where
    F: FnMut(&f32) -> f32
{
    let points = midpoints(n, a, b);
    // map applies f to all elements in points,
    // then allocates a new array to hold the results
    let f_values = points.map(f);
    let length = (b - a) / (n as f32);
    let integral_estimate = f_values.sum() * length;
    integral_estimate
}
```
13

map is very cool, but it’s creating a whole new array to hold the results of f applied to points’ elements.
Let’s have a look at a couple more methods provided by ndarray:

• map_mut takes a mutable reference to each element in the array (type: &mut f32) and it creates a new array to hold the results;
• mapv takes each element in the array by value (type: f32) and it creates a new array to hold the results;
• mapv_into takes each element in the array by value (type: f32) and it uses the original array to store the results, overriding the inputs. No additional memory is allocated.

mapv_into comes with an additional restriction: the output type of the closure has to be the same as its input type, to be sure that we can reuse the input memory slot to store the result of the computation (e.g. an f64 does not fit into the 32 bits of an f32, as you can imagine).

We satisfy both requirements of mapv_into:

• our closure takes an f32 value as input and it returns an f32 value;
• computing f_values is all we need points for.

We can thus use mapv_into instead of map: points is a temporary variable which is never returned or seen by the caller of rectangle_rule - we can override its elements and avoid performing any additional memory allocation.
The new code looks like this:

```rust
fn rectangle_rule<F>(n: u32, a: f32, b: f32, f: F) -> f32
where
    // The input is now f32, not &f32
    F: FnMut(f32) -> f32
{
    let points = midpoints(n, a, b);
    // points is consumed and its memory is reused to hold f_values
    let f_values = points.mapv_into(f);
    let length = (b - a) / (n as f32);
    let integral_estimate = f_values.sum() * length;
    integral_estimate
}
```
13

How does it affect our performance?
The difference for $n=513$ is almost negligible, but for $n=50000$ we see a substantial speed-up:

The plot is a courtesy of criterion - it shows the estimated probability distribution (pdf, in the legend) of running times of our previous version (in red, using map) compared to the new version (in blue, using mapv_into).
The average running time is now 205.71 ± 5.5 microseconds, slightly better than NumPy’s 224 ± 7 microseconds 🚀

NumPy and SciPy are extremely valuable: they provide a high quality point of reference that can be used to check both correctness and performance of alternative (Rust) implementations.
They push us to do better - that’s precious.

## Wrapping up

We started with a very simple task (the rectangle rule for integrals) and we managed to learn something new at every road block:

• we started to touch ndarray with Array1, using a couple of constructors and a few methods;
• map has been our gateway drug into the world of closures and function traits (definitely a tricky corner of the language, you should feel a bit of pride at this point!);
• benchmarking against NumPy has forced us to take a closer look at our code to identify opportunities where we could have been smarter in managing our resources - good training for our optimization eye!

It feels as if we are constantly breaking new ground, but we are slowly cementing the road behind us. In this episode we took advantage of everything we introduced in the previous ones: generic functions, trait bounds, ownership - you name it.
Practice makes perfect: the more Rust we write, the more familiar these concepts will become.

In the next episode we will get serious with ndarray: we’ll uncover the true nature of Array1, its connection to ArrayBase. We will thus discover how dimensions and ownership are expressed in ndarray.

Notes, suggestions and feedback to improve this introduction are more than welcome: you can reach me on Twitter at @algo_luca.


1. The last time this happened was when I was reading Neural Ordinary Differential Equations. If you have even a mild interest in neural networks, check it out - it’s one of the most peculiar papers I have stumbled upon in the area in the last couple of years.
2. If you are itching to flex your Rust skills, try to port SciPy’s integration routines to Rust! ndarray-odeint focuses on solving ordinary differential equations, but a standalone crate dedicated to numerical integration is still missing.
3. You will not be surprised to find out that map uses exactly the same constraint on its input closure - you can check it out in ndarray’s documentation, here.
4. You can look at the code used for benchmarking here.