It’s time to port the Python LOESS code to Rust.
Over five years ago, counting from this writing, I published my most successful article here on Medium. That article grew from the need to filter a particularly noisy sensor’s data from a telematics data stream. Concretely, it was a torque sensor connected to a truck’s drive shaft and the noise needed to go. LOESS was the answer, hence that article.
By then, I was neck-deep in Python, and the project required Spark, so implementing the algorithm in Python was a no-brainer. Times change, though, and now I use Rust more frequently and decided to have a go at translating the old code. This article describes the porting process and my choices when rewriting the code. You should read the original article and the reference material to learn more about the algorithm. Here, we will focus on the intricacies of writing matrix code in Rust, replacing the earlier NumPy implementation as closely as possible.
Rust Numerical Computing
Being a firm believer in not reinventing the wheel, I searched for the recommended Rust crates to replace my use of NumPy in the original Python code, and it didn’t take long to find nalgebra.
nalgebra is meant to be a general-purpose, low-dimensional, linear algebra library, with an optimized set of tools for computer graphics and physics.
Although we will not do any physics or computer graphics, we fit the low dimensionality requirement like a glove.
Differences
When converting the Python code to Rust, I met some difficulties that took me a while to sort out. When using NumPy in Python, we use all the features that both language and library provide to improve the code’s expressiveness and readability. Rust is more verbose than Python, and, at the time of this writing (version 0.33.0), the nalgebra crate still lacks some features that would improve its expressiveness. Terseness is a challenge.
My first hurdle was indexing arrays using other arrays. With NumPy, you can index an array using another array of integers or booleans. In the first case, each element of the indexing array is an index into the source array, and the indexer may have a dimension equal to or smaller than the data array. In the case of boolean indexing, the indexer must have the same size as the data, and each element must state whether to include the corresponding data element. This feature is handy when using boolean expressions to select data.
Handy as it is, I used this feature throughout the Python code:
# Python
xx = self.n_xx[min_range]
Here, the min_range variable is an integer array containing the subset of indices to retrieve from the self.n_xx array.
Try as I might, I could not find a solution in the Rust crate that mimics the NumPy indexing, so I had to implement one. After a couple of tries and benchmarks, I reached the final version. This solution was straightforward and effective.
// Rust
use nalgebra::DVector;

// Returns a new vector holding values[i] for each i in indices.
fn select_indices(values: &DVector<f64>,
                  indices: &DVector<usize>) -> DVector<f64> {
    indices.map(|i| values[i])
}
The map expression is quite simple, but using the function name is more expressive, so I replaced the Python code above with the corresponding Rust one:
// Rust
let xx = select_indices(&self.xx, min_range);
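The function above covers only the integer flavor of NumPy indexing. For comparison, here is a minimal sketch of both flavors described earlier, written against plain slices and `Vec<f64>` instead of nalgebra types so that it runs without any dependencies. The names `select_by_index` and `select_by_mask` are hypothetical and do not come from the original code.

```rust
/// Integer indexing: picks `values[i]` for every `i` in `indices`,
/// like NumPy's `values[indices]`. Panics on an out-of-bounds index.
fn select_by_index(values: &[f64], indices: &[usize]) -> Vec<f64> {
    indices.iter().map(|&i| values[i]).collect()
}

/// Boolean indexing: keeps `values[k]` wherever `mask[k]` is true,
/// like NumPy's `values[mask]`. The mask must be as long as the data.
fn select_by_mask(values: &[f64], mask: &[bool]) -> Vec<f64> {
    assert_eq!(values.len(), mask.len(), "mask and data lengths differ");
    values
        .iter()
        .zip(mask)
        .filter_map(|(&v, &keep)| if keep { Some(v) } else { None })
        .collect()
}
```

Note that the integer version may repeat or reorder elements, while the boolean version always preserves the source order.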
There is also no built-in method to create a vector from a range of integers. Although easy to do with nalgebra, the code becomes a bit long:
// Rust
let range = DVector::<usize>::from_iterator(window, 0..window);
We can avoid much of this ceremony if we fix the vector and array sizes during compilation, but we have no such luck here as the dimensions are unknown. The corresponding Python code is more terse:
# Python
np.arange(0, window)
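To illustrate the difference between sizes known at run time and sizes fixed during compilation, here is a small sketch using only the standard library rather than nalgebra. The helper names `arange` and `arange_fixed` are my own, chosen to echo the NumPy call.

```rust
/// Run-time length: the closest analogue of `np.arange(0, n)`.
fn arange(n: usize) -> Vec<usize> {
    (0..n).collect()
}

/// Compile-time length: `N` is part of the return type, so the
/// size is never passed as an argument.
fn arange_fixed<const N: usize>() -> [usize; N] {
    std::array::from_fn(|i| i)
}
```

With the fixed-size version, the compiler infers the length from the annotated type, as in `let r: [usize; 4] = arange_fixed();`. That only works when the window size is a constant, which is not the case for LOESS.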
This terseness also extends to other areas, such as when filling a matrix column by column. In Python, we can do something like this:
# Python
for i in range(1, degree + 1):
    xm[:, i] = np.power(self.n_xx[min_range], i)
As of this writing, I found no better way of doing the same thing with nalgebra than this:
// Rust
for i in 1..=degree {
    for j in 0..window {
        xm[(j, i)] = self.xx[min_range[j]].powi(i as i32);
    }
}
Maybe something hidden in the package is waiting to be discovered that will help here in terms of conciseness.
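To make the shape of this computation explicit, here is a sketch of the same fill on a flat, column-major buffer using only the standard library. It assumes column 0 holds ones, which the excerpts above do not show, and it swaps `powi` for a running product of the previous column. The function name `power_columns` is hypothetical.

```rust
/// Builds a `window x (degree + 1)` matrix whose column `i` holds `x^i`
/// for the selected `x` values, stored column-major in a flat `Vec`.
fn power_columns(xx: &[f64], min_range: &[usize], degree: usize) -> Vec<f64> {
    let window = min_range.len();
    // Assumption: column 0 is all ones (x^0), so start from a buffer of ones.
    let mut xm = vec![1.0; window * (degree + 1)];
    for i in 1..=degree {
        for j in 0..window {
            // Entry (j, i) sits at offset i * window + j in column-major order.
            // Each column is the previous column multiplied by x.
            xm[i * window + j] = xm[(i - 1) * window + j] * xx[min_range[j]];
        }
    }
    xm
}
```

The double loop remains, so this is no more concise than the nalgebra version. It only shows that each column depends on the one before it.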
Finally, I found the nalgebra documentation relatively sparse. We can expect this from a relatively young Rust crate that holds much promise for the future.
The Upside
The best comes at the end: the raw performance. I invite you to try running both versions of the same code (the GitHub repository links are below) and compare their performance. On my 2019 MacBook Pro 2.6 GHz 6-Core Intel Core i7, the release version of the Rust code runs in under 200 microseconds, while the Python code runs in under 5 milliseconds.
Conclusion
This project was another exciting and educational Python-to-Rust port of my old code. While converting from the well-known Python control structures to Rust gets easier by the day, the NumPy conversion to nalgebra was more of a challenge. The Rust package shows much promise but needs more documentation and online support. I would warmly welcome a more thorough user guide.
Rust is more ceremonious than Python but performs much better when properly used. I will keep using Python for my daily work when building prototypes and in discovery mode, but I will turn to Rust for performance and memory safety when moving to production. We can even mix and match both using crates like PyO3, so this is a win-win scenario.
Rust rocks!
References
joaofig/loess-rs: An implementation of the LOESS / LOWESS algorithm in Rust. (github.com)
joaofig/pyloess: A simple implementation of the LOESS algorithm using numpy (github.com)
Credits
I used Grammarly to review the writing and accepted several of its rewriting suggestions.
JetBrains’ AI assistant helped me write some of the code, and I also used it to learn Rust. It has become a staple of my everyday work with both Rust and Python. Unfortunately, its support for nalgebra is still limited.
João Paulo Figueira is a Data Scientist at tb.lx by Daimler Truck in Lisbon, Portugal.
LOESS in Rust was originally published in Towards Data Science on Medium, where people are continuing the conversation by highlighting and responding to this story.