Boosting Rust Performance: Insights from Version 1.64
Chapter 1: Performance Enhancements in Rust 1.64
For developers working with Rust, the release of version 1.64 on September 22, 2022, was a reason to celebrate: the compiler itself became 10-20% faster on Windows, which shortens compile times. This improvement was among the key updates, alongside the stabilization of several APIs. But what led to such a remarkable enhancement?
The breakthrough came from a pull request submitted by Rémy Rakic, known as lqd (pronounced "liquid"), a French software engineer who frequently contributes to the Rust compiler. On May 12, 2022, he opened the pull request and backed it with performance test results.
His work demonstrated a consistent performance boost across various tests, with improvements reaching up to 18.92%. Out of hundreds of tests conducted, nearly all showed significant improvements, with only a couple showing minor regressions.
Such outcomes are the ultimate goal for anyone focused on performance optimization. On July 11, 2022, Jakub Beránek shared the positive news, after two months of dedicated work by lqd to fine-tune the pull request and get it through all tests. Another two months passed before the change shipped in a stable release, illustrating the substantial effort invested in achieving these results.
Section 1.1: Understanding Profile-Guided Optimization
At the heart of this improvement lies profile-guided optimization (PGO). PGO is a collection of techniques that instrument an application, observe how it actually executes, and then optimize it so that the most frequently executed code runs fastest. The Rust compiler supports PGO for the programs it builds, and the compiler itself can be built with it.
The process consists of three phases: instrumentation, training, and optimization. In the first phase, the application is enhanced with data-gathering points to track function calls and their execution order. Next, the application is executed multiple times to collect comprehensive data. Finally, various techniques are applied to identify how the code can be refined to enhance performance.
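The three phases can be mimicked in a few lines of Rust. This is only a toy model, not how rustc or LLVM implement PGO: the `Profile` type, the `classify` function, and the site names are hypothetical, and the "optimization" step is reduced to ranking code sites by how often they were hit during training.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the counters an instrumented build records.
#[derive(Default)]
struct Profile {
    counts: HashMap<&'static str, u64>,
}

impl Profile {
    /// Phase 1 (instrumentation): each probe bumps a counter for its site.
    fn hit(&mut self, site: &'static str) {
        *self.counts.entry(site).or_insert(0) += 1;
    }

    /// Input for phase 3 (optimization): sites ordered from hottest to coldest.
    fn hottest_first(&self) -> Vec<(&'static str, u64)> {
        let mut sites: Vec<_> = self.counts.iter().map(|(s, c)| (*s, *c)).collect();
        sites.sort_by(|a, b| b.1.cmp(&a.1).then(a.0.cmp(b.0)));
        sites
    }
}

/// The "application": classifies a number, with a probe in each branch.
fn classify(n: u32, profile: &mut Profile) -> &'static str {
    if n % 100 == 0 {
        profile.hit("rare_branch");
        "multiple of 100"
    } else {
        profile.hit("common_branch");
        "ordinary"
    }
}

/// Phase 2 (training): run a representative workload and collect counts.
fn train() -> Profile {
    let mut profile = Profile::default();
    for n in 1..=1000 {
        classify(n, &mut profile);
    }
    profile
}

fn main() {
    let profile = train();
    for (site, count) in profile.hottest_first() {
        println!("{site}: {count} hits");
    }
}
```

A real optimizer would use counts like these to decide, for example, which branch to lay out as the fall-through path and which calls are worth inlining.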
Subsection 1.1.1: Inlining Techniques
One notable technique within this framework is inlining. Every function call carries overhead: arguments are passed, control jumps to the callee, and then it returns. Inlining replaces a call to a small or frequently called function with the body of that function inside the caller, which removes this overhead and often exposes further optimizations. Inlining everything would bloat the binary, but with profile data the compiler can concentrate on hot call sites, keeping the growth in build size modest while still delivering substantial performance benefits. For a deeper understanding, refer to Ankit Astana's article, which provides an excellent illustrated explanation.
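Rust also lets you give the compiler inlining hints by hand. The sketch below uses the standard `#[inline]` and `#[inline(never)]` attributes. Both are hints rather than guarantees, and the function names are made up for illustration. The two variants compute the same result and differ only in whether the helper is likely to remain a real call in the generated code.

```rust
/// Small, frequently called helper: a natural inlining candidate.
#[inline]
fn square(x: u64) -> u64 {
    x * x
}

/// Same logic, but the compiler is asked to keep it as a real call.
#[inline(never)]
fn square_outlined(x: u64) -> u64 {
    x * x
}

/// Hot loop whose helper can be merged into the loop body.
fn sum_of_squares(limit: u64) -> u64 {
    (1..=limit).map(square).sum()
}

/// Hot loop that is expected to pay the call overhead on every iteration.
fn sum_of_squares_outlined(limit: u64) -> u64 {
    (1..=limit).map(square_outlined).sum()
}

fn main() {
    println!("{}", sum_of_squares(10));
    println!("{}", sum_of_squares_outlined(10));
}
```

With PGO, such annotations are less necessary: the profile tells the compiler which call sites are hot, so it can make the decision itself.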
Section 1.2: Machine Code Layout Optimization
To grasp this optimization technique, it's essential to recognize that Rust code compiles down to machine code, the instructions the processor executes directly (assembly is its human-readable form). The processor fetches instructions sequentially. When a stretch of code contains no control-flow instructions such as jumps or branches, the processor can fetch and process several instructions at once, maximizing execution speed.
Although it is challenging to consistently write code with an awareness of "hot" and "cold" paths (those that are executed frequently or infrequently), profile-guided optimization automates this work. Using the collected profile, the compiler reorders the machine code so that hot paths sit next to each other and cold paths are moved out of the way. The result is longer straight-line runs of instructions and improved Rust performance. For more details, check out Sergey Slotin's informative article.
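Without a profile, the closest manual equivalent in Rust is the standard `#[cold]` attribute, which hints that a function is rarely called so the compiler can keep it off the hot path. The following is a minimal sketch, and the `report_invalid` and `process` functions are hypothetical examples, not code from the compiler.

```rust
/// Rarely taken error path; `#[cold]` hints that calls to it are unlikely.
#[cold]
#[inline(never)]
fn report_invalid(value: i64) -> i64 {
    eprintln!("invalid input: {value}");
    0
}

/// Hot path: the common case falls straight through to the multiplication.
fn process(value: i64) -> i64 {
    if value < 0 {
        return report_invalid(value);
    }
    value * 2
}

fn main() {
    let total: i64 = [1, 2, 3, -1, 4].iter().map(|&v| process(v)).sum();
    println!("{total}");
}
```

PGO reaches a similar layout decision from measured branch frequencies instead of programmer annotations.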
Chapter 2: Register Allocation and Its Impact
Another pivotal optimization technique relates to register allocation. Registers are the fastest data storage locations, which the processor can typically access within a single clock cycle. However, the number of registers is small: x86-64, for example, has 16 general-purpose registers.
Effective register allocation involves complex algorithms that determine which data should reside in registers and when. By optimizing this allocation process, the Rust compiler can significantly enhance performance. For those interested, a comprehensive overview is available on Wikipedia.
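To give a feel for the problem, here is a heavily simplified linear-scan allocator in Rust. It is a sketch under stated assumptions, not the algorithm LLVM or rustc uses: the `Interval` and `Location` types and the variable names are hypothetical, and when no register is free it simply spills the new variable to the stack instead of choosing the best spill candidate.

```rust
/// Live range of a hypothetical variable: first and last instruction index using it.
#[derive(Debug, Clone, Copy)]
struct Interval {
    name: &'static str,
    start: u32,
    end: u32,
}

/// Where a variable ends up living.
#[derive(Debug, PartialEq)]
enum Location {
    Register(usize),
    Stack,
}

/// Simplified linear scan: visit intervals by start, reuse registers whose
/// interval has ended, and spill the new variable when no register is free.
fn allocate(mut intervals: Vec<Interval>, num_regs: usize) -> Vec<(&'static str, Location)> {
    intervals.sort_by_key(|i| i.start);
    // Reversed so that the first pop() hands out register 0.
    let mut free: Vec<usize> = (0..num_regs).rev().collect();
    let mut active: Vec<(u32, usize)> = Vec::new(); // (interval end, register)
    let mut result = Vec::new();

    for iv in intervals {
        // Release registers whose intervals ended before this one starts.
        active.retain(|&(end, reg)| {
            if end < iv.start {
                free.push(reg);
                false
            } else {
                true
            }
        });
        match free.pop() {
            Some(reg) => {
                active.push((iv.end, reg));
                result.push((iv.name, Location::Register(reg)));
            }
            None => result.push((iv.name, Location::Stack)),
        }
    }
    result
}

/// Four made-up variables with overlapping live ranges.
fn example_intervals() -> Vec<Interval> {
    vec![
        Interval { name: "a", start: 0, end: 4 },
        Interval { name: "b", start: 1, end: 2 },
        Interval { name: "c", start: 3, end: 6 },
        Interval { name: "d", start: 5, end: 7 },
    ]
}

fn main() {
    for (name, location) in allocate(example_intervals(), 2) {
        println!("{name} -> {location:?}");
    }
}
```

With two registers every variable in this example fits; with one register, `b` and `c` are spilled to the stack. Profile data helps a real allocator make the better choice of keeping the values used on hot paths in registers.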
Section 2.1: Profile-Guided Optimization on Windows
Now that you are familiar with some profile-guided optimization techniques, it's important to note that the Rust compiler supports them. However, until recently, the official builds of rustc itself were optimized with PGO only on Linux, primarily because Rust has a large user base on that platform.
This limitation meant that rustc couldn't deliver optimal performance on other operating systems, such as Windows. lqd's contribution was to enable these optimization techniques for the Windows builds of the compiler. The skill and determination required to achieve this feat are commendable, making his pull request one of the most impressive I have encountered in recent times. The well-documented commit history also offers a valuable resource for those interested in compiler development.
Section 2.2: Broader Implications of Compiler Enhancements
The implications of compiler advancements like those seen in Rust 1.64's PGO for Windows extend well beyond mere speed. PGO techniques enhance processing efficiency, leading to lower data processing costs for data centers and enabling the development of faster, more cost-effective devices that consume less power and maintain longer battery life.
In a time when energy conservation is increasingly vital, these improvements are crucial. Thank you for taking the time to read this article.
Brought to you by Tom Smykowski
Source: Merge Request