The Rust programming language is rapidly gaining popularity for building reliable and secure systems, thanks to its security guarantees and outstanding performance. To provide extra functionality, the Rust compiler introduces Rust unstable features (RUF) that extend the compiler, the syntax, and standard library support. However, these features are unstable and may be removed, causing compilation failures in dependent packages. Even worse, their impact propagates through transitive dependencies, causing large-scale failures across the whole ecosystem. Although RUF is widely used in Rust, previous research has concentrated primarily on Rust code safety, and the usage and impact of RUF from the Rust compiler remain unexplored.
Therefore, we aim to bridge this gap by systematically analyzing RUF usage and impact in the Rust ecosystem. We propose novel techniques for extracting RUF precisely, and we accurately resolve package dependencies to assess RUF impact on the entire ecosystem quantitatively. We have analyzed the whole Rust ecosystem, covering 590K package versions and 140M transitive dependencies. Our study shows that the Rust ecosystem uses 1000 different RUF, and that at most 44% of package versions are affected by RUF, with compilation failures in at most 12%. To mitigate this wide impact, we further design and implement a RUF compilation failure recovery tool that can recover up to 90% of the failures. We believe our techniques, findings, and tools can help to stabilize the Rust compiler, ultimately enhancing the security and reliability of the Rust ecosystem.
On this website, we describe how we use our tools to collect, generate, and analyze data to support our RUF research. We also provide complete data behind the paper, give guidance to reproduce the data, and validate our tools.
To investigate RUF (Rust Unstable Features) usage and impact in the Rust ecosystem, we build tools and conduct an empirical analysis step by step, in three parts. Each part corresponds to its own tools and RQs (Research Questions), which drive our research topics.
RUF Lifetime: We extract and track all RUF that the Rust compiler supports to reveal RUF status changes over time, especially abnormal RUF lifetime evolution.
RUF Usage: We extract and analyze RUF usage in Rust packages. Using the RUF status (lifetime) data, we can understand how packages use RUF with different statuses.
RUF Impact: We accurately and efficiently resolve dependencies in the Rust ecosystem. With RUF Usage data, we can determine the RUF impacts across the ecosystem.
An additional part, "RUF Impact Mitigation", covers the tools we implement to help packages recover from compilation failures.
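To make the RUF Usage part more concrete, the following is a minimal sketch of how enabled features could be pulled from a source file and matched against status data. It is a simplified illustration and not the extraction technique used in the paper: it only scans for plain `#![feature(...)]` attributes with a regular expression and ignores conditional forms such as `cfg_attr`, comments, and macros. The function names, the `RUF_STATUS` table, and its entries are assumptions made for this example.

```python
import re

# Matches crate-level attributes such as `#![feature(box_syntax, never_type)]`.
# Assumption: conditional forms like `#![cfg_attr(..., feature(...))]` are ignored.
FEATURE_ATTR = re.compile(r"#!\[\s*feature\s*\(([^)]*)\)\s*\]")

# Illustrative status table (hypothetical); real statuses would come from
# the RUF lifetime data extracted from the compiler.
RUF_STATUS = {
    "never_type": "active",
    "box_syntax": "removed",
    "try_from": "accepted",
}


def extract_ruf(source):
    """Return the set of feature names enabled via `#![feature(...)]`."""
    features = set()
    for match in FEATURE_ATTR.finditer(source):
        for name in match.group(1).split(","):
            name = name.strip()
            if name:
                features.add(name)
    return features


def classify(features, status_table):
    """Map each feature name to its status, or "unknown" if it is not listed."""
    return {name: status_table.get(name, "unknown") for name in sorted(features)}


example = """
#![feature(box_syntax, never_type)]
#![feature(try_from)]
fn main() {}
"""

usage = classify(extract_ruf(example), RUF_STATUS)
```

In this sketch, a package whose `usage` contains a feature marked "removed" would be a candidate for compilation failure on newer compilers, which is the kind of signal the RUF Impact part then propagates through dependencies.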
Our data are generated through roughly three steps:
Ecosystem metadata: We obtain Rust ecosystem metadata from the official Rust database.
Ecosystem raw data: The RUF extractor and ecosystem dependency generator tools will resolve metadata and produce raw data that drives our research topics.
Research results: To derive the RUF findings and answer the RQs, we further analyze the raw data with our scripts; the results are presented in the paper.
The results from each step can be found on this website for further research and validation. We provide tools, documents and build scripts to help reproduce the results. We also systematically evaluate our ecosystem dependency graph generator and RUF impact mitigation tool.
We first give the detailed research results presented in the RQ sections of our paper, which are the output of Step 3. See the "Research Results" page for details.
After that, we introduce the tools that generate the ecosystem raw data, and validate them to make sure they work properly and accurately. To help reproduce the results, we provide the Rust ecosystem metadata (Step 1) and the ecosystem raw data (Step 2). We also provide the analysis scripts that generate our research results from the raw data. See the "Tools and Validation" page for details.
Beyond the RUF study, our tools, scripts, and data can be extended for further research topics and applications.
Vulnerability Detection and Propagation: We have collected vulnerability metadata and linked it to the Rust Ecosystem Graph to identify vulnerability impact at the ecosystem level. We found that, due to the centralization of the Rust ecosystem, the vulnerability impact becomes very large. The related tools are included in our source code.
Ecosystem Analysis in Other Dimensions: We further examine super-spreaders among maintainers and packages. We found that some maintainers and packages can impact a wide variety of packages in the ecosystem. Since recent news suggests that the Rust teams are not entirely stable, the super-spreader maintainers (many of whom are from the Rust team) may be a "weak point" of the ecosystem. We also find that some packages have not been updated for years but still receive millions of downloads each year. Our ecosystem-level analyzer and database make it easy to obtain similar findings.
Dependency Conflict: During the Ecosystem Dependency Graph generation process, we discovered that many packages suffer from dependency resolution failures due to yanked packages or improper dependency configurations. We have analyzed some of them manually and found that the dependency conflicts can be recovered by modifying the dependency configurations. Moreover, we discovered some unstable dependency configurations that can easily cause packages to fail dependency resolution. Further dependency resolution findings are under research.
This extensibility demonstrates the significance of our proposed techniques, and we hope researchers can benefit from our open-source data and tools to analyze the Rust ecosystem and the Rust compiler.
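The ecosystem-level analyses listed above all rely on the same basic idea: starting from a set of affected packages (for example, packages with a known vulnerability or a removed RUF), follow the dependency graph backwards to find everything that transitively depends on them. The following is a minimal sketch of that propagation using a breadth-first search. It is an illustration under simplifying assumptions, not the implementation in the source code: the graph is a plain dictionary of direct dependencies, versions and optional dependencies are ignored, and all package and function names are hypothetical.

```python
from collections import deque


def reverse_graph(dependencies):
    """Invert {package: [direct dependencies]} into {package: {direct dependents}}."""
    dependents = {}
    for package, deps in dependencies.items():
        dependents.setdefault(package, set())
        for dep in deps:
            dependents.setdefault(dep, set()).add(package)
    return dependents


def affected_by(sources, dependencies):
    """Return every package in `sources` plus all packages that transitively depend on one."""
    dependents = reverse_graph(dependencies)
    seen = set(sources)
    queue = deque(sources)
    while queue:
        current = queue.popleft()
        for parent in dependents.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


# Hypothetical toy ecosystem: package -> direct dependencies.
deps = {
    "app": ["web", "log"],
    "web": ["parser"],
    "cli": ["log"],
    "parser": [],
    "log": [],
}

impacted = affected_by({"parser"}, deps)
```

Here `impacted` contains `parser`, `web`, and `app`, while `cli` and `log` are unaffected. Counting the size of the affected set per package gives a simple measure of how widely a single package can spread impact, which is the intuition behind the super-spreader analysis.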
We put all data, code, Docker images, and other resources used in our research here. They will all be made public to the community after the paper is accepted.
Data: Download here.
Tools (Source Code): See here to view the source code. You can also download it here.
Docker Image: The Docker image can be built from the Dockerfile in the source code, and the run configurations can also be found in the documentation there.