Tools and Validation

Overview

In general, our research topics are summarized in the following figure. However, the figure omits detailed implementation information for simplicity. On this website, we discuss the implementation details of our tools and the data they use. We use a decoupled design so that our tools and data can be reused for broader purposes, so the structure may be a little complicated.

RUF analysis tools, build scripts, and documentation are all available in the repository. We first discuss the structure of our tools and how they work together. Then we give guidance on how to reproduce our research results. Finally, we present the tool validation process and data sets.

Tool Structure

The following figure describes the actual implementation of our tools. Before you continue reading, make sure you know what our research focuses on and the research results we want to generate.

The figure shows how data are used by our tools and how they eventually produce research results. Data appear in the figure as plain text, while our tools appear as text in rectangles. The tool names match those in our repository. The detailed build scripts and documentation are also in the repository.

In general, we first collect various types of data from the Rust ecosystem, and extract or generate ecosystem raw data. All data are stored in a PostgreSQL database for fast querying: processing over 140M transitive dependencies takes at most 2 minutes. Because we maintain this database, it supports much more than the RUF study of the Rust ecosystem (see the extensibility discussion on the home page for more). We further write SQL scripts that process the raw data into the research results reported in the final version of our research paper.

To mitigate RUF impacts, we also developed a RUF Detector named cargo_ruf, which analyzes a given Rust project and reports useful information on its enabled RUF. Most importantly, it automatically tries to recover the package if it suffers from a compilation failure introduced by enabled RUF. For compatibility, it can also be integrated into Cargo, the official Rust package manager.

Reproduce Our Results

The figure above shows the input and output of each tool we developed in our research. The detailed configurations and build guidance are available in the GitHub repository. To make the results easier to reproduce, we provide a Docker image containing the execution environment and the database. We also provide the ecosystem raw data, so you don't have to download 100 GB of source code, maintain the complete dependency index, etc.

Get started by downloading our source code in the "Resource Download" section.

Validation

EDG Accuracy

We validate our Ecosystem Dependency Graph (EDG) accuracy by comparing it with the official dependency resolution tool Cargo Tree from Cargo-1.63.0.

Benchmark Setup:

We resolve four types of dependencies: build, common, optional, and target. Only development dependencies are omitted, as they do not affect the runtime of programs; this matches our own resolution rules.

We first download source code from the official registry Crates.io, and then use Cargo Tree to resolve dependencies in the real environment. We then compare the dependency items reported by Cargo Tree with those in our ecosystem dependency graph.
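As a rough illustration of the Cargo Tree side of this comparison, the sketch below turns Cargo Tree text output into a set of (name, version) pairs. It assumes the output was produced with `cargo tree --prefix none -e no-dev`, which prints one `name vX.Y.Z` entry per line and skips development dependencies; the exact invocation used in the evaluation is not stated here, and `parse_cargo_tree` and the sample text are hypothetical.

```python
import re

# One line of `cargo tree --prefix none` output looks like
# "serde v1.0.144", "syn v1.0.99 (*)" or "demo v0.1.0 (/tmp/demo)".
LINE = re.compile(r"^(?P<name>[A-Za-z0-9_-]+) v(?P<version>\S+)")


def parse_cargo_tree(output):
    """Return the set of (name, version) pairs found in the output."""
    deps = set()
    for line in output.splitlines():
        match = LINE.match(line.strip())
        if match:  # skips blank lines and headers like "[build-dependencies]"
            deps.add((match.group("name"), match.group("version")))
    return deps


# Hypothetical sample output for a small root package named "demo".
sample = """\
demo v0.1.0 (/tmp/demo)
serde v1.0.144
serde_derive v1.0.144 (proc-macro)
syn v1.0.99
[build-dependencies]
syn v1.0.99 (*)
"""

root = ("demo", "0.1.0")
deps = parse_cargo_tree(sample) - {root}  # the root is not its own dependency
print(sorted(deps))
```

Collecting pairs into a set removes the repeated entries that Cargo Tree marks with `(*)`, which fits an evaluation that only asks whether a package version appears in the tree.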

We treat each dependent version as a dependency and the collection of all dependent versions as a dependency tree in the evaluation process. This is because we only care about whether a specific package version impacts the root package in the dependency tree, not about how it impacts the dependency tree.

We use the four metrics shown below to represent accuracy. Tree Accuracy is the resolution accuracy of entire dependency trees. Recall is the percentage of Right dependencies in the standard dependency data set, and Precision is the percentage of Right dependencies in our resolved dependency data set. The F1-score is the harmonic mean of recall and precision, and summarizes the overall accuracy of resolution.

Accuracy Definition:

We define four types of comparison results for a given package i in the accuracy evaluation:

1) Right (R_i): Dependencies that occur in both dependency data sets with the same versions.

2) Wrong (W_i): Dependencies that occur in both dependency data sets with different versions.

3) Over (O_i): Dependencies that only occur in our resolution data set.

4) Miss (M_i): Dependencies that only occur in the standard data set.
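The four result types can be sketched as a comparison of two name-to-version mappings for one package. This is a minimal sketch that assumes each dependency name resolves to a single version in each data set (a real Rust dependency tree can contain several versions of the same package); `classify` and the sample data are hypothetical.

```python
def classify(standard, resolved):
    """Split dependencies into Right, Wrong, Over and Miss.

    `standard` maps name -> version as reported by Cargo Tree,
    `resolved` maps name -> version as resolved by our graph.
    """
    both = set(standard) & set(resolved)
    return {
        "right": {n for n in both if standard[n] == resolved[n]},
        "wrong": {n for n in both if standard[n] != resolved[n]},
        "over": set(resolved) - set(standard),
        "miss": set(standard) - set(resolved),
    }


# Hypothetical dependency sets for one package.
standard = {"serde": "1.0.144", "syn": "1.0.99", "libc": "0.2.132", "rand": "0.8.5"}
resolved = {"serde": "1.0.144", "syn": "1.0.98", "libc": "0.2.132", "log": "0.4.17"}

result = classify(standard, resolved)
for kind, names in result.items():
    print(kind, sorted(names))
```

In this example `serde` and `libc` are Right, `syn` is Wrong, `log` is Over, and `rand` is Miss.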

Additional Change on Local Configurations:

In the evaluation process, we observed that the dependency configuration behaves slightly differently when it is uploaded to the ecosystem rather than built locally. The configuration file in the source code may force developers to use a specific version of the package manager, resolver, or compiler during the local development of the built package.

Furthermore, the configuration may use local packages instead of packages from Crates.io. These settings are forbidden once the package is uploaded to Crates.io and used by other packages. They are intended for the local environment, not for other developers who want to use the functionality of the package. As a result, our evaluation process removes local configurations to stay consistent with the behavior of the Rust ecosystem.

EDG Resolution Accuracy Results:

Complete Results of EDG Generator Validation

Here we list the details of the data sets we chose and the results from Cargo Tree and EDG.

To evaluate accuracy on different data sets, we select 2000 packages from the whole ecosystem as our standard dependency benchmark data set. Due to local configurations, a small proportion of packages can't be successfully recognized by Cargo Tree, so the final package count is slightly below 2000. We use three strategies to select these packages:

1) Random: Randomly selected versions.

2) Popular: Latest versions of packages that have the most downloads.

3) Mostdep: Latest versions of packages that have the most direct dependencies, which is the most complex situation a resolver will meet.

We select the latest version of each given package because it is the version chosen as a dependency by default and it typically has the most complex dependencies.
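The three selection strategies can be sketched over in-memory records as follows. The record fields (`latest`, `downloads`, `direct_deps`), the function names, and the sample data are all hypothetical; the actual selection runs against the ecosystem data and may differ in detail.

```python
import random

# Hypothetical records: one entry per package, describing its latest version.
PACKAGES = [
    {"name": "a", "latest": "1.2.0", "downloads": 900, "direct_deps": 3},
    {"name": "b", "latest": "0.4.1", "downloads": 50, "direct_deps": 12},
    {"name": "c", "latest": "2.0.0", "downloads": 400, "direct_deps": 7},
    {"name": "d", "latest": "0.1.3", "downloads": 10, "direct_deps": 1},
]

# Hypothetical list of every published (name, version) pair.
ALL_VERSIONS = [
    ("a", "1.1.0"), ("a", "1.2.0"), ("b", "0.4.1"),
    ("c", "1.9.0"), ("c", "2.0.0"), ("d", "0.1.3"),
]


def select_random(all_versions, k, seed=0):
    """Random: sample k versions uniformly, without replacement."""
    return random.Random(seed).sample(all_versions, k)


def top_latest(packages, key, k):
    """Latest versions of the k packages with the largest `key` value."""
    ranked = sorted(packages, key=lambda p: p[key], reverse=True)
    return [(p["name"], p["latest"]) for p in ranked[:k]]


def select_popular(packages, k):
    """Popular: latest versions of the most downloaded packages."""
    return top_latest(packages, "downloads", k)


def select_mostdep(packages, k):
    """Mostdep: latest versions with the most direct dependencies."""
    return top_latest(packages, "direct_deps", k)


print(select_random(ALL_VERSIONS, 3))
print(select_popular(PACKAGES, 2))
print(select_mostdep(PACKAGES, 2))
```

A fixed seed keeps the random sample repeatable, which matters when the same benchmark set has to be resolved by both Cargo Tree and EDG.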

You can click the button for complete raw validation results. We also list the detailed validation summary below.

Download Raw Validation Data

Dataset: random

crate_count = 1981

match_count = 1969

cargotree_crates_num = 94254

pipeline_crates_num = 93422

overresolve_dep = 170

right_dep = 93201

wrong_dep = 51

missing_dep = 1008

Resolution Accuracy Summary:

Tree Accuracy = 99.394%

Precision = 99.763%

Recall = 98.930%

F1Score = 99.345%

Dataset: popular

crate_count = 1983

match_count = 1973

cargotree_crates_num = 45346

pipeline_crates_num = 44968

overresolve_dep = 1

right_dep = 44966

wrong_dep = 1

missing_dep = 378

Resolution Accuracy Summary:

Tree Accuracy = 99.496%

Precision = 99.996%

Recall = 99.166%

F1Score = 99.579%

Dataset: mostdep

crate_count = 1991

match_count = 1932

cargotree_crates_num = 391658

pipeline_crates_num = 381915

overresolve_dep = 89

right_dep = 381759

wrong_dep = 67

missing_dep = 9832

Resolution Accuracy Summary:

Tree Accuracy = 97.037%

Precision = 99.959%

Recall = 97.489%

F1Score = 98.709%
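The summary values can be recomputed from the raw counts listed above. The formulas in this sketch are inferred from those counts rather than taken from the validation scripts: Tree Accuracy is assumed to be match_count / crate_count, Precision to be right / (right + wrong + over), and Recall to be right / (right + miss). Under these assumptions the sketch yields the same percentages as the listed summaries; the function name `summarize` is hypothetical.

```python
def summarize(crate_count, match_count, right, wrong, over, miss):
    """Recompute the four accuracy metrics from raw counts (assumed formulas)."""
    precision = right / (right + wrong + over)
    recall = right / (right + miss)
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return {
        "Tree Accuracy": match_count / crate_count,
        "Precision": precision,
        "Recall": recall,
        "F1Score": f1,
    }


# Raw counts copied from the three validation summaries.
DATASETS = {
    "random": dict(crate_count=1981, match_count=1969,
                   right=93201, wrong=51, over=170, miss=1008),
    "popular": dict(crate_count=1983, match_count=1973,
                    right=44966, wrong=1, over=1, miss=378),
    "mostdep": dict(crate_count=1991, match_count=1932,
                    right=381759, wrong=67, over=89, miss=9832),
}

for name, counts in DATASETS.items():
    print(f"Dataset: {name}")
    for metric, value in summarize(**counts).items():
        print(f"  {metric} = {value * 100:.3f}%")
```

Note that, under the inferred Recall formula, Wrong dependencies lower Precision but not Recall.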

RUF Impact Mitigation Success Rate

We developed a RUF mitigation analyzer that scans the Rust ecosystem to measure the mitigation success rate of our tool (RUF Detector).

Originally, 259,540 package versions are impacted by RUF, and in theory at most 70,913 package versions suffer from compilation failure with the newest Rust compiler. With our compilation failure recovery design, over 90% (63,935/70,913) of these package versions can be recovered from compilation failure.
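The success rate follows directly from the two counts in the paragraph above. A small worked computation (the variable names are illustrative only):

```python
# Counts quoted in the text above.
may_fail = 70_913    # package versions that may fail to compile
recovered = 63_935   # package versions recovered by the mitigation

recovery_rate = recovered / may_fail
unrecovered = may_fail - recovered

print(f"recovery rate = {recovery_rate:.2%}")
print(f"unrecovered package versions = {unrecovered}")
```

This gives a recovery rate of about 90.16%, with 6,978 package versions that remain unrecovered.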

This demonstrates the effectiveness of our mitigation technique and shows that it can contribute to the reliability and usability of the Rust ecosystem. However, we must add that this is not a once-and-for-all fix. The RUF itself may contain other potential bugs and isn't supported in other Rust compiler versions. As a result, the ultimate solution to avoid RUF impacts is to stabilize RUF and the development standard of RUF. Our tool can't change the stabilization process and can only select the best recovery point to help developers as much as possible.

RUF Impact Mitigation Validation Results:

Detailed Results of RUF Impact Mitigation Validation:

mitigation_results.xlsx
