RQ1: Vulnerability Impact Analysis

RQ1 measures the extent to which the Golang ecosystem is affected by vulnerabilities in third-party libraries (TPLs).

Dataset of RQ1

We gathered the dependency relations from the BigQuery dataset and the vulnerability data from Snyk and NVD. Analyzing these datasets led to Finding-1 and Finding-2.

The Dependency Relations dataset is large: it contains 476 million entries and occupies 122 GB.

The Vulnerability Information dataset contains 1269 entries (1718 KB). Each entry records details such as the CVE-ID, reference links, affected modules, and associated repositories.

The Repository Information dataset contains 441 entries (6580 KB). It records the vulnerabilities, commits, issues, and tags associated with the repositories.

Findings

Finding-1: By the data collection date, the vulnerabilities had affected a significant number of downstream dependents (479,411, or 66.10%). Of the dependents, 62.85% have still not fixed the vulnerabilities.

This finding draws on all three datasets. The Vulnerability Information dataset identifies the vulnerable modules, the Dependency Relations dataset maps those modules to their dependents, and the Repository Information dataset determines whether each dependent is affected by the vulnerabilities.
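
The join described above can be illustrated with a minimal sketch. It is not the study's actual pipeline: the record layouts, the field names (fixed_version, the per-dependent version history), and the rule "a dependent is affected while it requires a version below the first fixed version" are all assumptions made for illustration, and the module paths and CVE label are placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vulnerability:
    cve_id: str           # placeholder identifier, not a real CVE
    module: str           # vulnerable module path
    fixed_version: tuple  # assumed field: first patched version, e.g. (1, 2, 0)


def parse_version(text):
    """Turn 'v1.2.3' into (1, 2, 3). Pseudo-versions and pre-releases are not handled."""
    return tuple(int(part) for part in text.lstrip("v").split("."))


def summarize_impact(vulns, history):
    """Return (ever_affected, still_unfixed) sets of dependents.

    history maps (dependent, module) to the versions of that module the
    dependent has required over time, oldest first (an assumed layout).
    """
    # If a module has several vulnerabilities, require the highest fixed version.
    fixed_by_module = {}
    for vuln in vulns:
        prev = fixed_by_module.get(vuln.module)
        if prev is None or vuln.fixed_version > prev:
            fixed_by_module[vuln.module] = vuln.fixed_version

    ever_affected, still_unfixed = set(), set()
    for (dependent, module), versions in history.items():
        fixed = fixed_by_module.get(module)
        if fixed is None or not versions:
            continue  # module has no known vulnerability, or no history
        if any(version < fixed for version in versions):
            ever_affected.add(dependent)
            # The latest required version decides whether the fix was adopted.
            if versions[-1] < fixed:
                still_unfixed.add(dependent)
    return ever_affected, still_unfixed
```

Under these assumptions, the share of affected dependents is len(ever_affected) divided by the number of dependents considered, and the unfixed share is computed the same way from still_unfixed.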

Finding-2: Even for vulnerabilities from 2019 and earlier, the number of affected dependents has kept increasing over the years, while the proportion of affected vulnerabilities has remained steady.

This finding also draws on all three datasets. The Vulnerability Information dataset identifies the vulnerable modules, the Dependency Relations dataset maps those modules to their dependents, and the Repository Information dataset determines whether each dependent is affected by the vulnerabilities and when it is no longer affected.
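
One way to turn per-dependent "affected from / no longer affected" dates into a yearly trend is sketched below. The record layout (DependentRecord and its fields) is hypothetical, and reading the trend as "affected dependents per year, relative to all dependents present that year" is an assumption rather than the study's stated definition.

```python
from collections import namedtuple

# Hypothetical record: years are integers; affected_from is None if the
# dependent was never affected, affected_until is None if it is still affected.
DependentRecord = namedtuple(
    "DependentRecord", ["name", "first_seen", "affected_from", "affected_until"]
)


def yearly_affected(records, years):
    """Map each year to (affected_count, present_count)."""
    trend = {}
    for year in years:
        # A dependent counts as present from the year it first appears.
        present = [r for r in records if r.first_seen <= year]
        # It is affected from affected_from up to, but excluding, affected_until.
        affected = [
            r for r in present
            if r.affected_from is not None
            and r.affected_from <= year
            and (r.affected_until is None or year < r.affected_until)
        ]
        trend[year] = (len(affected), len(present))
    return trend
```

Dividing the first number by the second for each year gives the affected share, so the absolute count and the share can be compared over time.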