Third-party libraries with rich functionality facilitate the rapid development of JavaScript software, which has led to the explosive growth of the NPM ecosystem. However, they also bring new security threats, because vulnerabilities can be introduced through dependencies on third-party libraries. In particular, these threats can be greatly amplified by transitive dependencies. Existing research either considers only direct dependencies or reasons about transitive dependencies using reachability analysis, which neglects the NPM-specific dependency resolution rules applied during real installation and therefore yields wrongly resolved dependencies. Consequently, finer-grained analyses, such as precise vulnerability propagation and its evolution over time in dependencies, cannot be carried out accurately at a large scale, nor can ecosystem-wide solutions for vulnerabilities in dependencies be derived.
To fill this gap, we propose a knowledge graph-based dependency resolution approach, which resolves the inner dependency relations of dependencies as trees (i.e., dependency trees), and we investigate the security threats from vulnerabilities in dependency trees at a large scale. Specifically, we first construct a complete dependency-vulnerability knowledge graph (DVGraph) that captures the whole NPM ecosystem (over 10 million library versions and 60 million well-resolved dependency relations). Based on DVGraph, we propose a novel algorithm (DTResolver) that statically and precisely resolves the dependency tree, as well as the transitive vulnerability propagation paths, for each package by taking the official dependency resolution rules into account. Using the resolved trees, we carry out an ecosystem-wide empirical study on vulnerability propagation and its evolution in dependency trees. Based on the findings of this study, we discuss the lessons learned and the solutions available to different stakeholders for mitigating the impact of vulnerabilities in NPM. For example, we implement a dependency tree-based vulnerability remediation method (DTReme) for NPM packages, which achieves much better performance than the official tool (npm audit fix).
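To make the notion of resolving a dependency "for a given installation time" concrete, the following is a minimal sketch of time-aware version selection. It is not DTResolver itself: the function names (`parse`, `satisfies`, `resolve`) and the release data are hypothetical, and the constraint parser covers only a small subset of NPM range syntax (`*`, exact versions, and three-part `^` and `~` ranges), ignoring pre-release tags, dist-tags, and compound ranges.

```python
from datetime import date


def parse(version):
    """Turn 'x.y.z' into a comparable tuple of ints (no pre-release support)."""
    return tuple(int(part) for part in version.split("."))


def satisfies(version, constraint):
    """Check a version against a simplified subset of NPM range syntax."""
    v = parse(version)
    if constraint == "*":
        return True
    if constraint.startswith("^"):
        base = parse(constraint[1:])
        # Caret: the left-most non-zero component must not change.
        if base[0] > 0:
            upper = (base[0] + 1, 0, 0)
        elif base[1] > 0:
            upper = (0, base[1] + 1, 0)
        else:
            upper = (0, 0, base[2] + 1)
        return base <= v < upper
    if constraint.startswith("~"):
        base = parse(constraint[1:])
        # Tilde (three-part form): only patch-level changes are allowed.
        return base <= v < (base[0], base[1] + 1, 0)
    return v == parse(constraint)


def resolve(releases, constraint, install_date):
    """Pick the highest satisfying version already released on install_date.

    releases maps version strings to release dates (hypothetical data shape).
    Returns None when no released version satisfies the constraint.
    """
    candidates = [
        version
        for version, released in releases.items()
        if released <= install_date and satisfies(version, constraint)
    ]
    return max(candidates, key=parse, default=None)


# Hypothetical release history of one library.
releases = {
    "1.2.0": date(2020, 1, 1),
    "1.3.0": date(2020, 6, 1),
    "2.0.0": date(2021, 1, 1),
}
early = resolve(releases, "^1.2.0", date(2020, 3, 1))
late = resolve(releases, "^1.2.0", date(2021, 6, 1))
```

The same constraint resolves to different versions depending on the installation date, which is why a dependency tree resolved without the time dimension can differ from what a real installation would have produced.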
DVGraph: We design and construct a complete and precise DVGraph for the whole NPM ecosystem by leveraging a robust dependency constraint parser. The construction and maintenance pipelines took 20 person-months.
DTResolver: We propose a novel algorithm (DTResolver) based on DVGraph to statically and precisely resolve the dependency trees for any installation time with high accuracy (over 90%), which is validated by around 100k representative packages.
Empirical study: We conduct the first large-scale empirical study based on over 50 million resolved dependency trees (calculated on an 8-core machine for one month) to examine vulnerability propagation and its evolution over time, and we report the resulting findings.
Implications and applicable solutions: We provide an in-depth discussion, including lessons learned and actionable solutions, that offers different stakeholders insights for improving the security of the whole NPM ecosystem. One example is the proposed remediation method (DTReme), which excludes more vulnerabilities than the official tool (npm audit fix).
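As an illustration of what a transitive vulnerability propagation path in a dependency tree looks like, the sketch below enumerates every path from a root package to a vulnerable library version. The tree layout (nested dictionaries with `name`, `version`, and `deps` keys), the function name `vulnerable_paths`, and the package names are assumptions made for this example, not the data model used by DVGraph or DTResolver.

```python
def vulnerable_paths(tree, vulnerable, prefix=()):
    """Return all root-to-node paths that end at a vulnerable library version.

    tree:       {"name": str, "version": str, "deps": [subtrees]} (assumed shape)
    vulnerable: set of (name, version) pairs known to be affected
    """
    node = (tree["name"], tree["version"])
    path = prefix + (node,)
    paths = []
    if node in vulnerable:
        paths.append(path)
    # Keep descending: a vulnerable node may sit below another vulnerable node.
    for child in tree.get("deps", []):
        paths.extend(vulnerable_paths(child, vulnerable, path))
    return paths


# Hypothetical dependency tree: the same vulnerable version is reached twice,
# once through a direct dependency and once through a transitive one.
tree = {
    "name": "app", "version": "1.0.0", "deps": [
        {"name": "a", "version": "2.1.0", "deps": [
            {"name": "vuln-lib", "version": "0.4.1", "deps": []},
        ]},
        {"name": "vuln-lib", "version": "0.4.1", "deps": []},
        {"name": "b", "version": "3.0.0", "deps": []},
    ],
}
paths = vulnerable_paths(tree, {("vuln-lib", "0.4.1")})
```

Each returned path lists the chain of dependencies through which the vulnerability reaches the root, so a single vulnerable version can contribute several paths to one dependency tree.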
RQ1: Vulnerability Propagation via Dependency Trees
RQ1.1: How many packages are affected by existing known vulnerabilities in the NPM ecosystem?
RQ1.2: How do vulnerabilities propagate to affect root packages via dependency trees?
RQ2: Vulnerability Propagation Evolution in Dependency Trees
RQ2.1: How does known vulnerability propagation evolve over time?
RQ2.2: How long do vulnerabilities live in dependency trees?
RQ2.3: Why is there still a considerable portion of CVEs not removed?
RQ2.4: Example of remediation by avoiding vulnerability introduction.
PRQ1: Dependency Tree Complexity
PRQ1.1: How large are dependency trees, especially transitive dependencies?
PRQ1.2: What does NPM dependency resolution bring to dependency trees?
PRQ2: Dependency Tree Changes
PRQ2.1: How frequently does the dependency tree change imperceptibly?
PRQ2.2: How does the dependency tree change over time?
PRQ3: Pre-study on measures taken by library maintainers when encountering vulnerabilities.
DVGraph
RESTful APIs for DTResolver and DTReme