Quality assurance is of great importance for deep learning (DL) systems, especially when they are applied in safety-critical applications. While quality issues of native DL applications have been extensively analyzed, the issues of JavaScript-based DL applications have never been systematically studied. Compared with native DL applications, JavaScript-based DL applications can run on major browsers, making them platform- and device-independent. The quality of a JavaScript-based DL application depends on three parts: the application itself, the third-party DL library it uses, and the underlying DL framework (e.g., TensorFlow.js). Together, we call these a JavaScript-based DL system. In this paper, we conduct the first empirical study on the quality issues of JavaScript-based DL systems. Specifically, we collect and analyze 700 real-world faults from relevant GitHub repositories, including the official TensorFlow.js repository, 13 third-party DL libraries, and 58 JavaScript-based DL applications. To better understand the characteristics of these faults, we manually analyze them and construct taxonomies of fault symptoms, root causes, and fix patterns. Moreover, we study the distributions of faults, symptoms, and root causes across the stages of the development lifecycle, the 3-level architecture of the DL system, and the 4 major components of the TensorFlow.js framework. Based on the results, we suggest actionable implications and research avenues that can potentially facilitate the development, testing, and debugging of JavaScript-based DL systems.
Overview of this work
The following figure shows the overview of our work. We first collect popular GitHub repositories by keyword search, including the official TensorFlow.js repository, third-party DL library repositories that use TensorFlow.js, and repositories of DL-based web applications built on TensorFlow.js. For each collected repository, we crawl issues that may involve fixing or discussing relevant problems and construct a candidate dataset for further analysis. We then manually filter out issues with unclear descriptions and label the remaining issues. With the labeled issues, we study 3 research questions, covering the symptoms, the root causes, and the fix patterns. For the symptom analysis, we first summarize a taxonomy of fault symptoms and then analyze the distribution of symptoms over the 6 stages involved in the development of a JavaScript-based DL system. We then conduct the root cause analysis from 3 aspects: 1) summarizing the different types of root causes, 2) analyzing the distribution of these root causes over the 3-level architecture, and 3) analyzing the distribution of these root causes over the components of TensorFlow.js. Finally, we summarize the fix patterns to characterize the solutions used to fix the faults.
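As a rough illustration of the distribution analysis described above, the sketch below cross-tabulates labeled faults by symptom and development stage. It is a minimal example under stated assumptions, not the tooling used in the study: the `LabeledFault` shape, the `crossTabulate` helper, and all label values are hypothetical placeholders, and the sample records are made up.

```typescript
// Hypothetical record for one labeled fault. The field names and the
// label values used below are illustrative, not the study's taxonomy.
interface LabeledFault {
  id: number;
  symptom: string;
  stage: string;
}

// Count how many faults with each symptom fall into each stage.
// Returns a nested map: symptom -> (stage -> count).
function crossTabulate(
  faults: LabeledFault[]
): Map<string, Map<string, number>> {
  const table = new Map<string, Map<string, number>>();
  for (const fault of faults) {
    const row = table.get(fault.symptom) ?? new Map<string, number>();
    row.set(fault.stage, (row.get(fault.stage) ?? 0) + 1);
    table.set(fault.symptom, row);
  }
  return table;
}

// Made-up placeholder data, not taken from the labeled dataset.
const sampleFaults: LabeledFault[] = [
  { id: 1, symptom: "crash", stage: "stage-A" },
  { id: 2, symptom: "crash", stage: "stage-A" },
  { id: 3, symptom: "crash", stage: "stage-B" },
  { id: 4, symptom: "poor performance", stage: "stage-B" },
];

const symptomByStage = crossTabulate(sampleFaults);
```

The same helper could be reused for the root cause analysis by keying on a root cause label and an architecture level or framework component instead of a symptom and a stage.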
Due to the page limit, more details on the labeled faults and the study results are shown here: