One of the key tasks related to ensuring mobile app quality is the reporting, management, and resolution of bug reports. As such, researchers have committed considerable resources toward automating various aspects of the bug management process for mobile apps, such as triaging and reproduction. However, the success of these automated approaches is largely dictated by the characteristics and properties of the bug reports they operate upon. Understanding mobile app bug reports is therefore imperative to drive the continued advancement of bug report management techniques. While prior studies have examined high-level statistics of large sets of bug reports, we currently lack an in-depth investigation of how the information contained in bug reports relates to the information actually needed to reproduce the reported bugs.
In this paper, we perform an in-depth analysis of 180 reproducible bug reports systematically mined from Android apps on GitHub and investigate how the information contained in each bug report relates to the task of reproducing it. In the analysis, we focus on three pieces of information: the environment needed to reproduce the bug report, the steps to reproduce (S2Rs), and the observed behavior. Focusing on this information, we characterize failure types, identify the modality used to report the information, and characterize the information quality within the reports. We find that bug reports are reported in a multi-modal fashion, the environment is not always provided, and S2Rs often contain missing or ambiguous information. These findings carry important implications for automated bug reproduction techniques and, more generally, for automated bug reporting approaches, and we provide a detailed discussion of those implications to guide practitioners and future research.
The data associated with this project is available in this repo:
https://github.com/se-umn/2022_saner_bug_report_reproduction_study