For this research project, we began with a dataset of 40 Java files. Each file contained a single class and was between 20 and 50 lines long. The dataset was taken from an existing study on the scope of ChatGPT in software engineering. These 40 files were the only files we used, and they served as the foundation for all three research questions. We asked ChatGPT to refactor each of the 40 files eight times, once for each of eight quality attributes: performance, complexity, coupling, cohesion, design size, readability, reusability, and understandability.
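To make the size of this design concrete, the following minimal sketch enumerates one prompt per file and quality attribute. The prompt wording, the `build_prompts` helper, and the placeholder file contents are assumptions made for illustration; they are not the prompts or files used in the study.

```python
from itertools import product

# The eight quality attributes named in the study design.
QUALITY_ATTRIBUTES = [
    "performance", "complexity", "coupling", "cohesion",
    "design size", "readability", "reusability", "understandability",
]

# Hypothetical prompt wording (an assumption, not the study's actual prompt).
PROMPT_TEMPLATE = (
    "Refactor the following Java code to improve its {attribute}:\n\n{code}"
)


def build_prompts(files):
    """Return one (file name, attribute, prompt) triple per file/attribute pair."""
    return [
        (name, attribute, PROMPT_TEMPLATE.format(attribute=attribute, code=code))
        for (name, code), attribute in product(files.items(), QUALITY_ATTRIBUTES)
    ]


# Placeholder sources standing in for the 40 single-class Java files.
example_files = {f"File{i}.java": f"class File{i} {{}}" for i in range(1, 41)}
prompts = build_prompts(example_files)
print(len(prompts))  # 40 files x 8 attributes = 320
```

Under these assumptions, the design yields 320 refactoring requests in total, one for every combination of file and quality attribute.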
As shown in the figure at the top of this page, we used prompt engineering to create the prompts for our research questions. For Research Question 1 (RQ1), we provided ChatGPT with a code segment and asked it to refactor the code to improve one of the eight quality attributes shown in the figure. For Research Question 2 (RQ2), we took the refactored code from RQ1 and determined whether behavior was preserved between the original and refactored code segments. To analyze ChatGPT's refactored code segments further, we used the PMD tool to check them for code violations, and we grouped the violations into the categories indicated in the figure. Research Question 3 (RQ3) focused on ChatGPT's ability to produce documentation for the refactored code segment from RQ1, including a detailed description of its intent, instructions, and impact.
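The grouping of PMD violations described above can be sketched as a simple tally. This is only an illustration under stated assumptions: the records are invented, already-parsed findings, the category names are placeholders rather than the categories from the figure, and `count_by_category` is a hypothetical helper.

```python
from collections import Counter

# Hypothetical, already-parsed PMD findings: (file name, violation category).
# These records and category names are placeholders, not results from the study.
violations = [
    ("File1.java", "Code Style"),
    ("File1.java", "Best Practices"),
    ("File2.java", "Code Style"),
    ("File3.java", "Design"),
]


def count_by_category(records):
    """Count how many violations fall into each category."""
    return Counter(category for _, category in records)


counts = count_by_category(violations)
for category, n in counts.most_common():
    print(f"{category}: {n}")
```

Running the same tally on the original and the refactored segments would allow the two sets of counts to be compared category by category.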
Ma, W., Liu, S., Wang, W., Hu, Q., Liu, Y., Zhang, C., Nie, L., & Liu, Y. (2023, May 20). The scope of ChatGPT in software engineering: A thorough investigation. arXiv. https://arxiv.org/abs/2305.12138
Direct link: https://arxiv.org/pdf/2305.12138.pdf