MOTIVATION
Microsoft reports that 47,000 developers introduce about 30,000 bugs per month, which highlights how error-prone and expensive it is to build, evolve, and maintain complex software systems. These activities are human-intensive and performed in large teams, where the characteristics of each developer can determine the success or failure of an entire project. Automated or semi-automated Recommender Systems for Software Engineering (RSSEs), such as defect prediction approaches, code completion techniques, and code readability prediction, have been introduced to support developers. However, most modern RSSEs are inherently opaque, which undermines the trust developers place in their recommendations.
Open-source developers typically collaborate using version control systems such as Git and coding platforms such as GitHub and GitLab. Extracting valuable information about developers from these sources could help managers allocate resources better and improve existing RSSEs. Defect prediction models could take developers' information into account to better assess the likelihood that a component contains bugs, while code completion tools could provide personalized suggestions based on the code styles most familiar to the specific developer. Code readability prediction approaches could assess readability for specific developer types rather than universally.
Another weakness of currently available RSSEs is their disconnect from what developers actually expect or desire. For example, readability prediction approaches, which detect unreadable code snippets, have significant limitations in terms of user experience. Developers cannot choose the granularity level of the predictions, when they should be executed, or what should be done with the results. Additionally, available approaches do not explain why a given recommendation was produced, so developers do not trust the recommendations and do not find them useful in practice. Defining developer-centered RSSEs could provide developers with a tailored experience and more trustworthy suggestions that they can adopt with greater awareness.
GOAL
This project aims to place developers at the center of RSSEs by defining strategies for extracting their profiles and making RSSEs more usable for them. Relying on publicly available data from open-source projects, the project will use data profiling strategies to discover information about developers' experience. From this information, metadata such as statistics, patterns, and data dependencies will be inferred automatically and used to characterize developers' profiles.
The project will also introduce AI-based solutions for automatically tailoring developers' experiences based on their profiles. End-User Development solutions will allow developers to further customize their working environment according to their preferences, needs, and projects. To improve developers' trust in the model outcomes, explanation mechanisms of the AI models and their outcomes will be provided.
Finally, three novel Profile-Based and Developer-Centered RSSEs will be defined, with two for detecting quality issues and one for source code generation. These approaches will be empirically validated through controlled experiments and case studies in industrial settings.
APPLICATION SCENARIO
We consider the following scenario that includes two examples of how the proposed methodology could benefit the two main end-users of DevProDev: software developers and software engineering researchers.
DevProDev for Software Developers. Sam is a software developer at XCorp, a software company that mainly produces web applications for other companies. Sam uses the IntelliJ IDEA IDE to write code and is currently working on the backend of a web application for XBank. Sam has worked on many other projects, both within the same organisation and outside it as an open-source developer. Sam installs DevProDev as an IntelliJ plugin to have the latest RSSEs. Sam loads his development history into DevProDev by simply providing his GitHub username. DevProDev extracts Sam's profile, a set of directly available and inferred characteristics, and uses this information to provide Sam with more accurate and personalised predictions through a set of profile-based RSSEs defined by the research community. Because DevProDev is adaptive by design, it automatically enables and disables some RSSEs based on Sam's profile. From this profile, DevProDev finds that (i) Sam's contributions mostly consist of writing test cases and fixing bugs reported by end users, and (ii) he pays little attention to code readability, so other developers often modify his code to improve it. Among the modules that DevProDev provides, two might be useful to Sam: one that detects possibly defective code components (defect prediction) and one that suggests how to fix buggy code (automated bug fixing). DevProDev infers that Sam might benefit from these modules to speed up his work and improve his code, and it enables them by default. DevProDev also allows Sam to further customise the working environment according to his preferences, needs and projects, using visual interaction mechanisms that do not require any coding or specific knowledge of the AI algorithms (adaptability). First, Sam can manage the components. In this case, he decides to add a third module on code readability (readability prediction).
Second, DevProDev allows Sam to customise the behaviour and appearance of each component by choosing RSSE-specific options (e.g. the level of granularity of the prediction) and the visualisations used to display their results. For the readability prediction component, Sam decides to use method-level readability prediction. He also chooses to display the readability prediction in a widget, which he customises by including a gauge chart that summarises the readability and by adding text below the chart that details the readability dimensions (e.g. number of labels, character alignment, coherence between comments and labels). Third, Sam can choose where DevProDev makes suggestions. For example, he likes the predictions made by the automated bug fixing module, but he does not want the IDE to change the code automatically. So he chooses to have the suggestions shown in a separate widget so that he can carefully analyse them, adapt them to his coding style, and eventually integrate them into the code base. DevProDev also provides Sam with powerful explanation mechanisms for the results of the RSSEs, which are based on AI models. The explanation describes, for example, why a particular artefact is marked as "unreadable". Specifically, it explains that the identifiers of a given method are not coherent enough with its comments, and that some lines of code are too long. This level of explanation helps Sam to trust the model output and to make more informed decisions about accepting or rejecting the model's proposals.
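The adaptive behaviour in Sam's scenario can be pictured with a minimal sketch. Everything in it is an assumption made for illustration only: the DeveloperProfile fields, the activity and module names, the mapping between them, and the 0.3 threshold are hypothetical and do not describe DevProDev's actual design. The sketch only shows the general idea that defaults are derived from the profile and that the developer's explicit choices then override them.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class DeveloperProfile:
    """Hypothetical profile: share of a developer's commits per activity, in [0, 1]."""
    username: str
    activity_share: Dict[str, float] = field(default_factory=dict)


# Hypothetical mapping from a development activity to the modules it suggests.
MODULES_BY_ACTIVITY = {
    "bug_fixing": ["defect_prediction", "automated_bug_fixing"],
    "testing": ["defect_prediction"],
    "refactoring": ["readability_prediction"],
}


def default_modules(profile, threshold=0.3):
    """Enable a module when a related activity reaches the (assumed) threshold."""
    enabled = set()
    for activity, share in profile.activity_share.items():
        if share >= threshold:
            enabled.update(MODULES_BY_ACTIVITY.get(activity, []))
    return enabled


def apply_user_choices(enabled, add=(), remove=()):
    """The developer's explicit choices take precedence over the defaults."""
    return (set(enabled) | set(add)) - set(remove)


# Sam mostly fixes bugs and writes tests, and rarely refactors for readability.
sam = DeveloperProfile(
    "sam", {"bug_fixing": 0.45, "testing": 0.40, "refactoring": 0.05}
)
defaults = default_modules(sam)
# Sam then adds the readability module by hand, as in the scenario.
final = apply_user_choices(defaults, add=["readability_prediction"])
```

In this sketch the defaults contain only the defect prediction and automated bug fixing modules, and the readability module appears only after Sam adds it himself, mirroring the split between adaptivity and adaptability described above.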
DevProDev for Software Engineering Researchers. Billie is a software engineering researcher whose main research interests are related to code readability. Specifically, Billie's goal is to improve currently available models for predicting code readability by taking developer profiles into account. Her hypothesis is that, by including developer-related features in the readability prediction model, it might be possible to predict whether a particular developer would judge a given code snippet as readable or unreadable. Billie needs to define features based on developer characteristics (i.e. their profiles), but there are many alternative ways of measuring even the most seemingly simple aspect (e.g. programming experience). DevProDev provides Billie with rigorous metrics for several aspects of developers, as well as APIs for measuring such aspects directly, given the developer's GitHub username. Billie uses the relevant characteristics of the developers (e.g. the number of comments they usually write) as features to define a novel readability prediction approach that is more accurate than the state of the art. Billie decides to publish her approach as a DevProDev module so that developers can benefit from it.
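Billie's idea of adding developer-related features to a readability model can be sketched as follows. This is an illustrative assumption, not DevProDev's API: the function names, the profile keys (commits, comment_lines_written, years_active), and the two snippet-level features are all hypothetical, and the snippet features are deliberately simplistic stand-ins for those used by real readability models. The sketch stops at building the feature vector; any classifier could consume it.

```python
def snippet_features(code):
    """Toy snippet-level features (stand-ins for a real readability feature set)."""
    lines = code.splitlines() or [""]
    comment_lines = sum(
        1 for line in lines if line.strip().startswith(("#", "//"))
    )
    return {
        "max_line_length": max(len(line) for line in lines),
        "comment_ratio": comment_lines / len(lines),
    }


def developer_features(profile):
    """Hypothetical developer-level features derived from a profile dictionary."""
    commits = max(profile["commits"], 1)  # avoid division by zero
    return {
        "dev_comments_per_commit": profile["comment_lines_written"] / commits,
        "dev_years_active": profile["years_active"],
    }


def feature_vector(code, profile):
    """Concatenate both feature groups in a stable, name-sorted order."""
    features = {**snippet_features(code), **developer_features(profile)}
    return [features[name] for name in sorted(features)]


snippet = "// add two numbers\nint add(int a, int b) { return a + b; }"
profile = {"commits": 200, "comment_lines_written": 500, "years_active": 3}
vector = feature_vector(snippet, profile)
```

Because the same snippet yields a different vector for each developer profile, a model trained on such vectors can in principle give developer-specific readability predictions, which is the hypothesis Billie wants to test.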