In this research, laboratory experiments were performed in a highly controlled environment, following a previously defined procedure and protocols to enable accurate measurements (Balijepally et al., 2009). It should be noted that experiments must be objective, so that the results are not biased (e.g., by the researchers’ influence or perspective) (McLeod, 2012).
The underlying research question was: “Is software development productivity enabled by Low-Code technologies higher than Code-Based technologies (as reported in the grey literature)?” The variable under study was productivity in the creation and maintenance of software applications.
For each experiment, a software development technology was selected (Code-Based, Low-Code, and Extreme Low-Code (quasi no code)), and a developer with proven proficiency in that technology was invited. In the case of Code-Based technology, the developer could select the technology they preferred. Productivity calculation was based on the Use Case Points Analysis (UCPA) method (Ochodek et al., 2011).
The artificial and controlled environment of the experiments made it possible to measure execution times accurately, which is impossible in other types of studies, such as field experiments, where it is not viable to control the external stimuli that condition task performance (Wenz, 2021).
The experiments were structured into the five stages described as follows: Stage 0 – Experiment design; Stage I – Briefing; Stage II – Software Application Development (Creation); Stage III – Software Application Development (Maintenance); Stage IV – Results analysis. Stages I, II, and III were repeated for each technology involved in the experiments.
Stage 0 was carried out only once, as a preparatory phase for the various experiments to be performed. During this stage, the procedure to be followed was defined, the protocols specifying the application to be developed and maintained (structured in two stages) were created, and the methods to be used to estimate and measure productivity were specified. The UCPA method was chosen from several possible alternatives (e.g., lines of code (Pressman and Maxim, 2019), COCOMO II (Sommerville, 2018), Function Point Analysis (Lokan, 2005)) due to its focus on the functionalities of the applications to be developed and its independence from the technology used (which, in the case of the defined experiments, is fundamental).
The method comprises the following phases (Nageswaran, 2001; Clemmons, 2006; Ochodek et al., 2011):
1) Calculation of the UUCP (Unadjusted Use Case Points) variable, using the variables UAW (Unadjusted Actor Weight) and UUCW (Unadjusted Use Case Weight), respectively related to the perceived complexity of actors and use cases:
UUCP = UAW + UUCW (1)
2) UUCP adjustment, considering a set of factors of a technical and environmental nature, reflected in the variables TCF (Technical Complexity Factor) and EF (Environmental Factor). Combining the UUCP variable with the TCF and EF variables results in the adjusted UCP (Use Case Points) of the project:
UCP = UUCP × TCF × EF (2)
3) Finally, the UCP variable is multiplied by the PF (Productivity Factor), which represents the number of hours necessary for the development of each UCP:
Total Effort = UCP × PF (3)
Thus, taking UCPA as a reference, the PF variable was calculated for each experiment carried out: the lower the resulting PF, the higher the productivity of the technology under study.
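As a purely illustrative sketch (not part of the experimental protocol), the calculation can be expressed as follows in Python; the function names and the numbers in the usage example are assumptions, while the formulas reproduce equations (1) to (3):

def unadjusted_use_case_points(uaw: float, uucw: float) -> float:
    # Eq. (1): UUCP = UAW + UUCW
    return uaw + uucw

def use_case_points(uucp: float, tcf: float = 1.0, ef: float = 1.0) -> float:
    # Eq. (2): UCP = UUCP × TCF × EF
    return uucp * tcf * ef

def productivity_factor(total_effort_hours: float, ucp: float) -> float:
    # Eq. (3) inverted: PF = Total Effort / UCP (hours per Use Case Point)
    return total_effort_hours / ucp

# Hypothetical usage (all numbers below are illustrative, not experiment data):
uucp = unadjusted_use_case_points(uaw=10, uucw=100)
ucp = use_case_points(uucp, tcf=1.0, ef=1.0)
pf = productivity_factor(total_effort_hours=200.0, ucp=ucp)
print(f"UCP = {ucp}, PF = {pf:.2f} hours per UCP")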
The experiment was structured in two main parts: the first part (Stage II) aimed at creating a software application, and the second part (Stage III) consisted of the maintenance (corrective and evolutionary) of the application created in the first part of the experiment.
Tables 1, 2, and 3 identify the actors and use cases described in the experiment protocols, as well as their respective scores (Weight).
For the first part of the experiment (creation of a software application), TCF was set to 1, considering the low complexity of the application. Given that the purpose of the experiment was to determine the EF value for each technology, this variable was also set to 1 as the starting point for calculating UCP. Thus, for the first part (Stage II) of the experiment:
UUCP = UAW + UUCW = 125 + 9 = 134 (4)
UCP = UUCP × 1 × 1 = 134 × 1 × 1 = 134 (5)
For the second part of the experiment (maintenance), participants were asked to make two changes (corresponding to a total of 20 points (Weight)), and to implement new use cases, as shown in appendix C. Thus, in total, for the second part (Stage III) of the experiment:
UUCP = UAW + UUCW = 40 + 9 = 49 (6)
UCP = UUCP × 1 × 1 = 49 × 1 × 1 = 49 (7)
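For convenience, the following short Python check (a sketch added here for illustration, not part of the original protocol) reproduces the arithmetic of equations (4) to (7), using the UAW and UUCW values given above and TCF = EF = 1; the effort values measured in the experiments are not included:

# UAW and UUCW values taken from equations (4) and (6); TCF = EF = 1 as stated above.
stages = {
    "Stage II (creation)": {"UAW": 125, "UUCW": 9},
    "Stage III (maintenance)": {"UAW": 40, "UUCW": 9},
}

for name, weights in stages.items():
    uucp = weights["UAW"] + weights["UUCW"]  # Eq. (1)
    ucp = uucp * 1 * 1                       # Eq. (2) with TCF = EF = 1
    print(f"{name}: UUCP = {uucp}, UCP = {ucp}")
# Expected output: UCP = 134 for Stage II and UCP = 49 for Stage III,
# matching equations (5) and (7).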
Throughout each experiment, a researcher was always present. Whenever requested by the developer, additional clarifications were provided on the application to be developed. It should also be noted that the experiments were fully recorded on video for subsequent analysis. Break times (e.g., for meals) were registered but not considered for productivity calculation. During the experiments, the developers could access all the information they needed; the only restriction was not to contact other developers for help.
Stage I was preparatory and consisted of presenting the protocol (of Stage II) and the conditions for conducting the experiment to the developer. The different use cases were presented in detail, as well as the mockups and data model requirements. The degrees of freedom were also defined (for example, regarding the color scheme of the graphical interface). The importance of the final application being as close as possible to the mockups was duly stressed, as was the need for strict compliance with the specifications (resisting the “temptation” of thinking that “it would be better another way”), since a quality assessment planned for the final stage of the experiment would consider these very aspects. Time counting only started after the completion of this phase.
After completion of stage I (briefing), the second stage began. During this stage, the objective was to create a new application, following the process defined for the first part of the experiment. As planned, the developer’s activities were recorded on video, and one of the team’s researchers was always present during the development. Besides the programming corresponding to the defined use cases, the activities performed by the developer included the configuration of the development environments used, the creation of databases, and testing. It should be noted that the complementary activities varied significantly depending on the development technology used.
Stage III followed the same procedure as stage II, with the difference that, in this case, the objective was not the creation of a new application but the maintenance (corrective and evolutionary) of an existing application (the application created in stage II). Moreover, the activities were based on a new protocol (see appendix C), which was only made available after completing stage II — i.e., in stage II, the developers were not aware of the protocol for stage III.
After completing the experiments, the time records (registered manually) and the videos of the activities performed were checked to ensure the accuracy of the time counting. Furthermore, to improve the accuracy of the productivity calculated for each technology, a quality assessment of each resulting application was performed, with the participation of at least two researchers, considering four fundamental criteria: compliance with the mockups; fulfilment of the functionalities as described in the use cases; occurrence of errors; and application performance. It should be noted that, although the quality assessment of the various applications resulted in minor differences in the final productivity calculated, this had no significant effect on the productivity differences between the technologies involved in the experiments or on the overall conclusions of the study.
Balijepally, V., Mahapatra, R., Nerur, S., and Price, K. H., “Are Two Heads Better than One for Software Development? The Productivity Paradox of Pair Programming”, MIS Quarterly, vol. 33, no. 1, 91–118, 2009.
Clemmons, R. K., “Project estimation with use case points”, The Journal of Defense Software Engineering, vol. 19, no. 2, 18–22, 2006.
Lokan, C., “Function Points”, Advances in Computers, vol. 65, 297–347, 2005.
McLeod, S., “Experimental Method”, Simplypsychology.org, 2012.
Nageswaran, S., “Test effort estimation using use case points”, Quality Week, 2001.
Ochodek, M., Nawrocki, J., and Kwarciak, K., “Simplifying effort estimation based on Use Case Points”, Information and Software Technology, vol. 53, no. 3, 200–213, 2011.
Pressman, R. and Maxim, B., Software Engineering: A Practitioner’s Approach, 9th ed., McGraw-Hill Education, 2019.
Sommerville, I., Software Engineering, 10th ed., Pearson, 2018.
Wenz, A., “Do Distractions During Web Survey Completion Affect Data Quality? Findings From a Laboratory Experiment”, Social Science Computer Review, vol. 39, no. 1, 148–161, 2021.