M. L. de Jesus Souza, A. R. Santos, I. C. Machado, E. S. de Almeida and G. S. S. Gomes, “Evaluating Variability Modeling Techniques for Dynamic Software Product Lines: A Controlled Experiment,” In Software Components, Architectures and Reuse (SBCARS), 2016 X Brazilian Symposium on (pp. 1-10). IEEE.
Table 1 describes the main aspects of the study design. The study aimed to analyze Variability Modeling Techniques (VMTs) for Dynamic Software Product Lines (DSPLs) for the purpose of evaluating them with regard to their effectiveness and efficiency, from the viewpoint of SPL researchers, in the context of undergraduate, M.Sc., and Ph.D. students modeling a Smart Home DSPL project.
In this work, we evaluated two VMTs by means of a controlled experiment: Context-aware Feature Modeling (CFM) [1], which enriches the traditional Feature Model with a Context Model (see Figure 1), and Tropos Goal Modeling with Contexts (TGMC) [2], which extends the Tropos Goal Model by adding context requirements to capture the relationship between context and variability (see Figure 2).
To achieve this goal, we defined two main research questions:
· RQ1. Which DSPL variability modeling technique is more effective?
· RQ2. Which DSPL variability modeling technique is more efficient?
Figure 1 Example of Context-aware Feature Modeling
Figure 2 Example of Tropos Goal Modeling with Contexts
To answer these research questions, we defined the following metrics:
· M1. Effectiveness PRECISION (EP). It assesses the precision of the results. We define precision as the number of correct elements identified by the subjects (true positives, TP) over the total number of identified elements (true positives, TP, plus false positives, FP). Precision values range from 0% to 100%.
· M2. Effectiveness RECALL (ER). It assesses the recall of the results. We define recall as the number of correct elements identified by the subjects (true positives, TP) over the total number of correct elements (true positives, TP, plus false negatives, FN). Recall values range from 0% to 100%.
· M3. Efficiency PRECISION (EPT). It assesses the time spent (TS) on modeling in relation to the precision value. To do so, we created a variable that relates precision to time (precision over time). This relation validates the efficiency measure: it indicates not only how long the subjects took, but also whether they modeled correctly or incorrectly with each technique. A higher EPT implies a better modeling time with respect to precision.
· M4. Efficiency RECALL (ERT). It assesses the time spent (TS) on modeling in relation to the recall value. As in M3, we created a variable that relates recall to the time spent on modeling (recall over time). A higher ERT implies a better modeling time with respect to recall.
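The four metrics above can be sketched in a few lines of Python. This is a minimal illustration, not the instrument used in the study: the class and function names are hypothetical, EPT and ERT are assumed to be the plain ratio of precision (or recall) to time spent, the time unit (minutes) is an assumption, and the example counts are made up.

```python
from dataclasses import dataclass


@dataclass
class ModelingResult:
    """One subject's outcome for one technique (hypothetical structure)."""
    true_positives: int    # correct elements the subject identified (TP)
    false_positives: int   # incorrect elements the subject identified (FP)
    false_negatives: int   # correct elements the subject missed (FN)
    time_spent_min: float  # time spent modeling (TS); minutes are assumed


def precision(r: ModelingResult) -> float:
    """M1 (EP): TP over all identified elements, as a percentage."""
    identified = r.true_positives + r.false_positives
    return 100.0 * r.true_positives / identified if identified else 0.0


def recall(r: ModelingResult) -> float:
    """M2 (ER): TP over all correct elements, as a percentage."""
    correct = r.true_positives + r.false_negatives
    return 100.0 * r.true_positives / correct if correct else 0.0


def efficiency_precision(r: ModelingResult) -> float:
    """M3 (EPT): precision over time spent (assumed to be a simple ratio)."""
    return precision(r) / r.time_spent_min


def efficiency_recall(r: ModelingResult) -> float:
    """M4 (ERT): recall over time spent (assumed to be a simple ratio)."""
    return recall(r) / r.time_spent_min


# Made-up example: 18 correct and 2 incorrect elements, 6 missed, 40 minutes.
example = ModelingResult(true_positives=18, false_positives=2,
                         false_negatives=6, time_spent_min=40.0)
print(precision(example), recall(example))
print(efficiency_precision(example), efficiency_recall(example))
```

Under this reading, two subjects with the same precision are separated by their modeling time: the faster one obtains the higher EPT.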
We defined four groups of null and alternative hypotheses, one for each metric. Each group has a null hypothesis, under which the values for both techniques are equivalent, and an alternative hypothesis, under which the values differ. Each alternative hypothesis is divided into two sub-hypotheses: in the first, the value for technique A is higher than that for technique B; in the second, the value for technique B is higher than that for technique A. The study hypotheses are summarized as follows:
• H1.0: Measure1TA = Measure1TB
• H1a: Measure1TA <> Measure1TB
– H1a1: Measure1TA > Measure1TB
– H1a2: Measure1TA < Measure1TB
The feedback form was composed of closed and open questions. The closed questions were divided into multiple-choice and agree-disagree items, the latter ranging from totally disagree to totally agree. Among the closed questions, we asked the subjects to compare the CFM and TGMC techniques in terms of robustness, ease of use, and ease of learning. The answers are shown in Figure 3.
Figure 3 Closed Questions from the Feedback Form
Table 2 presents descriptive measurements (median, mean, standard deviation, and coefficient of variation) for both techniques according to the precision and recall measures. These values illustrate the effectiveness results for the CFM and TGMC techniques. The Mann-Whitney U and Student’s t hypothesis tests revealed that only the precision variable presented a significant difference at the 5% level, with p-values of 0.0036 and 0.0009, respectively. This means that we could reject the null hypothesis only for precision.
Figure 4 shows boxplots of the effectiveness results for both techniques according to precision and recall values.
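The descriptive measurements listed above (median, mean, standard deviation, coefficient of variation) can be computed with Python's standard `statistics` module. The sketch below is illustrative only: the function name `describe` is hypothetical, the sample standard deviation (n - 1 denominator) is an assumption because the text does not say which variant was used, and the input values are made up rather than taken from the study.

```python
import statistics


def describe(values):
    """Return median, mean, standard deviation and coefficient of variation.

    Assumption: sample standard deviation (n - 1 denominator). The
    coefficient of variation is reported as a percentage of the mean.
    """
    mean = statistics.mean(values)
    std_dev = statistics.stdev(values)
    return {
        "median": statistics.median(values),
        "mean": mean,
        "std_dev": std_dev,
        "coef_var_pct": 100.0 * std_dev / mean if mean else float("nan"),
    }


# Made-up precision percentages for one technique (not the study's data).
hypothetical_precision = [80.0, 90.0, 100.0]
print(describe(hypothetical_precision))
```

The coefficient of variation expresses the spread relative to the mean, which makes it easier to compare the dispersion of measures on different scales, such as precision and precision over time.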
Figure 4 Effectiveness - Boxplot
Table 3 presents the descriptive measurements for both techniques according to the precision over time and recall over time measures. These values illustrate the efficiency results for the CFM and TGMC techniques. Neither precision over time nor recall over time presented a significant difference under the Mann-Whitney U test or Student’s t-test, so the corresponding null hypotheses could not be rejected.
Figure 5 shows boxplots of the efficiency results for both techniques according to precision and recall values.
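To make the comparison procedure concrete, the following sketch implements a two-sided Mann-Whitney U test in plain Python and applies the 5% significance level. It is a simplified illustration, not the analysis script of the study: it uses the normal approximation without tie or continuity correction (a statistics package would normally refine this), the function names are hypothetical, and the sample values are made up.

```python
import math


def mann_whitney_u(sample_a, sample_b):
    """Return (U for sample_a, two-sided p-value).

    Simplification: normal approximation with no tie or continuity
    correction, so p-values for small samples are only rough.
    """
    n_a, n_b = len(sample_a), len(sample_b)
    # U counts the pairs in which a value from A exceeds a value from B;
    # tied pairs count as one half.
    u_a = 0.0
    for a in sample_a:
        for b in sample_b:
            if a > b:
                u_a += 1.0
            elif a == b:
                u_a += 0.5
    mean_u = n_a * n_b / 2.0
    sd_u = math.sqrt(n_a * n_b * (n_a + n_b + 1) / 12.0)
    z = (u_a - mean_u) / sd_u
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return u_a, p_value


def decide(p_value, alpha=0.05):
    """Apply the significance level (5% in the study) to a p-value."""
    return "reject H0" if p_value < alpha else "fail to reject H0"


# Made-up efficiency values for two techniques (not the study's data).
technique_a = [2.1, 1.8, 2.5, 2.0, 1.9]
technique_b = [1.7, 2.2, 2.4, 1.6, 2.3]
u, p = mann_whitney_u(technique_a, technique_b)
print(u, p, decide(p))
```

A "fail to reject H0" outcome, as reported for both efficiency measures, means the data do not show a difference between the techniques; it does not establish that the techniques are equivalent.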
Figure 5 Efficiency - Boxplot
References
[1] K. Saller, M. Lochau, and I. Reimund, “Context-aware DSPLs: model-based runtime adaptation for resource-constrained systems,” Proc. 17th Int. Softw. Prod. Line Conf. co-located Work., pp. 106–113, 2013.
[2] R. Ali, R. Chitchyan, and P. Giorgini, “Context for Goal-level Product Line Derivation,” Third Int. Work. Dyn. Softw. Prod. Lines, pp. 24–28, 2009.