Abstract:
Software managers and practitioners often must make decisions about what technologies to employ on their projects. They might be aware of problems with their current development practices (for example, production bottlenecks or numerous defect reports from customers) and want to resolve them. Or, they might have read about a new technology and want to take advantage of its promised benefits. However, practitioners can have difficulty making informed decisions about whether to adopt a new technology because there’s little objective evidence to confirm its suitability, limits, qualities, costs, and inherent risks. This can lead to poor decisions about technology adoption, as Marvin Zelkowitz, Dolores Wallace, and David Binkley describe:
Software practitioners and managers seeking to improve the quality of their software development processes often adopt new technologies without sufficient evidence that they will be effective, while other technologies are ignored despite the evidence that they most probably will be useful.
For instance, enthusiasts of object-oriented programming were initially keen to promote the value of hierarchical models. Only later did experimental evidence reveal that deep hierarchies are more error prone than shallow ones.
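To make the finding concrete, the following minimal sketch (in Python, with class names invented purely for illustration) contrasts a deep inheritance chain with a shallower, composition-based design that produces the same output; the deep variant is the style that the experimental evidence found more error prone.

# Hypothetical illustration: a deep inheritance chain versus a shallow design.
# Understanding Report means tracing three ancestor classes; a change anywhere
# in the chain can silently break the leaf class.
class Document:
    def render(self) -> str:
        return "document"

class PagedDocument(Document):
    def render(self) -> str:
        return super().render() + " | paged"

class FormattedPagedDocument(PagedDocument):
    def render(self) -> str:
        return super().render() + " | formatted"

class Report(FormattedPagedDocument):  # inheritance depth 4
    def render(self) -> str:
        return super().render() + " | report"

# Shallow alternative: one level of inheritance, with behavior composed as data.
class FlatReport(Document):
    def __init__(self) -> None:
        self.features = ["paged", "formatted", "report"]

    def render(self) -> str:
        return " | ".join(["document", *self.features])

assert Report().render() == FlatReport().render()  # both: "document | paged | formatted | report"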
The aim and methodology of EBSE:
EBSE aims to improve decision making related to software development and maintenance by integrating current best evidence from research with practical experience and human values. This means we don’t expect a technology to be universally good or universally bad, only more appropriate in some circumstances and for some organizations. Furthermore, practitioners will need to accumulate empirical research about a technology of interest and evaluate the research from the viewpoint of their specific circumstances.
This aim is decidedly ambitious, particularly because the gap between research and practice can be wide. EBSE seeks to close this gap by encouraging a stronger emphasis on methodological rigor while focusing on relevance for practice. This is important because rigor is necessary in any research that purports to be relevant. Moreover, because most SE research hasn’t influenced industrial practice, there’s also a pressing need to prevent SE research from remaining an ivory-tower activity that emphasizes academic rigor over relevance to practice.
So, although rigor is a necessary condition for relevant SE research, it isn’t sufficient. Medical evidence is based on rigorous studies of therapies given to real patients requiring medical treatment; laboratory experiments aren’t considered to provide compelling evidence. This implies that SE shouldn’t rely solely on laboratory experiments and should attempt to gather evidence from industrial projects, using observational studies, case studies, surveys, and field experiments. These empirical techniques don’t have the scientific rigor of formal randomized experiments, but they do avoid the limited relevance of small-scale, artificial SE experiments.
Furthermore, there are substantial problems with accumulating evidence systematically, not least because combining evidence from different types of studies is difficult. A specific challenge in practicing EBSE is that different empirical studies of the same phenomenon often report different and sometimes contradictory results. Unless we can understand these differences, integrating individual pieces of evidence is difficult. This points to the importance of reporting contextual information in empirical studies to help explain conflicting research results.
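To illustrate what integrating quantitative evidence can involve, here is a minimal Python sketch of one standard technique, fixed-effect meta-analysis with inverse-variance weighting; the study names, effect sizes, and variances are invented for illustration, not taken from any real review.

import math

# Pool effect sizes from independent studies, weighting each study by the
# inverse of its variance so that more precise studies count for more.
# All numbers are invented for illustration.
studies = [
    {"name": "Study A", "effect": 0.42, "variance": 0.04},
    {"name": "Study B", "effect": -0.10, "variance": 0.09},  # contradicts A and C
    {"name": "Study C", "effect": 0.25, "variance": 0.02},
]

weights = [1.0 / s["variance"] for s in studies]
pooled = sum(w * s["effect"] for w, s in zip(weights, studies)) / sum(weights)
se = math.sqrt(1.0 / sum(weights))  # standard error of the pooled estimate

print(f"Pooled effect: {pooled:.2f} "
      f"(95% CI {pooled - 1.96 * se:.2f} to {pooled + 1.96 * se:.2f})")

Note that when studies disagree, as Study B does here, a single pooled number hides the disagreement; heterogeneity statistics and, above all, the contextual information reported with each study are needed to understand why the results differ.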
EBSE involves five steps:
1. Convert a relevant problem or information need into an answerable question.
2. Search the literature for the best available evidence to answer the question.
3. Critically appraise the evidence for its validity, impact, and applicability.
4. Integrate the appraised evidence with practical experience and your customers’ values to make decisions about practice.
5. Evaluate performance and seek ways to improve it.
In the SE context, factors to consider when deciding which question to answer first include these:
- Which question is most important to your customers?
- Which question is most relevant to your situation?
- Which question is most interesting in the context of your business strategy?
- Which question is most likely to recur in your practice?
- Can you answer the question within the time you have available?
Discussion:
Although it’s important for software practitioners to base their choice of development methods on available scientific evidence, this isn’t necessarily easy. Evidence-based medicine (EBM) arose because medical practitioners were overwhelmed by the large number of scientific studies; in SE, our problems are rather different. There are relatively few studies, as our pair-programming example (see the “Asking the Right Question” sidebar) showed. Furthermore, when evidence is available, software practitioners still have difficulty judging its quality and assessing what it means for their specific circumstances. This implies that, given the current state of empirical SE, practitioners will need to adopt more proactive search strategies, such as directly approaching experts, other experienced practitioners, and researchers.
Because a basic idea behind EBSE is to establish fruitful cooperation between research and practice, a closer link between the two is needed: research must be relevant to practitioners’ needs, and practitioners must be willing to participate in research.
You might have noticed that we’ve offered no evidence of EBSE’s benefits. Although we have no examples of other practitioners using EBSE, the sidebar “Evidence-Based Software Engineering Q&A” presents examples of our own use of EBSE. On the basis of this experience, and of other ongoing industrial and educational initiatives in which we’re engaged, we believe that evidence-based practice is possible and potentially useful for software practitioners.
However, evidence-based practice also places requirements on researchers. We recommend that researchers adopt as much of the evidence-based approach as possible. Specifically, this means being more responsive to practitioners’ needs when identifying topics for empirical research, and improving the standard both of individual empirical studies and of systematic reviews of such studies. Researchers need to perform and report replication studies to accumulate reliable evidence about SE topics, and they need to report their results in a manner that’s accessible to practitioners.