Data Science

A Negative Result IS A Positive Result: Changing the Computation and Data Science Culture

By Khalid Belhajjame

Computational and data science have become, within a few decades, the third and fourth scientific paradigms, completing the spectrum already drawn by the well-established theoretical and experimental paradigms. Advances in the methods and techniques used for performing large-scale data-driven experiments have sped up productivity in science. To increase trust in such results and to make them easier to understand, we have recently witnessed the emergence of a strong reproducibility movement. While there is no single accepted standard for what reproducible computational science is, there is widespread agreement that information about the code and data used to generate and/or interpret published results should be made available. I believe that such movements will ultimately promote better ways of doing science. However, I argue that a deeper change in the culture of doing computational and data science is needed. Specifically, the focus in science has so far been on publishing positive results, by which I mean results that show improvements over the state of the art.

Nonetheless, negative results also have their place in the scientific ecosystem, and they can help advance science, or at least speed up its progress. As Joh Clearmount mentioned, “From failures, we learn the most”. Modern science is mainly driven by positive results. Conferences, journals and funding bodies somewhat compel scientists to report only positive results. Moreover, a colleague of mine mentioned that a funding body would see negative results as a waste of resources, and that they should therefore be hidden. Even worse, if scientists need to report results because of a contractual commitment to their funding body or university, they may present the positive side of their research and hide its limitations or negative aspects. It is not surprising, then, that some of the research findings reported in the literature are distorted.

In this blog post, I would like to join my colleagues in advocating for the dissemination of negative results. I would go even further and claim that the absence of negative results in the scientific literature should raise suspicion, and be seen as a sign (metric) of unhealthy and irreproducible research.