Work Package 2

Repeatability and replicability as foundational values in science.


In light of this clarificatory work, the P.I. will subsequently investigate the epistemic value of the three concepts referred to by the terms ‘repeatability’, ‘replicability’ and ‘reproducibility’.

This section of the project will start with a discussion of Norton’s material analysis of repeatability, replicability and reproducibility (Norton 2014), according to which their epistemic significance can be evaluated only on a case-by-case basis. On this view, attempts at formulating a universal principle on the epistemic value of replicability (here used as an umbrella term) are illusory. Judgements on the significance of replicability should be grounded in the background facts of each case (experimental conditions and confounding conditions). Although the P.I. agrees that it is impossible, and even dangerous, to enshrine a universal formal principle and enforce it without further considerations, the P.I. holds that general considerations on the value of replicability are still meaningful if the goal is to better understand replicability and its relation to other scientific values.

Being able to recreate a scientific measurement is often regarded as a necessary condition for objectivity, which is a foundational value for scientific research. In light of this, this project will investigate the grounds for the belief that only repeatable[1] instances are worthy of scientific investigation. The first ground to be investigated is that only repeated instances are regarded as confirmed, and so as more likely to be true and thus credible. The second is that science aims at universality, and only repeated instances can, via induction, license generalizations. The third relies on the importance of sharing information and transmitting scientific knowledge. The project will subsequently investigate the differing epistemic significance of replicability and reproducibility.

In particular, the P.I. will take two different routes. First, the project will assess the widespread view that repeatability and replicability (but not reproducibility) are necessary conditions for science. Among the defenders of the supremacy of repeatability and replicability over reproducibility, the most important is certainly Popper, who boldly claimed that single instances have no significance in science and emphasized the role of direct replicability in his The Logic of Scientific Discovery. This part of the project will therefore begin with a close examination of Popper’s thought. According to him, it is an indisputable requirement that scientists should repeat their own experiments (repeatability) and disclose the details of their experiments to allow other scientists to recreate them (replicability). In his discussion, Popper mentions Kant as the first to emphasize the objectivity of scientific theories, which must be grounded in regularities and inter-subjective phenomena. It is worth noting that, contrary to common understanding, Popper does not treat a failure of replication as sufficient to discard a scientific finding as unscientific. Indeed, he writes (1934): “In point of fact, no conclusive disproof of a theory can ever be produced, for it is always possible to say that the experimental results are not reliable”. This means that even for Popper, a lack of outcome replicability is not enough to discard a theory as unscientific. Rather, his emphasis is, according to the P.I.’s reconstruction, on methodological replicability as the necessary condition for testing an experiment. Only if it is methodologically replicable does an experiment allow for falsification, which is the Popperian demarcation criterion. In light of this, it would be unfair to criticize Popper’s defense of repeatability by pointing to all those cases in which it is impossible to reproduce the exact same results.
However, Popper’s view is vulnerable to the objection that it renders all measurements of rare or unique materials unscientific. Here Popper bites the bullet by dismissing those cases as constituting a metaphysical, rather than physical, controversy.

Secondly, the project will examine the view that it is reproducibility, and not repeatability or replicability, upon which science is grounded. This view is supported by two considerations: that reproducibility is the necessary condition for scientific practice, and that taking replicability as an overarching epistemic value is dangerous, as this would hinder scientific progress across many scientific disciplines. Supporters of the primacy of reproducibility over replicability and repeatability argue that reproducibility is crucial to fostering innovative approaches to designing experimental procedures (Romero 2017) and for this reason acquires importance in the long run. If we continuously repeat a measurement with precisely the same instrument, the confirmation obtained is weaker than that obtained by recreating the measurement with a totally new procedure that yields the same results. On this view, the more the repeated measurement differs from the original, the stronger the confirmation. Indeed, reproducibility allows not only for checking the results of previous measurements but also for testing the theory and its instruments for hidden assumptions and bugs in previous experiments. More deeply, it helps to understand the phenomenon under investigation. Thus, on this view, the lack of replicability is innocuous and insignificant if reproducibility acquires the status of the necessary condition for a measurement to be scientific.

The P.I. will inquire into possible challenges that this position would have to face. The first challenge comes from cases, especially in data science, in which it may be difficult to find new ways to measure. The second challenge is more theoretical. A failure in reproducibility is not very informative, and sometimes wholly uninformative, about the reliability of the previous experiment. Indeed, if a failure occurs, it would be completely unreasonable to discard the previous experiment as unscientific, as the failure may well be due to a novelty introduced in the reproduction process.

The differences between the measurement that needs to be reproduced and its reproduction generate another challenge. If the two measurements involve different setups, different procedures, different indicators and different measurement outcomes, it may be extremely difficult to reach a conclusion. This project will inquire into such cases. Any measurement output consists of two different parts: an instrument indicator and a measurement outcome. Instrument indicators may be numerals, positions of pointers, patterns or charts, and their related outcomes may be qualitative (nominal or ordinal) or quantitative (interval or ratio). Judging whether the instrument indicators are associated with the same outcomes, and thus whether the first experiment is reproducible, is even more difficult if we consider that, as stated in our discussion on replicability, it is extremely improbable that we achieve an identical result. It seems that in these cases we lack any objective criterion by which to compare and contrast two different measurements, since they differ with respect to both their procedures and their outcomes.


[1] Here ‘repeatable’ is used as an umbrella term to cover repeatability, replicability and reproducibility.