Post date: Jul 25, 2013 1:37:39 PM
This morning, I woke up to this opinion piece about ‘Science in Chains’ and I reacted a little harshly on Twitter, as Jon Simons correctly pointed out. I’m really sorry if that rubbed anyone the wrong way (it’s a really bad habit of mine), but I could just hear people say: ‘there are also people who question the use of pre-registration, so let’s not adopt it, and continue with business as usual.’ And if that Times Higher Education piece is used in such a way, well, I would find that extremely frustrating. It is good to have a discussion about the topic – if only to be able to take away some of the concerns, as well as to work out practical issues in implementing pre-registration. And the piece seems to be leading to such a discussion, so that is great.
The major points the article raises against pre-registration seem to be that 1) pre-registered studies are not necessarily closer to the truth, and 2) pre-registration is not always possible (e.g., in single case studies).
Is the outcome of a pre-registered study more likely to be true? In principle, no. James Kilner explains why, in a perfect world, where scientists are not only completely honest, but, I note, also in complete control of their unconscious biases, a Type 1 error is a Type 1 error – whether it came from a pre-registered study or not.
However, it is not true that scientists are perfect, or that the publishing system is completely unbiased. We don’t know how big the problem is, but that is definitely no reason to ignore it. Is pre-registration the end-all solution to our problems? No. Will pre-registered studies, in the long run and on average (this is important!), stay closer to a 5% Type 1 error rate? It is extremely likely, given the current reward structures in science (e.g., getting tenure if you publish a lot of significant findings). Will the findings be closer to what Omniscient Jones (see Meehl, 1990) knows to be the truth? On average, yes, because of the lower Type 1 error rate (although I could not predict by how much). Even if the benefit is small, it is amplified by low statistical power (another major problem in psychology). Let’s calculate:
A recent meta-analysis revealed that the average likelihood of finding a statistically significant effect (if a real effect exists in the population) in neuroscience is estimated to be around 21% (Button et al., 2013). Let’s imagine we examine 200 novel hypotheses, of which ‘Omniscient Jones’ knows 100 examine a true hypothesis, and 100 examine a false hypothesis (see Ioannidis, 2005, for another example). With an estimated statistical power of 21%, we can expect to observe significant effects in 21 of the 100 studies that examine true effects. We publish only significant effects (welcome to psychology). For the 100 false hypotheses, we find 5% false positives, and we publish these 5 studies. This means that, purely based on these statistics, and assuming that a hypothesis is equally likely to be true or false, the ratio of true effects to false effects in the published literature is 21/5, or 4.2/1. Now let’s say that flexibility in the way you analyze data boosts the Type 1 error rate to 10.5% in studies that are not pre-registered (not unlikely, see Simmons, Nelson, & Simonsohn, 2011). Now the ratio of true to false effects in the literature is 21/10.5, or 2/1 (for a more detailed and complex example, see Greenwald, 1975).
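If you want to check the arithmetic yourself, here is a minimal sketch (my addition, in Python; the numbers are taken straight from the example above, nothing is simulated):

```python
# Toy model from the example above: 200 hypotheses, half true,
# power = 21% (Button et al., 2013), alpha = 5% when pre-registered
# vs. an assumed 10.5% with flexible (non-pre-registered) analyses.

n_true, n_false = 100, 100
power = 0.21          # chance of detecting a true effect
alpha_prereg = 0.05   # nominal Type 1 error rate
alpha_flex = 0.105    # assumed inflation from analytic flexibility

true_positives = n_true * power  # 21 published true effects

# Only significant results get published in this toy model.
for label, alpha in [("pre-registered", alpha_prereg),
                     ("flexible", alpha_flex)]:
    false_positives = n_false * alpha
    ratio = true_positives / false_positives
    print(f"{label}: {true_positives:.0f} true vs. "
          f"{false_positives:.1f} false -> ratio {ratio:.1f}:1")

# Output: pre-registered -> 4.2:1, flexible -> 2.0:1
```

The point of the sketch is only that the published true-to-false ratio is halved when the Type 1 error rate doubles, everything else held constant.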
A little error control goes a long way, even though this is not the only problem (psychology needs more applicable results, more replications (and extensions) to facilitate cumulative science, no publication bias, higher statistical power, and better theories to begin to look a little like a decent scientific discipline; see Lykken, 1991).
What about the costs? Does it slow research to a crawl? It depends. If you are a Feyerabend fan, and think that ‘anything goes’ in the philosophy of science, then there are many faster routes to get publications. If the robustness of knowledge matters to you, and you would like to be able to predict something, or at least be pretty sure that you can repeatedly observe it, pre-registration in itself is not the issue. It doesn’t take that long (and to be honest, I make my students do it most of the time: http://openscienceframework.org/project/J9KRM/registrations). Sure, the review process involved in some of the pre-registration formats journals offer will take time. But that is just one way to pre-register, not the only way. The Times Higher Education piece does not distinguish between different types of pre-registration. The format where you pre-register, submit to a journal, and then collect data is a great way to get null results published. Simply registering your study on the Open Science Framework is a perfectly acceptable way as well, if you just want to distinguish confirmatory from exploratory research.
What about other costs? Does pre-registration ‘stifle creativity’? You can say yes, but only if you really, really don’t understand the concept. We are talking about hypothesis TESTING, not hypothesis generation. If you look through your data, calculate 400 correlations, and see one that catches your fancy, run with it. To the lab, preferably, not to a journal, although there is also enough room in theoretical articles to share any wild hypothesis you have, no matter how scarce the empirical support. You can use exploratory research to generate new hypotheses. It’s also great if you report the data as Study 1, and then pre-register and report Study 2 (and if it was a really good idea, you can publish the paper regardless of whether Study 2 shows the expected effect, as far as I’m concerned). But it is not possible to test a hypothesis in exploratory research (De Groot, 1961), because you yourself cannot quantify the increase in the Type 1 error rate. Really, you can’t. And therefore, you don’t know how robust the finding is. So say: ‘this stuff from my exploratory analysis might be a great idea for future research’, but don’t say ‘this exploratory analysis confirms my prediction.’
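To put a rough number on that 400-correlations example, here is a back-of-the-envelope sketch (again my addition, assuming for simplicity that the tests are independent and that no true effects exist):

```python
# Why 400 uncorrected significance tests make the Type 1 error
# rate explode: with alpha = .05 and no true effects, the chance
# of at least one 'significant' correlation is essentially 1.

alpha, n_tests = 0.05, 400

expected_false_positives = n_tests * alpha        # 20 on average
p_at_least_one = 1 - (1 - alpha) ** n_tests       # ~0.9999999988

print(f"Expected false positives: {expected_false_positives:.0f}")
print(f"P(at least one significant correlation): {p_at_least_one:.10f}")

# In real data the tests are correlated, and the number of tests
# actually run is rarely recorded, so even this bound cannot be
# reconstructed after the fact. That is exactly why an exploratory
# analysis cannot double as a hypothesis test.
```

And that is under the most charitable assumptions: once you cannot count how many comparisons were implicitly made, no correction is even possible.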
Do you always have to pre-register? No! For example, I love some single-case studies. They can be a huge inspiration for new hypotheses. But they are anecdotes. They might be true, and theories derived from them might lead to progressive research lines (Lakatos, 1970). But the reliability of an individual case study is difficult to quantify.
Finally, pre-registration is nothing more than a reasonably reliable guarantee that the ideal empirical cycle was followed as closely as possible to test a hypothesis. And in that very narrow and specific area of psychology, I’d be more than happy to, at least most of the time, put science in chains.
If you have comments, or think something in my argument is clearly faulty, you can contact me @Lakens or send me an e-mail.