These are edited, detailed responses to Chris Chambers's article on restoring trust in science with pre-registration of studies (http://www.guardian.co.uk/science/blog/2013/jun/05/trust-in-science-study-pre-registration) which people sent me when I circulated a draft of my response (http://www.timeshighereducation.co.uk/comment/opinion/pre-registration-would-put-science-in-chains/2005954.article).
Many people made comments on my draft document about the pre-registration issue, and it was not possible to include all their points (due to space) nor acknowledge them by name (as that’s not THE style). With their permission I have put edited versions here (some people preferred to be anonymous): please note these are just people's emailed comments, so some of them are informal in tone.
From Nicole Maxwell
I am a dual major, biology and psychology, so I have the advantage of seeing practices in both fields. I work in the field and in medical research, as well as with mammalian and reptilian models. It has been pointed out to me numerous times that my direction and my desire to remain open to varying aspects of multiple fields is a detriment to my success. I fear that pre-registration will only make that worse; my reasons are below.
As I am not easily swayed by power and position of my respective authority figures, I have a thicker skin than many of my younger counterparts and peers. I used to believe my age was a disadvantage until I saw certain aspects of academia. College has become a race to the finish with the most impact to gain a sense of pride and accomplishment. It is no longer about learning to develop methods, ideas, or discover passions in science. These are to have already been figured out before entering university, here in America. It’s a vast detriment to students.
The push to publish is astronomical here, as I assume it might be in the UK as well. I have applied for nearly 100 jobs since my graduation six months ago and because I am not published or because my university lacked a clinical component to my learning, I am simply undesirable or not worth the effort to train. No longer can I graduate from university with the idea that I am still fresh and have more to learn. I am to be near graduate level expectations, without the education to support it. Failed experiments, non-significant analyses, or even a not-so-current idea are scorned or ignored. I cannot build my career if I cannot see what has failed before me or make my own mistakes to learn from.
Perfection is the desired component of science in both clinical and experimental design. Perfection is desired everywhere, but we are not perfect beings nor will our methods ever be confound free. The idea that as a freshly minted scientist I should be able to develop an entire design, from beginning to end, and then be locked in to publish with that journal, means I lose the freedom to be imperfect and to learn from mistakes. The pressures of those demands add even more stress to the already high stakes. Only if I look good on paper, will I be accepted by my peers and be able to qualify for an entry-level position. The expectations have exceeded the training and few of us can keep up or handle the amount of pressure.
There is so little support for designs that fail, for mistakes that occur, and for coping with a failed experiment that new scientists are struggling. My psychology professors advised us not to exclude data, but upon asking for their help with a situation, they excluded it for us. I cannot learn how to include all of my data if my basics are corrupted. My biology professors advised the same, yet they too would throw out or ignore bad samples. Some wouldn’t even write it in their lab notebooks. Thankfully, I understand the value of mistakes and use waterproof ink in my lab notebooks. I found several slight changes in my previous, but similar experiments, that resulted in significant changes to my data (while I was pursuing my own non-funded and non-university-supported research projects). Without these mistakes, without the slight changes, I never would have learned as much as I have and feel capable of taking on a graduate program. I fear pre-registration will mold the younger generations into willy-nilly scientists. Our health depends on stringency and integrity.
I also fear that the mass media has become too influential in science. I regularly read articles that are sensationalized, blown out of proportion for the findings, and non-scientists take it as truth, which makes it even more difficult for new scientists to get support from non-science family members. I fear if pre-registration is required, the media will get ahold of articles before the data comes in and there will be more of a demand to be perfect, to use questionable practices to create the data to support the stories. At that point, all science becomes is another corrupted form of generalization. The integrity, collaboration, and respect will all vanish.
I also fear it will segregate the fields from each other even more. The journals will then have the opportunity to mold their publications to how they see fit, rather than reporting the findings within the field or fields. I believe it will be a cascade effect that will trickle down to the undergraduate who struggles to find anything that makes them happy to study. They instead study what is popular and burn out just a few years in. Without passion in a junior scientist’s heart, graduate school and a long career in research are nearly impossible. The people who have more questions than science can answer would be lost to us and progression of any industry could stall or fail altogether.
There are many things I have yet to learn in both psychology and biology. If I am expected to already know or plan what I will find, how I will find it, and when I will find it, how can I be of any use? Why not just let a non-scientist do my job? How can I feel the thrill of serendipitous results? How can I explore, expand, and follow my passion when I’m being told I can only be important if I know everything already and can make my data fit that? What is the motivation to even work on a reduced internally flawed design when it will already be accepted?
Thank you for reading this. I feel that as an older, yet fresh scientist, I see things a bit differently than my peers or colleagues. Thank you for listening to my opinion and feel free to quote any of the above if desired.
I'm not entirely sure what the problem is. My best guess is that if statistics is used in generating the final result then a hypothesis is tested and, thus, this is hypothesis-driven science. Hence, what we really want to know is whether we are dealing with Type I or Type II errors in publications. E.g. have the authors "fished" for a result and presented a Type I error? Or, have they muffed the experiment systematically (e.g. motion in fMRI), or got unlucky (at the lowest practical noise level) and achieved a (legitimate) Type II error? (Motion in fMRI may be an illegitimate reason for getting a Type II error. It could be just a bad experiment! Like getting a brown sludge instead of white crystals in a chemical preparation.)
I dislike bureaucracy. In my experience, rules beget rule bending and often not a lot more. I would prefer that the culture of science evolve organically. For instance, I don't place more weight on a peer-reviewed publication just because it is peer-reviewed. I can see the benefit of having a peer review, especially if it is a good one, but there is never an exhaustive way to review a manuscript prior to its publication. Post-publication reviews could easily be more informative. Peer review and publication - at any level - are just aspects of scientific practice as currently accepted. If someone were to post a blog article describing a first-rate idea then it is no less valuable to me because it's not peer reviewed nor "published" per convention.
If Cortex is doing pre-reg then we have an experiment ongoing. I suggest we watch that experiment with interest. If others want to run experiments, too, that's great. I don't think we need a shift en masse to pre-reg to evaluate the benefits and costs.
Bad experiments are also a big problem, perhaps the underlying cause of many of the failures to support hypotheses? There is no way to know. I don't see how pre-reg will prohibit sloppy experimental practices, even if the design of the experiment passes muster. It is common in some labs for the PI and senior types to design an experiment and its hypotheses, and for underlings to actually carry it out. Does the person running the scanner understand what "minimal subject motion" really means in practical terms? How does one ensure good experimental technique, whatever that means? Does the approach of 5 pm, or a lecture, cause shortcuts? If it is your own experiment then you treat it differently - at least, you should.
Pre-acquisition peer review may be as useful/useless as pre-publication peer review. I like pre-publication reviews for some things but I also like the idea of post-publication peer review for others. Yet there is no standard way (yet) to do post-pub peer review. Is that bad? Not necessarily. I think the processes will evolve, bottom-up, and that would be my preferred course even if it seems slower and is unsatisfactory today.
Practical matters: why couple pre-reg with public availability of data? The latter issue is fraught all by itself. Why not test pre-reg all by itself? In an ideal experiment there should be a single variable. Making data available publicly, along with full description of essential experimental setup, stim scripts, etc. allows others to test for Type I and II errors post hoc; no pre-reg needed.
I on the whole agree with you. I think that preregistration in some large database (not a journal) (like Pashler's - I don't know enough about it ?? - run by the APS/EPS?) could be helpful in some areas, but it should not be published or refereed - just available to be consulted or quoted in a later paper. No refereeing as, as Geraint points out, this would double the refereeing load, which is already
In neuropsychology, however, pre-registration would be disastrous. Your reference was about single case studies, where you could not pre-register prior to running the study for a whole variety of reasons. But case series/ group studies would also have problems - you cannot be certain how many cases of a particular type the clinical process would turn up. One study I was involved in took 5 years to collect what we then pragmatically decided were enough patients (42 frontals); does one have to publish one's hypotheses 6 years in advance? And one does not normally have massive resources or the multiple centres, so as to be more confident of the eventual numbers, as for a clinical trial.
It seems to me that a more appropriate response to the problems of practice raised by the recent articles by John et al. and LeBel et al. is through this raising of awareness of suspect practices - by authors as well as reviewers (the latter for instance often ask for additional participants - see LeBel et al. for data on this).
First, the model for psychology appears to be derived from the approach taken by clinical trials. Pre-registration for clinical trials seems like a very good idea, because the space of possible tests is necessarily highly constrained by the drugs available for testing, the methods for testing are now clearly established, the trials are each large and expensive - so not easy to replicate or repeat, and there is the strong possibility for vested interests to influence publication. These factors rarely apply in the psychological sciences. It may be useful to observe how experimentation is conducted in other natural sciences for appropriate models.
Second, I have been trying to imagine how to review an experiment pre-registration and determine whether the study would be appropriate for publication in Science or Nature, PNAS, or other top specialised journals like Cognition, regardless of whether the results are null or otherwise. Generally, what makes a paper interesting is a significant result. Of course that adds to pressure to find a significant result, but if it isn't there then no matter how fascinating your hypothesis and how clear and rigorous your methods, I nearly always wouldn't find it very interesting to read. Knowing about null effects works for a highly-populated and constrained area of science, like clinical trials, but does not apply to the cornucopia of experimental psychology. As a reviewer, I now skip to the results section before commencing my review of the introduction. This is because I spent countless hours becoming enthused about a fantastic introduction and method only to then be faced by a set of null results. It may be that I am lacking in imagination at seeing the importance of knowing that for this hypothesis with this design and this participant population there was no significant effect. But I don't think so. As one instance, Newton is more famous for his hypotheses that were supported by data than for those hypotheses that weren't (he was less famous for being a tireless pursuer of alchemy).
Third, if pre-registration becomes the norm, and journals agree to publish regardless of results, I would have thought that the prestigious journals would only agree to a pre-registration provided sufficient power is available to determine with some degree of certainty whether an effect is or is not there. For most psychological studies that would mean requiring a vast number of participants. Of course the statistical modellers would insist this is what we should do (and I agree we should in an ideal, modelling world-inspired paradise) but this would mean our Ns would run into hundreds usually, for studies that currently are published with Ns in the mid '0s. This seems to me to have the result that the rich would get substantially richer - those units that are well-resourced in time and participant payment will be able to conduct these studies. In my lab, I'd be down to one experiment every year or two (I do not mean this as an argument in favour, though my critics might use it for the opposite purpose). This point reflects Gerry's ideas about spontaneity and creativity being suppressed.
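To put rough numbers on the point about Ns running into the hundreds, here is a minimal sketch of a textbook sample-size calculation. It is an illustration, not anything the commenter supplied: it assumes a two-sided, two-group comparison, the normal approximation, alpha = 0.05 and 80% power, and the function name `n_per_group` is made up for the example.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(effect_size_d, alpha=0.05, power=0.80):
    """Approximate participants needed per group (normal approximation).

    effect_size_d is the standardised mean difference (Cohen's d).
    This is an illustrative assumption-laden formula, not an exact t-test
    power calculation.
    """
    z = NormalDist()  # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for a two-sided test
    z_power = z.inv_cdf(power)          # quantile matching the desired power
    return ceil(2 * ((z_alpha + z_power) / effect_size_d) ** 2)


# Conventional "large", "medium" and "small" effect sizes
for d in (0.8, 0.5, 0.2):
    print(f"d = {d}: about {n_per_group(d)} per group")
```

Under these assumptions a medium effect (d = 0.5) needs roughly 63 people per group and a small effect (d = 0.2) nearly 400 per group, which is the order of magnitude the comment has in mind.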
LeBel, E. P., Borsboom, D., Giner-Sorolla, R., Hasselman, F., Peters, K. R., Ratliff, K. A., & Smith, C. T. (in press). PsychDisclosure.org: Grassroot support for reforming reporting standards in psychology. Perspectives on Psychological Science. http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2268189
There is much to admire about the pre-registration idea… in principle and in an ideal world. But in the real world, we often modify our methods, and our stimuli, in response to the review process and more importantly, in response to our own continued thought process. Would we have to submit an "addendum" each time a reviewer suggests something that we actually believe is worth doing? (Ok, as an Editor and author I know that we often think we're doing it just because someone has ungraciously put a hurdle in our way… but in actual fact, the science almost always benefits). Would journals practicing pre-registration not mind that we changed sufficient details in the new study that it no longer 'matched'? Or would reviewers simply no longer suggest improvements to our science because that's no longer in their remit within this new world order? Or would we simply stop the thought process between conceiving of the idea and a rough outline, and actually executing that idea?
The spontaneity of thought, and science, contributes to the creativity that we see published in almost all the major (and not so major) journals. This will stifle good science, and stifle the collective nature of our research (where, perhaps idealistically, I view reviewers as a part of that collective).
So as the Editor of a journal at which peer review is a major part of the scientific process, I have to be against such a proposal.
Ironic that this piece, published in the Guardian, was not subject to the usual review practices that scientific articles are put through. Moreover, as someone who followed the links in the Guardian report, it is misleading (and would not pass peer review!) to say "The journals Attention, Perception & Psychophysics and Perspectives on Psychological Science have launched similar projects" - I found nothing at AP&P but the Perspectives link offers a Replication Report which is a totally different thing: Pre-registering an intention to conduct a replication is a totally different thing from pre-registering a novel scientific investigation!
Most of my thoughts have already been put, but in summary:
1) There is absolutely nothing to stop a scientist from doing a study, registering the method and analyses, and then a couple of months later releasing the whole paper. A whole new type of fraud!
2) Reviews are necessarily subjective and so are editors' responses to reviews. Editors are not robots. At the moment, if someone doesn't think your results are interesting enough, or they don't agree with their theory, or they don't like your interpretation of them, they can quash your study as a reviewer, and editors rarely stop them. This is supposed to prevent that, but reviewers will still dislike results but they will just say it's the interpretation that is wrong.
3) In developmental psychology research there are just SO many ways for a study to go wrong. Generally when we know that none of the kids actually understood the task (or could even start doing it, or did anything other than scream/fall asleep/hide behind Mummy), we don't publish it or even attempt to publish it as a negative result. That would be meaningless - it would not tell us there was a null result, just that we are bad experimenters. (Though occasionally we realise that there was something interesting going on and do a completely different analysis). This would lead to some fields, in particular, having a very large number of retracted studies under this system.
If we do the original analysis, it's a null result, and salvage something additional from the results, it leads to a very cumbersome paper that gives the impression that the original analysis had a null result and something else is going on, when in fact we have no confidence in that null result at all and still wonder if the actual results could be as originally hypothesised, had we done the study properly.
4) This is probably my biggest objection and I think we could do worse than open with this - the system is portrayed as optional, but as "best practice". If something is "best practice", we are going to be seen as doing something wrong if we don't participate.
Pre-reg seems like a fine idea in theory, but you've got to ask whether it will make the science better or worse. What looks good from the perspective of large clinical trials may be counterproductive for small-scale or exploratory science. As the authors say “Study pre-registration doesn't fit all forms of science”. A word or two about why this isn’t a panacea might have been in order. They didn’t do it, so someone else ought to. If nothing else, the extra burden on scientists both as practitioners and as reviewers would be huge. Also, some of the problems raised, such as deciding to stop when things are significant, can be overcome with the right use of statistics (see Kruschke's book/papers here http://www.indiana.edu/~kruschke/DoingBayesianDataAnalysis/), and possibly better training.
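For readers unfamiliar with why "deciding to stop when things are significant" is a problem in the first place, the following is a small simulation sketch. It is only an illustration under stated assumptions: there is no true effect, observations are standard normal with known variance (so a simple z test applies), and the researcher looks at the data after every 10 participants up to 100. The function name and all parameter values are hypothetical.

```python
import random
from math import sqrt


def false_positive_rate(peek, n_sims=2000, batch=10, max_n=100, seed=1):
    """Share of null experiments declared significant at |z| > 1.96.

    peek=False: test once, at max_n.
    peek=True:  test after every batch and stop at the first significant look.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        total, n = 0.0, 0
        significant = False
        while n < max_n:
            # Collect one more batch of data with no true effect (mean 0, sd 1)
            total += sum(rng.gauss(0.0, 1.0) for _ in range(batch))
            n += batch
            # z statistic for the mean of n draws with known sd = 1
            z = total / sqrt(n)
            if (peek or n == max_n) and abs(z) > 1.96:
                significant = True
                break
        hits += significant
    return hits / n_sims


fixed = false_positive_rate(peek=False)
peeking = false_positive_rate(peek=True)
print(f"fixed N: {fixed:.3f}, stop at first significant look: {peeking:.3f}")
```

With a fixed sample size the false-positive rate stays near the nominal 5%, whereas stopping at the first significant look pushes it well above that. This is the problem that sequential or Bayesian approaches of the kind mentioned above are meant to address without pre-registration.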
“Critics have argued that pre-registration is overzealous and will hinder exploration”
“For instance, the registered reports initiative allows authors to report on any aspect of their data”. This is wholly inadequate as a solution.
Pre-reg is almost impossible if you are developing new paradigms where you can't make even an educated guess about power a priori. I recently ran several aborted versions of a memory experiment just to get performance within a sensible range - if ss recall everything or nothing I don't get data and I'm wasting my time, but I'm not making the decision on whether I'm getting the effects of interest. Do I pre-reg every time I change the experimental parameters? A week to run the experiment, 2 months waiting for reviewers to give me permission to change the design? This is a recipe for leaden-footed science. Pre-reg would make it harder, slower and more expensive to do good science, and there's no clear evidence that the overall quality of the science would improve.
“These outlets fear that agreeing to publish papers before seeing the data could lock them into publishing negative results or other findings conventionally regarded as "boring". This is despite the fact that clear-cut negative outcomes can be tremendously informative,”
They can be, but what proportion of null results are “tremendously informative”? We’ll all end up drowning in a sea of nothingness.
“At the same time, journals incentivise bad practice by favouring the publication of results that are considered to be positive, novel, neat and eye-catching.” Is that what the Fanelli paper says? A quick look suggests that the “Hierarchy of sciences” is a matter of power and its interaction with a bias against null results.
“The deeper concern of journals is that pre-registration threatens existing "prestige" hierarchies and could reduce a journal's impact factor – a metric that is arguably meaningless as an indicator of scientific quality and, in fact, predicts the rate of article retractions due to fraud.” Maybe someone can explain the force of this as an argument for pre-reg?
One partial (could be better) attempt to deal with the difficulty of publishing null results is to use something like psychfiledrawer.org where you can record both successful and unsuccessful replications. Another is for reviewers, as part of the normal review process, to check that the experiment had enough power to begin with. See:
Button, K. S., Ioannidis, J. P. A., Mokrysz, C., Nosek, B. A., Flint, J., Robinson, E. S. J., & Munafò, M. R. (2013). Power failure: why small sample size undermines the reliability of neuroscience. Nature Reviews Neuroscience, 14(5), 365–376. doi:10.1038/nrn3475
Christley, R. M. (2010). Power and Error: Increased Risk of False Positive Results in Underpowered Studies. The Open Epidemiology Journal, 3(1), 16–19. doi:10.2174/1874297101003010016
Ingre, M. (2013). Why small low-powered studies are worse than large high-powered studies and how to protect against “trivial” findings in research: Comment on Friston (2012). NeuroImage, 2012–2014. doi:10.1016/j.neuroimage.2013.03.030
Another is to get rid of journals devoted to the publication of sexy results: Bertamini, M., & Munafo, M. R. (2012). Bite-Size Science and Its Undesired Side Effects. Perspectives on Psychological Science, 7(1), 67–71. doi:10.1177/1745691611429353
But, ultimately, Type I error is just something science has to live with.
“If the scientific question and methods are deemed sound, the authors are then offered "in-principle acceptance" of their article, which virtually guarantees publication regardless of how the results turn out.”
Especially if you're developing new approaches, the real problem in a paper is often explaining why the data you've collected really do answer an interesting scientific question. Can we afford to make scientists and reviewers (are they the same?) invest such a huge effort before collecting data that could turn out to be meaningless? It’s bad enough having to argue with editors and reviewers when you know you’ve got solid results under your belt. Who has the right to stop you doing science that your peers might think is pointless?
Imagine that I'm trying to design a study to examine X, which is a hot topic. In a moment of weakness I agree to review a pre-reg proposal and it turns out to be pretty similar to something I'm currently considering doing. That poses a real dilemma. I can't just turn it down because of conflict of interest as I had to read it to discover that. Can I still do my very similar study? Hey, maybe I should just send in a few hundred pre-reg proposals to cover just about anything I can think of?
Pre-reg might fix the HARKing problem though: Kerr, N. L. (1998). HARKing: Hypothesizing after the results are known. Personality and Social Psychology Review, 2, 196–217.
On the way to work I began to think that the strategy might be to begin with their acknowledgement that prereg might not be best for all science and then say that rather than just dismissing all those who disagree as self-serving we should ask when it should be used. When do the costs outweigh the benefits?
Clinical trials: Big investment, high cost of suppressing -ve outcomes. The cost of prereg is relatively small.
Small scale science: For all the reasons we've mentioned, the cost of prereg is high, but where are the benefits? The process of science is ultimately self-correcting - we'll get there in the end, but will we get there faster with prereg? Highly unlikely.
Exploratory science: What would be the point? Collect data, analyse it any way you see fit.
They are really conflating two issues here - public trust in science, and the practice of science.
Sad to say, but the public neither hears about nor cares about small scale science, and by the time it feeds into large scale science not only will many of the problems be ironed out, but then it will be subject to more stringent control such as prereg.
Overall this is a recipe for slow and more costly science.
As others have noted in replies to you, pre-registration of all experimentation with journals would simply grind science to a halt. I quite like the idea of forcing Nature to publish all my undergrad projects, but I doubt they will.
The problem is more with blind reliance on null-hypothesis significance testing. Reduce use of that and we would live in a better world.
If we seriously want to tackle the file drawer and falsification of data problems, a form of 'lab book' recording would be better, where we rely on each researcher to record who they have collected data from, in which 'experiment'. No-one need approve or accept an experiment, and the data needn't even be there. It has always bothered me that psychologists do not have to do what biologists, chemists and other lab-based scientists have to do (or maybe they don't now either). Of course lab books can be falsified, but if they were maintained online in a single system (BPS and APA maybe?) it would be harder.
Anyone wanting to know whether, for example, anyone else had replicated Experiment 4 from May et al (2010) could search for that study in the system and get a list of all experiments created mentioning that as a precursor. A journal reviewer wanting to find out how many times I had had to run that study as an undergrad project before it 'worked' could check out my online log book. And think of the case for impact! 'My classic study on doodling has been unsuccessfully replicated 21,942 times...'
The way I see it, pre-registration is attempting to solve the p-value side of science. Currently, the incentives in academic publishing bias one to not publish null-results and to cherry pick results from a larger array of tests. This is problematic because it obscures the landscape of "significance": null-results will be underreported and false positives are more likely to be reported.
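The cherry-picking point can be made concrete with a one-line calculation. This is a sketch under a simplifying assumption that the commenter does not state: the tests are independent and every null hypothesis is true. The function name is made up for the example.

```python
def familywise_error(n_tests, alpha=0.05):
    """Chance of at least one false positive among n_tests independent
    tests when every null hypothesis is true (illustrative assumption)."""
    return 1 - (1 - alpha) ** n_tests


for k in (1, 5, 20):
    print(f"{k:>2} tests: P(at least one p < 0.05) = {familywise_error(k):.2f}")
```

Under these assumptions, running 20 tests and reporting whichever one comes out significant gives about a 64% chance of having something to report even when nothing is there, which is why the reported p-value alone says little without knowing how many tests were run.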
While I think that pre-registration would in theory help bring transparency to the interpretation of p-values, I'm skeptical that it would actually help for a number of reasons.
First, if the current system creates incentives to withhold or selectively report results, we need to examine what incentives pre-registration would create. It seems to me that --in the extreme-- it would create incentives for researchers to run experiments, obtain results, and then retroactively submit a pre-registration for the experiments. The results would appear to be even more reliable than in the current system since they ostensibly would have been reported either way. Even if one didn't go to that extreme, how much piloting would be acceptable before pre-registration? How much can one explore before deciding what an experimental question would be?
Second, I whole-heartedly agree with what other people have said which is that it will over-burden the review system. Submitting something for review would be much easier with pre-registration: one simply needs to write up what one would like to do rather than what one has actually done. The first thing a new researcher would do is submit as many pre-registrations in their first year as possible. This would clearly generate substantially more items to review than under the current system. And what would the review process look like? Since one is evaluating the proposed experimental design, would you send a proposal back if you had a problem with the stimuli (revise and resubmit)? How involved in the design of an experiment would a reviewer need to be?
Lastly, I'm concerned that pre-registration would trade one sort of clarity for another. Assuming that people don't abuse the system in the way I described above, I think we would in fact have a better handle on which results are robust in our field since we would know which results are replicable, etc. But, it would come at the cost of theoretical clarity of a different sort. In the best scenario right now, researchers do their best to develop accurate theories by testing theories and interpreting their data. A researcher would follow up on her significant results to ensure that they are not spurious, and would develop an interpretation that would integrate with existing theories, etc. My concern is that pre-registration would shift the emphasis away from theory building and towards data collection since having a coherent interpretation of the data wouldn't be required for publishing. I may be testing theories A and B, get a result that doesn't fit with either, but since I preregistered, I could be done, simply publishing the results as is. It's true that perhaps I could feel confident that the p-values accurately reflect the probability that the results were obtained by chance, but I may not have anything intelligent to say about them. My feelings are that under the current system, one would need to follow up on those results, do more testing to try to understand them since publication is based (in large part) on the story you tell. This may create, as Chambers et al. say, a bias to be "novel, neat and eye-catching" but I worry that pre-registration might lead to a bias towards "these are the data, who knows if they're interpretable, there you go".
I agree with what others in this thread have said, which is that it may be better to retain the current format but encourage null-result reporting. Perhaps journals could have a section where one reports pilot/supplementary studies that were conducted as part of the investigation but that didn't make it into the paper. Perhaps included as an online supplement, this could allow researchers to package data and interpretations as we have been doing for so long, while at the same time providing experimental context. This context should be expected by the research culture --most research doesn't come out of thin air, so the expectation would be that most articles would have accompanying studies, etc. I believe that these sorts of systems would help clarify what work has been done surrounding a result while retaining the incentives towards theory building.
May I suggest that it might be most useful to propose a composed counter letter to the Guardian which all the people in the email list could supply a signature for, covering the main points of disagreement? I obviously think this an important subject and am appalled that they didn't immediately ask you to write one to include as a counterpart to their current piece.
I worry about the potential emergence of 'high priests' of 'proper science' who could quite easily come to have unhealthy control over how science should be done and who should be allowed to do it, whilst actually not necessarily preventing fraud. Were the proposed motion to occur, would we get a substantial curb on scientific creativity and originality?
Another worrying element here is how this issue is raised in the media first, rather than being adequately debated within the scientific community before media pronouncements are made. The media have unsurprisingly jumped on this, as it makes a juicy story to imply that if scientists do not sign up for this motion, it is likely to be because they have a bunch of dodgy methods to hide. When you add to this that not every scientist holds equal sway in the media, this way of conducting the debate seems doubly unhelpful.
I forwarded to Ben… he made the good point that student projects (of which I have published several) would essentially be unpublishable, because you would never preregister all student projects, not least because of a) numbers and b) half the challenge for the students is working out exactly what studies they want to run and why. I can see the objection: that you have to set things in stone at some point, not least for ethics, but then it all comes down again to the relative timings.
I agree that the peer review system, and indeed the whole way we report science, needs periodic updating. This is particularly true now that science has become a massive enterprise, far removed from its profile when peer reviewed journals were originally set up. As part of the expansion of science (and higher education more generally), I do think there has been an increase in science "careerism" - the attraction of people to science who are more concerned with advancing their careers than with advancing science. As with any other highly competitive sector, this inevitably leads to "gaming" of the system, in the ways it always has in other fields: e.g. cosy relationships between editors, reviewers and authors, misrepresenting of methods and results, and yes, even outright fraud. Coupled to this are newer ways to gain advantage, such as extensive PR efforts on social media to big-up certain results (or put down others).
So, yes, I think there is a need to reassess our whole publication system. Pre-registration might be one way of fixing some of the problems. But as you say, Sophie, the pre-registration folks seem to be of the opinion that there is one correct way to do science (with a partial concession to a second, more exploratory mode), when the reality is not that clear-cut. And it never has been. So I am somewhat reticent to support what seems to be an overly rigid interpretation of how science should be done.
The thing is, I think most of the problems that pre-registration seems to be addressing can be summed up by "clarity and honesty" in reporting science. Indeed, the whole basis of pushing for pre-registration seems to be that scientists cannot be trusted. If that's the case (and I do think it is the case quite a bit of the time), then I don't think bringing in a bunch of new rules is going to fix things. If people are being dishonest now, they will only find new ways of being dishonest in the future.
But I don't think the primary problem is, in fact, dishonesty. And I don't think the solution is imposing an even more rigid set of "rules" on how to do, and report science. Rather, I think many of the current problems stem from an overly rigid set of publication "rules" (written or unwritten) that are applied by editors and reviewers during the publication process [Some of that perhaps originates in the way science is taught in Psychology - as almost exclusively a matter of hypothesis testing]. In any case, my contention is that because editors and reviewers insist on authors "following the rules" by presenting their study as a test of an a priori hypothesis, as opposed to a serendipitous finding, or the result of a careful exploration, or the re-interpretation of results in light of new evidence or new thinking, they put immense pressure on academics to change their story to fit the rules. That's the bigger problem in my opinion.
So for me, the solution is almost opposite, in some ways, to the pre-registration proposal. I would like to see more editors of journals show support for the different types of scientific research, as opposed to the almost exclusive focus on hypothesis testing (and null hypothesis testing at that).
In addition, there is a need to educate students, postdocs and academics much more clearly and comprehensively about what is acceptable and not acceptable in science. Many of the problems mentioned in the calls for reform are probably the result of ignorance rather than dishonesty. For example, adding more participants when your study narrowly misses significance. I'm not convinced that the majority of scientists understand why this is a bad thing when using standard null hypothesis testing. Nor am I convinced that the majority of scientists understand that it's totally legitimate when using other types of statistical tests (such as Bayesian statistics).
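To make the point about adding participants concrete, here is a minimal simulation sketch in Python. It is an illustration only, not something from the original comment, and it rests on stated assumptions: the true effect is zero, the data are normal with a known standard deviation of 1 (so a simple z-test applies), and participants are added in batches of 10 after every non-significant test, up to a ceiling of 100. That is a simplified version of "topping up" a study that narrowly misses significance; all names and numbers are hypothetical.

```python
import math
import random


def p_value(sample):
    """Two-sided z-test p-value for a mean of 0, assuming a known SD of 1."""
    n = len(sample)
    z = (sum(sample) / n) * math.sqrt(n)
    return math.erfc(abs(z) / math.sqrt(2))


def false_positive_rate(n_start, n_max, step, n_sims, rng, alpha=0.05):
    """Share of simulated null studies that end up 'significant'.

    The true effect is zero in every simulated study. After each
    non-significant test, `step` more participants are added and the
    test is rerun, until the sample size reaches `n_max`.
    """
    hits = 0
    for _ in range(n_sims):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n_start)]
        while True:
            if p_value(sample) < alpha:
                hits += 1
                break
            if len(sample) >= n_max:
                break
            sample.extend(rng.gauss(0.0, 1.0) for _ in range(step))
    return hits / n_sims


rng = random.Random(2013)  # fixed seed so the run is repeatable
fixed = false_positive_rate(20, 20, 10, 4000, rng)       # one test at a fixed n
topped_up = false_positive_rate(20, 100, 10, 4000, rng)  # keep adding and retesting
print(f"fixed n: {fixed:.3f}   adding participants: {topped_up:.3f}")
```

Under these assumptions the fixed-sample rate should sit near the nominal 5%, while the retest-after-adding rate should come out well above it, because each extra look at the data is another chance for a fluke to cross the threshold. The sketch says nothing about Bayesian analyses, which would need a separate treatment.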
- to address the multiple comparisons problem, there has long since been a massive culture shift among people who do Genome Wide Association Studies - according to Wikipedia, it is standard practice to do a first analysis in a discovery cohort, and then validate the significant SNPs in an independent validation cohort. A GWAS cannot get published these days in any decent journal without this step. That came from within the field, which determined, through its own experience, that such a policy was necessary.
It should be left to specific fields to develop their own innovations and checks to improve the quality of specific methods in their field. If the population geneticists want something, their field's journals will require it, and they as individuals will demand it when reviewing for broader interest journals.
1. The very real threat of IP theft is likely to get worse under a pre-registration system. Currently the time-lag between taking an idea and collecting the data to support it is our only protection from IP theft of this kind.
2. A lot of my research is funded by who require sight of publications prior to submission, reasonably enough. Pre-registration simply won't work in such situations, because they will not permit publication of a promissory note.
3. Pre-registration assumes that we all run controlled experiments with neat factorial designs. Most of my work begins as ethnographic studies that capture large datasets which are then analysed qualitatively. They may end up as a quantitative test of a hypothesis, but they don't start this way. A pre-registration process could not work for these kinds of studies, and would only enhance the anti-applied, and anti-qualitative research biases that some of us think are already inhibiting our ability to publish.
Greig de Zubicaray
The prereg proposal is a classic example of micromanaging, and last I looked this was almost universally viewed as counterproductive - except by the micromanagers.
Re specific examples of problems with prereg: We can all agree we want the failed replications or falsifications published with priority so we can get on with investigating real stuff… it doesn't make sense to give equal priority to unlimited replication attempts. If preregistration means that 'yet-another-replication-of-an-established-effect' (YAROAEE for short) gets equal priority, then we needn't try anything new as our publication track records are assured.
As a supposed 'solution' to the problem of failure to report null results, the pre-registration approach brings up many more practical problems. But my fundamental issue with the approach is that I don't agree that all that matters in science is the question. Some results are also more interesting than other results! And not all experiments are carried out, analysed and reported equally well and that matters too. So questions are interesting, but so are results!
More broadly, the question of reviewer overload is one that appears to be rarely considered in many of these semi-utopian approaches. Pre-registration sets up a strong incentive to submit as many ideas/experiments as possible to as many high impact factor journals as possible. Indeed, it would make sense to pre-register my ideas with Nature and, if and only if they are pre-accepted, then write a grant to conduct that experiment. After all, I can then show not only preliminary data but a guaranteed Nature paper to the appropriate funding panel. This approach has the potential to significantly increase reviewer load because I'm sure most scientists can generate far more ideas than actual experiments. The system is already creaking at the moment and I think the practicalities have barely been thought of for such a huge increase in workload.
But my primary objection is conceptual - it's not the right way to do science. Encouraging publication of null results and replications would be a better way to go.
1) I worry that, erroneously, people will think that a priori results are more "truthful" than post-hoc. This is not the case if the statistics are done correctly.
2) The reason it could be important is that there is a suspicion that scientists currently pretend ideas are a priori when in fact they are post-hoc. This is a problem that does need some form of solution. The current solution is that we let others try and replicate and non-replicable results disappear. I have no problem with this solution and it is the best we have and I think better than pre-reg.
3) What I think I do not like is that it is saying that we should not trust each other as scientists. I find this quite depressing, even if there is some truth in it. I also feel that this is the thin end of the wedge - why trust that people have collected the data? I think we need better education and more trust.
4) Practically I have no idea how it can work. Who judges the pre-reg? They would have to have so many levels of expertise in each field to know that all the degrees of freedom of any analysis have been pre-registered. A good example of a problem is the number of subjects for fMRI. Some would say we need >20 subjects, perhaps more; others, like Karl, would argue that 16 is more than sufficient and indeed one may start reporting false negatives if the number increases. How will the power of an fMRI study be assessed in pre-reg? Who polices the pre-reg? Will they be anonymous or not?
5) Also one can think of examples where it will fail. I pre-reg to run 20 subjects. Do I have to pre-reg all exclusion criteria in advance? I guess I would have to for pre-reg to be valid. Then one subject has different behaviour but has a clear reason to be excluded but one I did not specify. What do I do? If I run more subjects then it will look suspicious as I have not stuck to pre-reg. But if I do not exclude the subject then significant results will be reported as null.
6) I dislike the fact that if you decide not to publish, the 'paper' is published as a retraction. This is a very loaded word.
7) I worry that it will be used only by people who want to find non-replications of studies that they do not like. I confess I have already thought of using it like this! Pre-reg a replication study. Fail to replicate: fairly easy to do if you want to. Kill a study in the field because your study has more "truth".
One obvious finding that would never have been found with pre reg is mirror neurons. Having noticed the discharge initially by chance, as the story goes, under pre reg they would have had to down tools, write a pre reg paper, wait for responses and only then continue recording.
I would also say that if this is a cultural problem (and I'm not sure it is - scientists are people and there will always be cheats and corner cutters) it needs a cultural solution not an administrative one. The pre registration position seems to be predicated on a limited view of how science proceeds.
As you note, the pre registration system would simply move the goalposts: who would fail to write a good proposal of what one is GOING to do? What would be the motivation of reviewers to raise a bar of any kind? How would important scientific factors such as elegance, insight, the unexpected, the weight of the question (this is a judgement call that is really the most important in the review process) and the quality of analysis and interpretation be assessed before the experiment? And what about how experiments are really done? Day 1 "we have a great idea, we've done the pre reg and...", Day 2 "holy shit, I didn't expect that, I'm going to pursue it..." Do we do a pre reg every time we have an idea? We'd spend all our lives preregistering ideas.
The idea, worst of all, shows a deep naivety about the philosophy of science. It is almost our job to be wrong...but we have a duty to be interestingly wrong. The best most of us can hope for is to be interestingly wrong. Worse (I know I said the last thing was the worst), the preregistration idea seems to weigh all experiments equally. One reason I am not too excited about the recent frauds (apart from the fact that I simply find them sociologically and psychologically fascinating) is that many of them occur in fields where the stakes are low. Not all experiments are carried out, analyzed and reported equally well (even though I am sure they will all be preregistered equally well). We can only assess this quality after the fact.
The prereg idea also risks sounding a little patronizing. As you point out, you write grants and make project presentations - you have filters among your peers. I'm not sure why a poorly motivated reviewer would be a vehicle for improvement over and above this. The review system is already creaking. Really creaking. If anyone asks me to review a pre reg I will have to say no. I have my plate full with reviews of experiments that have already been done.
We haven’t got a special problem. I have done a lot of research on fraud (another book I never wrote). Physics and medicine are full of it. But it will out. Jan Hendrik Schön fooled Nobel Prize-winning physicists with his 8 Science, 7 Nature and 6 Physical Review papers on semiconductors (all retracted). Does anyone seriously think that preregistering his experiments would have led to averting the fraud? Nobel Prize winners in the world's two top journals and a top physics journal couldn't stop it, so what chance would a no mark journal have when we are already busting at the seams with review work?
It's hard to think of a comparison to illustrate the problem of the pre reg idea, but maybe we can think about it as similar to asking people whether they have any intention of committing a shoplifting crime in the near future. My guess is that not only would most people say "no", but some of the biggest thieves would be the most moralistic.
Hubel and Wiesel's orientation sensitive neurons would be top of mine.
Observational studies of neuropsych patients would never get off the ground.
It's even ridiculous to suggest that after running an experiment for a year, one has the same hypothesis as when one began the experiment - the reason I think is in order to change my mind.
My main concerns are:
- I know they are only proposing that some journals work like this but if this model were to spread, the journals, **private companies**, would have even more say over what science was conducted. This is so appalling that it is not worth thinking about.
- some fairly major discoveries have emerged from exploratory, non-hypothesis driven work and from serendipity. I agree with you and James Kilner that there is nothing wrong with post hoc interpretations as long as it is good science.
- I agree that we need to find a way to publish important null results but just like positive findings, not all null results are equally interesting. This approach would just bring random null results into the arena which doesn't solve the problem.
The pre-reg model comes from RCTs in medicine. There, pre-reg has seemed appropriate, even necessary, largely because of cultural factors. For example, it ensures that knowledge with wide societal benefit cannot be suppressed by commercial interests, and it ensures protection of the public by insisting that medical interventions be rigorously and correctly tested. Pre-reg may also have a role in some other fields, such as psychology, because it could partly address traditional problems such as low power, publication bias, and poor methodology.
But a large part of scientific activity is rather different from RCTs. Much of science uses different processes from RCTs. Much of science takes place in a different cultural environment from RCTs. Much of science has different aims from RCTs. In particular, we must remember the creative and generative aspects of science, and we must acknowledge their importance and their role. The current cultural climate rightly emphasises strong integration between scientific innovation and business. I suspect our colleagues in business would be absolutely aghast if we voluntarily threw away our system for fertilising and incubating ideas. For example, RCTs are often the end-result of a process of developing a new intervention, such as a drug or a therapy. The first steps of development may look very different from RCT, but they are nonetheless essential for progress.
Pre-reg may help to provide rigorous answers to questions of important societal concern, like "Does intervention X work?". But, as scientists, we also have to think outside the box. We need ways of working which allow interventions Y and Z, which may not yet have even been thought of, to see the light of day.
While trust in science might be improved by pre-reg of some studies, the overall ability of scientists to contribute to humanity would be reduced if pre-reg became a prerequisite of valid scientific work.
Is this something we should be discussing at conferences / meetings as well as via the newspapers? Maybe downstream there is a symposium to be had (at EPS for example) where this could be discussed as well?
I was very interested to read your response to Chris' statement. I have followed these discussions about the state of science fairly closely in the past year or so and while some proposals may be worthwhile (and the people making them certainly have their hearts in the right place) I agree with you that these revolutionary measures may be quite short-sighted and have potential to cause far more damage than what they fix. I am not opposed to trying to improve matters but I think this has to be discussed and thought through very carefully and I'm not convinced that people are doing so.
A few months ago I tried to engage with Chris about this on his blog a bit but that was mainly playing devil's advocate. It hadn't resulted in me being very convinced either way. I will think if I can give you some more thoughts to consider for your response. I certainly agree with your last point most: pre-registration and assured publication will actually remove the incentive to do excellent science and seek a greater understanding of your data.
I very much support your views. I am sorry to hear that your letter was not published in the Guardian.
Please count me in!
My reasons for signing [the original pre-reg letter] were first that I think a pre-reg option should definitely be allowed by PLoS ONE (which is the only editorial board I am on), seeing as it clearly fits with their remit to publish any technically-sound research drawing appropriate conclusions. Second I think that there deserve to be more outlets welcoming pre-reg submissions if only as an empirical test to see how it works out as a publication model. I have some sympathy with Chris' position that by reducing the flexibility allowed in pre-registered manuscripts this would disadvantage scientists taking this route compared with the traditional approach. But I think it's clearly absurd to demand that all journals follow this model.
As you point out, anyone who has read any 20th century philosophy of science will know that the Popperian description of the scientific method set out in the first paragraph of Chris' piece is totally unrepresentative of science as it is actually conducted, and it's not necessarily desirable. I think it's particularly worrying in a young field like ours to take this approach, where many of the background assumptions, especially methodological ones, will turn out to be false. My own take is that this worry needs to be balanced against the familiar arguments on the other side (file drawer etc.) and it's worth pursuing pre-reg as an alternative model for at least a small proportion of outlets, and see where it takes us.
In terms of your question about specific examples of problems of pre-registration, there is obviously the issue of fortuitous discoveries (like Hubel and Wiesel accidentally discovering edge-sensitive cells when they loaded slides into their projector, if I remember right). I think a bigger problem, related to your point about incentivizing false negatives, is that it encourages scientific progress as an accumulation of disconnected small observations, and discourages cognitive theorising. Of course, cognitive theories often come about when we try to reconcile the results we've obtained with what we were expecting. There's less incentive to do this if the paper has already been accepted before we've collected any data. So I think that moving to a substantially pre-reg system might lead to a decognitivising of cognitive neuroscience with a theory-free accumulation of small observations, like the field saw in the early days of functional imaging. I think the big problems facing cognitive neuroscience today are probably ones of overarching cognitive theory, rather than a "box-ticking" filling in of small details, so moving to a substantially pre-reg system might be a retrograde step. Or it might not! Looking forward to hearing more about the responses you receive, and do let me know if I can help with anything.
I have mixed opinions about pre-registering studies in general but I don't like the proposal from Cortex at all. Being locked into a journal before you conduct a study is just not good, for several reasons which you list nicely. First of all, I'm not going to let any reviewer stop me from running a study that is important to me. All they'd be doing is stopping me from publishing that study in that particular journal -- and they do that already, but based on far more information (namely the whole paper, data and all). Second, I'm not going to be locked into a particular journal -- as you say, I'll write it up depending on the results, who they might be most interesting for, etc. Finally, the issue of incentivising bad analyses is a very real one. If you're a young researcher and you get your good idea study pre-accepted based on the question and design, then it's just more efficient to do a quick, sloppy analysis and damn the results -- after all, who cares? The paper was already accepted. Time to move on to the next one.
To my mind, those are all problems with Cortex's pre-reg idea. I have less of an issue with a generic pre-reg database where, if you got negative results (that you believed in), and wanted to write them up, you could point the journal editor to your registration thingy to show you went through the process legitimately and maybe that you're not the only one. To my mind, that doesn't mean changing the journals much at all - it just means individual editors have the opportunity to be more flexible with publishing null results, and I think that's generally a good option to have.
Finally, I am worried about the yin and yang of pre-reg vs. exploratory analyses. I often see younger researchers (or theoreticians) claim that if you asked the question right and know the "correct" analysis a priori, then you just collect the data, apply the analysis and voila -- there's your answer. That's total rubbish and anyone with sufficient experience collecting real-world data knows that. How many times have you looked at someone's (a student's, for instance) analysis of a data set and said, "Nope, something is wrong there." You might not know what initially, but you've seen too much data to believe it. And then that forces you to go back and look in detail at the data and lo and behold, you find something. Even when you don't, the process of looking oh so carefully leads to a substantially better understanding of the data. In other words, just applying a data analysis algorithm is the start of the analysis, not the end. But for people without such experience, it can look as if the researcher is trolling simply because the findings don't fit their (pre-conceived) ideas. Experience is too valuable to be ignored and this can't be confused with bad practice -- in fact, it is most emphatically not.
Benjamin de Haas
- I share the concern about the naive authoritarian ethos reflected in the idea - as if individual scientists were prone to failure but a control system was less. Yes, scientists are prone to fads, funders' tastes and the predominant ideology of the day. But how exactly would that get better by hedging your bets on even fewer people (i.e. editors)?
- Part of the argument for pre-registration seems to go as follows:
a) (High impact) editors choose papers based on whether they contain interesting results.
b) Scientists can do a good job but obtain boring results.
c) Scientists are hired based on the number of (high impact) papers they publish.
d) This is unfair and invites people to do anything from massaging data to outright fraud. Therefore we need to change a). Editors should not choose papers based on their
This seems to be saying editors have the job of judging researcher quality, not interestingness of findings. I have no clue whether b) is a valid concern on average and in the long run (being early career I hope not!). But even if it was, why not change c) then? Hiring committees have the job to evaluate scientists, editors do not.
Also, the fact that editors function as a filter for interesting *findings* is very valuable in my opinion. High impact journals contain findings that are interesting to a broad audience. That's why more people have e-tocs for Nature than for Journal of Highly Specialised & Boring Crap (JHSBC). Pre-registration would get rid of this altogether. Some would claim we don't need this filter any longer because search engines, peer ratings and commentaries would do the job. I'd say the burden of proof is on them. So far this doesn't seem to work (the median number of comments for PLoS One papers is zero).
- Finally, pre-registration could result in more and thinner papers, not less of them. It would encourage publishing salami-style. We currently do an fMRI study of a perceptual effect we found in a psychophysics experiment. The plan is to publish the psychophysics and the scanning results together. In a pre-reg. scenario we would never have done this. We didn't know whether the hypothesised effect exists in the first place, so the (costly) fMRI was always conditional on the (less costly) psychophysics. With pre-registration we would have had to register the psychophysics in isolation and then (depending on the results) register a second study for the scanning. This would have slowed the whole thing down massively and it would have resulted in twice the number of papers for the same amount of results.
I think the idea of pre-reg on a large scale would be problematic except for clinical-trial-type studies, for which I think it makes sense. Most research, though, is much more exploratory and the best work had a degree of serendipity. How would one deal with new ideas based on experience in collecting data, or with a surprising finding that could lead to much more important experiments that would deviate from any pre-reg? So far they only offer the possibility of doing unexpected data analysis. What if a more interesting follow-up experiment comes up -- wait a few months for new approval to run? How would this impact a 3-year PhD given likely turnaround times, especially if the first pre-reg gets rejected as not being "important" enough, and one has to shop around for another journal? Also, your point that it should be studied and empirically assessed first makes sense.
(The authors have produced other interesting analyses though :) such as this forthcoming paper: http://www.frontiersin.org/Human_Neuroscience/10.3389/fnhum.2013.00291/abstract that shows how Curr Biol raised its IF by reducing the number of articles it published post-hoc (marked as a Suppl Fig 1).... and also the strong relationship between IF and number of retractions: Fig 1D. I've met Chris and Marcus and find them well-meaning, however I think they are trying to go too broad with a solution to something that might not be a problem -- I think there are other greater issues for the field that pre-reg would not address.)
You make many cogent arguments about points the pre-registration proponents haven't considered in depth.
Also, the Guardian post says "Study pre-registration doesn't fit all forms of science, and it isn't a cure-all for scientific publishing." --- Which forms of science does it fit, and which doesn't it fit? They don't specify, as you noted. Because of rapid technical advances, is neuroimaging excluded? This would seem to be one of the worst data massaging offenders, yet if you prevent people from changing acquisition protocols or adapting to improved analytic methods, you're not promoting the best possible science. What will be the average time from submission to acceptance to starting (and finishing) the experiment?
Another of your examples, neuropsychological case studies, is particularly difficult. Are you not supposed to test the rare individual with hemi-prosopagnosia or a unique form of synesthesia? Many aging and developmental studies could be problematic too. What if your elderly group is no better than chance in a memory test that undergrads could do at 80% accuracy? Maybe your small pilot sample of elderly were very high performers and not representative? Obviously, being locked into publishing such a study would set you back the time it would take to make the task easier and re-run the experiment. You could even say in the new paper that you ran the experiment with 500 items in the study list and the elderly were no better than chance. Who's to say that a reviewer would have caught that error in advance?
I think there's a distinct lack of risk-taking among the strongest proponents. What's to prevent them from pre-registering studies in public databases or on their own blogs (without formal peer review but perhaps soliciting comments)? Jona Sassenhagen has already done this, why haven't more followed suit?
BMC Study Protocols has been around for a while too:
I'm not at all opposed to pre-registration, and I think it'll be an interesting experiment to see whether research practices improve and "scientific quality," or replicability, increases. But I can see the danger in that being viewed as "saintly" research with the rest of it tainted. I think it's important that you've opened up this debate.
I'm concerned about the prescriptive, apparently constraining nature of some of the proposals. I certainly don't support mandatory pre-registration, as it simply isn't appropriate for much psychology research, including much that I do - for the reasons you've described. I also strongly agree with you that it runs the risk of merely reinforcing the (in my view, incorrect) perception that psychology is fundamentally flawed and, indeed, the equally (in my view) problematic perspective that there can ever be a "true" study. I wrote a little about this last year:
To paraphrase what I wrote then and build on it with regard to pre-reg, I can see how for some kinds of science (e.g., intervention studies), the "truthiness" of an individual study (whatever that might mean - it's often confused with control of Type I error only) might be of paramount importance, and pre-reg might be appropriate in the way that it is for clinical trials. But for the other kind of science, which for psychology and cognitive neuroscience may well represent the majority, in which people are concerned with e.g. exploring the contribution of different brain regions to a particular cognitive ability, no single study is considered definitive or "the answer". Instead each provides a clue or piece of the jigsaw that contributes to progressively advancing knowledge about the problem. I take each study that I read with a certain pinch of salt, the size of that pinch determined by a whole number of factors that relate to the quality of the research. Those factors might include things like reputation of the group doing the research, but often more important are issues about how well the research has been done, like the ingeniousness of the paradigm, the size of the observed effects, whether appropriate corrections have been used, the care with which the authors consider alternative possible interpretations, etc etc. None of these is mandatory (e.g., unlike some, I'm quite happy not to condemn out of hand uncorrected fMRI stats thresholds), but the more boxes like these that are ticked, the smaller the pinch of salt I might take. A difficulty with this, I suppose, is that these "quality heuristics" are subjective, largely individual to each reader, and depend quite a bit on how much experience someone has of the research area.
I'm not a fan of generic, so-called "standards" of reporting which often seem mainly to constrain the scientific process. As I see it, forcing any more mandatory constraints on psychology/cognitive neuroscience than already exist is undesirable. However, I can see possible benefits to the introduction of *the option* to pre-register because for some studies it might constitute one additional possible means of strengthening a paper; an addition to my collection of heuristics for things I might look for in determining the size of my pinch of salt. In practice, I suspect it would rank much lower on my heuristics list than other quality factors. For what it's worth, I don't think even Chris Chambers advocates *mandatory* pre-registration, although I suppose there's a "thin end of the wedge" argument that optional could end up leading to mandatory.