Post date: Oct 2, 2013 5:46:26 AM
You would like to analyze your data while data collection is in progress? Be my guest. You’d like to continue data collection if an effect is not statistically significant? I see no problem there. You’d even like to drop a condition, you say? Well, why not? As long as you do it right.
Psychologists have been warned about the effect that collecting additional observations after looking at your data can have on the Type 1 error rate, i.e., the likelihood of false positives. Note that I put the word 'can' in that sentence: flexibility doesn’t have to increase Type 1 error rates, as long as you keep them under control. Sounds straightforward, right?
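To see why uncorrected peeking is a problem in the first place, here is a minimal simulation sketch (my own illustration, not from any published analysis): under a true null effect, a researcher peeks after every 10 participants and stops as soon as p < .05. The long-run false positive rate ends up far above the nominal 5%.

```python
# Minimal sketch: how uncorrected optional stopping inflates the Type 1
# error rate. A one-sample t-test against zero is run when the true mean
# IS zero, peeking after every batch of 10 participants and stopping at
# the first p < .05. (Illustration only; batch size and max n are made up.)
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
n_sims, batch, max_n, alpha = 10_000, 10, 100, .05

false_positives = 0
for _ in range(n_sims):
    data = rng.normal(0, 1, max_n)  # H0 is true: population mean = 0
    for n in range(batch, max_n + 1, batch):
        if stats.ttest_1samp(data[:n], 0).pvalue < alpha:
            false_positives += 1    # stopped early on a false positive
            break

print(f"Type 1 error rate: {false_positives / n_sims:.3f}")  # well above .05
```

Sequential analyses fix this not by forbidding the peeks, but by lowering the alpha level at each look so that the overall error rate stays controlled.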
You might be thinking: sure, but how? Other disciplines asked the same question a long, long time ago. In clinical trials in medical science, it is considered unethical not to look at the data while data collection is in progress. You know, in case people die, and stuff. Therefore, statisticians developed techniques known as sequential analyses.
These statistical procedures are widely used in medical research, but rarely, if ever, in psychology. The procedure is rather simple: you plan a maximum sample size, and analyze the data at one or more interim analyses along the way. After an interim analysis, you can do a range of things. You can stop data collection if the evidence for your hypothesis is convincing. Early during data collection, there is a lot of variation in the data, and thus the alpha level for early looks in sequential analyses is set pretty low (e.g., .001 - because really, what's the use of publishing a study with a small sample size at the normal p < .05 level? We would know nothing, John Snow!). At the final analysis, it is possible to test your prediction at an alpha level very, very close to .05. At an interim analysis, you can also look at the effect size, calculate conditional power, and judge whether it makes sense to continue, or whether continuing data collection is futile, based on the data observed so far. Sounds more efficient than guesstimating a sample size based on an a priori power analysis, and not looking at the data until all 178 participants are collected, no?
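To give a feel for how the overall alpha gets divided over the looks, here is a small sketch (my own illustration, not from the article) of an O'Brien-Fleming-type alpha spending function evaluated at four equally spaced interim analyses. Early looks get only a tiny slice of the total Type 1 error rate; the final look is tested at a level close to .05.

```python
# Sketch: O'Brien-Fleming-type alpha spending function,
#   alpha*(t) = 2 * (1 - Phi(z_{alpha/2} / sqrt(t))),
# where t is the fraction of the planned sample collected so far.
from scipy.stats import norm

alpha = .05
z = norm.ppf(1 - alpha / 2)                 # 1.96 for alpha = .05

spent_so_far = 0.0
for t in (0.25, 0.50, 0.75, 1.00):          # four equally spaced looks
    cumulative = 2 * (1 - norm.cdf(z / t ** 0.5))
    print(f"look at t={t:.2f}: cumulative alpha = {cumulative:.5f}, "
          f"spent at this look = {cumulative - spent_so_far:.5f}")
    spent_so_far = cumulative
```

Note that the spending function only says how much alpha is available at each look; the exact critical value per look also depends on the correlation between looks, which dedicated software (e.g., the R packages gsDesign or GroupSeq) computes for you.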
Even more useful is an adaptive design, where you increase the planned sample size based on the effect size observed in an internal pilot study. Yes, that’s right: an internal pilot study is just like a normal pilot study, except you don’t have to throw away the data when you are done. This is useful when you don’t have an accurate effect size estimate (which, in psychology, is practically always). For example, if you perform a replication study, you might be uncertain whether an effect even exists. So how many participants should you collect? Simonsohn (2013) suggests collecting 2.5 * n participants, but this could still yield an inconclusive replication study, where the effect is not significant, but also not reliably smaller than the smallest effect size deemed important. In adaptive designs, replications can always be conclusive, because data collection can be continued until a statistically significant result has been observed, until the effect size estimate is reliably lower than a minimum value deemed important, or until the 95% confidence interval around the effect size has a desired width, even if it still includes zero.
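As a rough sketch of the core idea (my own illustration, not the procedure from the article), the snippet below estimates Cohen's d from hypothetical internal pilot data and recomputes the per-group sample size needed for 80% power, using the standard normal-approximation formula for a two-sample t-test. Real adaptive designs additionally apply corrections so that the overall Type 1 error rate stays at .05.

```python
# Sketch: sample size re-estimation after an internal pilot study.
# The pilot data below are simulated and purely hypothetical.
import numpy as np
from scipy.stats import norm

def required_n_per_group(d, alpha=.05, power=.80):
    """Normal-approximation n per group for a two-sided two-sample t-test:
    n = 2 * ((z_{1-alpha/2} + z_{power}) / d) ** 2."""
    z_a, z_b = norm.ppf(1 - alpha / 2), norm.ppf(power)
    return int(np.ceil(2 * ((z_a + z_b) / d) ** 2))

rng = np.random.default_rng(1)
pilot_a = rng.normal(0.4, 1, 20)            # hypothetical internal pilot data
pilot_b = rng.normal(0.0, 1, 20)

pooled_sd = np.sqrt((pilot_a.var(ddof=1) + pilot_b.var(ddof=1)) / 2)
d_hat = (pilot_a.mean() - pilot_b.mean()) / pooled_sd

print(f"observed d = {d_hat:.2f}; "
      f"continue until n = {required_n_per_group(d_hat)} per group")
```

The pilot participants are simply kept as the first part of the final sample, which is exactly what makes the pilot "internal".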
I believe these methods might be of interest to psychological researchers who face the practical challenge of designing and running studies as efficiently as possible, while controlling Type 1 errors and guaranteeing a desired level of statistical power. I wrote a short article, currently under review (you can find it here: http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2333729), explaining the basics, based on the literature from the medical sciences. In this article, I also briefly discuss the need for pre-registration and ways to prevent experimenter bias, and I compare sequential analyses within an NHST framework with Bayesian approaches. I hope this brief introduction will lead to a discussion about the potential of such procedures for psychological research.
If you have comments or suggestions, please contact me on Twitter (@Lakens) or send me an e-mail (D.Lakens@tue.nl).