Preface

Excerpt from Chapter 3. Mistake: Not recognizing publication bias

By the time you read a paper, a great deal of selection has occurred. When experiments are successful, scientists tend to continue the project, while less successful projects get abandoned. When the project is done, scientists are more likely to write up projects that lead to remarkable results, or to keep analyzing the data in various ways until they extract a statistically significant conclusion. Finally, journals are more likely to publish “positive” studies. If the null hypothesis were true, you would expect a statistically significant result in 5% of experiments.

But those 5% are more likely to get published than the other 95%. This is called publication bias. If many studies were performed, you would expect some studies to find larger effects, some to find smaller effects, and the average effect size to be close to the truth. However, studies with small effects tend not to get published. On average, therefore, the studies that do get published tend to report effect sizes that overestimate the true effect (Ioannidis, 2008).
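The overestimate described above can be illustrated with a small simulation. The sketch below rests on assumptions that are not from the text: a true effect of 0.2 standard deviations, 20 subjects per group, a known SD of 1 (so a simple z test applies), and a publication rule that prints only results with |z| > 1.96. The function name simulate_studies and all parameter values are hypothetical.

```python
import random
import statistics

def simulate_studies(true_effect=0.2, n_per_group=20, n_studies=5000, seed=1):
    """Simulate two-group studies; return (all_effects, published_effects)."""
    rng = random.Random(seed)
    # Standard error of a difference between two means when SD = 1 in both groups.
    se = (2 / n_per_group) ** 0.5
    all_effects, published = [], []
    for _ in range(n_studies):
        control = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        treated = [rng.gauss(true_effect, 1.0) for _ in range(n_per_group)]
        diff = statistics.fmean(treated) - statistics.fmean(control)
        all_effects.append(diff)
        # Assumed publication rule: only "significant" results reach print.
        if abs(diff / se) > 1.96:
            published.append(diff)
    return all_effects, published

all_effects, published = simulate_studies()
mean_all = statistics.fmean(all_effects)
mean_published = statistics.fmean(published)
print(f"mean effect, all studies:       {mean_all:.2f}")
print(f"mean effect, published studies: {mean_published:.2f}")
print(f"fraction published:             {len(published) / len(all_effects):.2f}")
```

Under these assumptions, the average across all simulated studies lands near the true effect, while the average across the "published" subset is substantially larger, because only the studies that happened to observe a large difference pass the filter.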

Excerpt from Chapter 20. Normality tests ask the wrong question

If almost no variables you measure follow an ideal Gaussian distribution, why use tests that rely on the Gaussian assumption? Plenty of studies with simulated data have shown that the statistical tests based on the Gaussian distribution remain useful when data are sampled from a population with a distribution that only approximates a Gaussian distribution. These tests are fairly robust to violations of the Gaussian assumption, especially if the sample sizes are large and equal.

When analyzing data, the question that matters is not whether the data were sampled from an ideal Gaussian population but whether the distribution from which they were sampled is close enough to the Gaussian ideal that the results of the statistical tests are still useful. Normality tests do not answer this question.
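A simulation of the kind mentioned above can be sketched in a few lines. This is a minimal illustration, not a general answer to the "close enough" question: it assumes two groups of 30 values each, a null hypothesis that is true, an unpaired t test with pooled variance, and one skewed (exponential) population compared with a Gaussian one. The function name false_positive_rate and the choice of populations are hypothetical. The critical value 2.0017 is the two-sided 5% cutoff of the t distribution with 58 degrees of freedom.

```python
import math
import random
import statistics

T_CRIT_DF58 = 2.0017  # two-sided 5% critical value of t for 58 degrees of freedom

def false_positive_rate(draw, n=30, n_sims=4000, seed=3):
    """Fraction of null experiments in which an unpaired t test gives P < 0.05."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        # Both groups come from the same population, so the null hypothesis is true.
        a = [draw(rng) for _ in range(n)]
        b = [draw(rng) for _ in range(n)]
        # With equal n, the pooled variance is the average of the two sample variances.
        pooled_var = (statistics.variance(a) + statistics.variance(b)) / 2
        t = (statistics.fmean(a) - statistics.fmean(b)) / math.sqrt(pooled_var * 2 / n)
        if abs(t) > T_CRIT_DF58:
            hits += 1
    return hits / n_sims

gaussian_rate = false_positive_rate(lambda rng: rng.gauss(0.0, 1.0))
skewed_rate = false_positive_rate(lambda rng: rng.expovariate(1.0))
print(f"false positive rate, Gaussian population:    {gaussian_rate:.3f}")
print(f"false positive rate, exponential population: {skewed_rate:.3f}")
```

If the test is robust in this scenario, both rates should stay close to the nominal 5%, even though the exponential population is far from Gaussian. Smaller or unequal samples, or more extreme distributions, may behave differently.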

Excerpt from Chapter 26 (Review)

Statistical inference helps you make general conclusions from limited data, so conclusions are always presented in terms of probability

Be wary if you ever encounter statistical conclusions that seem 100% definitive.

All statistical tests are based on assumptions

Review the list of assumptions before interpreting any statistical results.

Otherwise the investigators may be P-hacking.

Statistics is only part of interpreting data

Also think about study design and experimental methods.

Many statistical terms are also ordinary words

Don’t mistakenly give a statistical term an ordinary meaning.

The standard error of the mean does not quantify variability

The standard deviation and standard error of the mean are often confused.

Confidence intervals quantify precision

All values (means, differences, ratios, etc.) computed from data should be reported with a confidence interval.
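The two points above (the standard deviation describes scatter, while the standard error of the mean and the confidence interval describe precision) can be seen in a short sketch. It assumes a Gaussian population with mean 100 and SD 15, values chosen only for illustration, and uses tabulated two-sided 95% critical values of the t distribution for the three sample sizes shown. The function name summarize is hypothetical.

```python
import math
import random
import statistics

# Two-sided 95% critical values of the t distribution for df = n - 1.
T_CRIT = {10: 2.262, 100: 1.984, 1000: 1.962}

def summarize(n, rng):
    """Draw n values from an assumed Gaussian population (mean 100, SD 15)."""
    sample = [rng.gauss(100.0, 15.0) for _ in range(n)]
    mean = statistics.fmean(sample)
    sd = statistics.stdev(sample)   # scatter among the individual values
    sem = sd / math.sqrt(n)         # precision of the sample mean
    margin = T_CRIT[n] * sem        # half-width of the 95% confidence interval
    return mean, sd, sem, (mean - margin, mean + margin)

rng = random.Random(7)
results = {n: summarize(n, rng) for n in T_CRIT}
for n, (mean, sd, sem, ci) in results.items():
    print(f"n={n:4d}  SD={sd:5.1f}  SEM={sem:5.2f}  95% CI=({ci[0]:.1f}, {ci[1]:.1f})")
```

As the sample grows, the SD keeps estimating the same population scatter, while the SEM shrinks and the confidence interval narrows. That is why the SEM says how precisely the mean is known, not how variable the data are.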

Every P value tests a null hypothesis

You cannot understand a P value until you can precisely state the corresponding null hypothesis.

The concept of statistical significance is designed to help you make a decision based on one result

If you don’t plan to use this one result to make a crisp decision, the concept of statistical significance is not necessary.

“Statistically significant” does not mean the effect is large or scientifically important

It only means that a difference (or association, or correlation) this large or larger would happen less than 5% of the time (or some other stated value) by chance alone.

“Not significantly different” does not mean the effect is absent, small, or scientifically irrelevant

All you can conclude is that the observed results are not inconsistent with the null hypothesis.

The term significant has two meanings, so it is often misunderstood

Avoid the term when possible.

Multiple comparisons make it hard to interpret statistical results

To correctly interpret statistical analyses, all analyses must be planned before collecting data, and all planned analyses must be conducted and reported.
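One way to see why multiple comparisons complicate interpretation is to compute the chance of at least one "significant" result when every null hypothesis is true. The sketch below assumes the comparisons are independent and each uses a 5% threshold. Neither assumption comes from the text, and real analyses are often correlated. The function name familywise_error_rate is hypothetical.

```python
def familywise_error_rate(n_comparisons, alpha=0.05):
    """Chance of one or more false positives among independent comparisons."""
    # Each comparison avoids a false positive with probability (1 - alpha).
    return 1.0 - (1.0 - alpha) ** n_comparisons

for k in (1, 3, 10, 20):
    print(f"{k:2d} comparisons: {familywise_error_rate(k):.1%} chance of a false positive")
```

Under these assumptions, ten comparisons already give roughly a 40% chance of at least one statistically significant result by chance alone. A reader cannot judge a reported P value without knowing how many comparisons were made.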