Preface

1. Statistics and Probability Are Not Intuitive

We Tend to Jump to Conclusions

We Tend to be Overconfident

We See Patterns in Random Data

We Don’t Expect Variability to Depend on Sample Size

We are Fooled by Multiple Comparisons

We Tend to Ignore Alternative Explanations

We Crave Crisp Conclusions, But Statistics Offers Probabilities 

Chapter Summary

 

2. The Complexities of Probability

Basics of Probability

Probability as Prediction of Long-Term Frequency

Probability as Strength of Belief (Bayes)

The Distinction Between Probability and Statistics

Lingo

Common Mistakes 

Chapter Summary

3. From Sample to Population

Sampling from a Population 

How Far to Generalize?

Lingo

Common Mistakes 

Chapter Summary

4. Confidence Intervals

Example: Survival of Premature Infants 

Example: Polling Voters

Assumptions: Confidence Interval of a Proportion 

What Does 95% Confidence Really Mean?

Are You Quantifying the Event You Really Care About?

Interpreting Confidence Intervals in Context 

Confidence Intervals for Other Kinds of Data

Lingo

Common Mistakes

Q&A

Chapter Summary

5. Types of Variables

Continuous Variables

Ordinal Variables 

Nominal Variables

Q&A

Chapter Summary

6. Graphing Variability

Graphing Data to Show Scatter or Distribution 

Watch Out for Preprocessed Data

Lingo

Common Mistakes

Q&A

Chapter Summary

7. Quantifying Variation

Range

Percentiles 

Interquartile Range 

Five-Number Summary 

Standard Deviation 

Coefficient of Variation 

Lingo

Common Mistakes 

Q&A

Chapter Summary

8. The Gaussian Distribution

How the Gaussian Distribution Arises

The Meaning of Standard Deviation in a Gaussian Distribution

What a Sample Drawn from a Gaussian Distribution Really Looks Like

Why the Gaussian Distribution is so Central to Statistical Theory

Lingo

Common Mistakes 

Q&A

Chapter Summary

9. The Lognormal Distribution and Geometric Mean

Overview

Example: Relaxing Bladders

A Review of Logarithms

The Origin of a Lognormal Distribution 

How to Analyze Lognormal Data

Geometric Mean

Lingo

Common Mistakes

Q&A

Chapter Summary

10. Confidence Interval for a Mean

Interpreting a Confidence Interval for a Mean 

What Values Determine the Confidence Interval for a Mean?

The Standard Error of the Mean 

Assumptions: Confidence Interval for a Mean 

Lingo

Common Mistakes

Q&A

Chapter Summary

11. Error Bars

The Appearance of Error Bars

How to Interpret Error Bars

Which Kind of Error Bar Should You Plot? 

How are Standard Deviation and Standard Error of the Mean Related to Sample Size?

Lingo

Common Mistakes 

Q&A

Chapter Summary

12. Comparing Groups with Confidence Intervals

Using Confidence Intervals to Compare Groups

Examples of Confidence Intervals Used to Compare Groups

Assumptions of Confidence Intervals

Common Mistakes

Q&A

Chapter Summary

13. Comparing Groups with P Values

Introducing P Values via Coin Flipping

A Rule That Links Confidence Intervals and P Values

Revisiting the Examples from Chapter 12

Four Things You Need to Know about P Values

Lingo

Common Mistakes

Q&A

Chapter Summary


14. Statistical Significance and Hypothesis Testing

Statistical Hypothesis Testing

Revisiting the Examples from Chapters 12 and 13 

Analogy: Innocent Until Proven Guilty

Extremely Significant? Borderline Significant?

Lingo

Choosing a Significance Level

Common Mistakes

Q&A

Chapter Summary

Interpreting Results That are “Statistically Significant” 

Interpreting Results That are “Not Statistically Significant”

Five Explanations for “Not Statistically Significant” Results

Lingo

Common Mistakes 

Q&A

Chapter Summary

16. How Common Are Type I Errors?

What Is a Type I Error?

How Frequently Do Type I Errors Occur?

The Prior Probability Influences the False Discovery Rate (A Bit of Bayes)

Analogy to Clinical Testing

Lingo

Common Mistakes 

Q&A

Chapter Summary

Why Multiple Comparisons are a Problem 

A Dramatic Demonstration of the Problem with Multiple Comparisons

Multiple Comparisons in Many Contexts 

How to Correct for Multiple Comparisons 

Lingo

Common Mistakes

Q&A

Chapter Summary

18. Statistical Power and Sample Size

Ad Hoc Sequential Sample Size Determination Leads to Misleading Results

The Four Questions

Interpreting a Sample Size Statement

A Calculation or a Negotiation?

An Analogy to Understand Statistical Power 

Sample Size and the Margin of Error of the Confidence Interval 

Lingo

Common Mistakes 

Q&A

Chapter Summary

19. Commonly Used Statistical Tests

Assumptions Shared by All Standard Statistical Tests 

Comparing a Continuous Variable Measured in Two Groups

Comparing a Continuous Variable Measured in Three or More Groups

Comparing a Binary Variable Assessed in Two Groups 

Comparing Survival Curves

Correlation and Regression

Lingo

Chapter Summary

20. Normality Tests

Testing for Normality

The Problems with Normality Tests

Alternatives to Assuming a Gaussian Distribution

Lingo

Common Mistakes 

Q&A

Chapter Summary

21. Outliers

How Do Outliers Arise?

The Need for Outlier Tests

Five Questions to Ask Before Testing for Outliers 

The Question That an Outlier Test Answers

Is It Legitimate to Remove Outliers?

Lingo

Common Mistakes

Q&A

Chapter Summary

22. Correlation

Introducing the Correlation Coefficient 

Assumptions: Correlation

Lingo

Common Mistakes

Q&A

Chapter Summary

23. Simple Linear Regression

The Goals of Linear Regression

Linear Regression Results

Assumptions: Linear Regression

Comparison of Linear Regression and Correlation 

Lingo

Common Mistakes 

Q&A

Chapter Summary

24. Nonlinear, Multiple, and Logistic Regression

Nonlinear Regression

Multiple and Logistic Regression 

Lingo

Common Mistakes

Q&A

Chapter Summary

25. Common Mistakes to Avoid When Interpreting Published Statistics

Mistake: Not Recognizing Publication Bias 

Mistake: Testing Hypotheses Suggested by the Data 

Mistake: Making a Conclusion about Causation When the Data Only Show Correlation

Mistake: Overinterpreting Studies That Measure a Proxy or Surrogate Outcome

Mistake: Overinterpreting Data from an Observational Study

Mistake: Being Fooled by Regression to the Mean

26. Review

The Fundamental Ideas of Statistics 

Statistical Vocabulary by Chapter