
Statistical Rules of Thumb (van Belle)

 Author(s)  Gerald van Belle
 Title  Statistical Rules of Thumb
 Year  2002
 Publisher  John Wiley & Sons, Inc.
 ISBN  0-471-40227-3

Table of Contents



1. The Basics.

1.1 Four Basic Questions.

1.2 Observation is Selection.

1.3 Replicate to Characterize Variability.

1.4 Variability Occurs at Multiple Levels.

1.5 Invalid Selection is the Primary Threat to Valid Inference.

1.6 There is Variation in Strength of Inference.

1.7 Distinguish Randomized and Observational Studies.

1.8 Beware of Linear Models.

1.9 Keep Models As Simple As Possible, But Not More Simple.

1.10 Understand Omnibus Quantities.

1.11 Do Not Multiply Probabilities More Than Necessary.

1.12 Use Two-sided p-Values.

1.13 p-Values for Sample Size, Confidence Intervals for Results.

1.14 At Least Twelve Observations for a Confidence Interval.

1.15 Estimate ± Two Standard Errors is Remarkably Robust.

1.16 Know the Unit of the Variable.

1.17 Be Flexible About Scale of Measurement Determining Analysis.

1.18 Be Eclectic and Ecumenical in Inference.

2. Sample Size.

2.1 Begin with a Basic Formula for Sample Size: Lehr’s Equation.

2.2 Calculating Sample Size Using the Coefficient of Variation.

2.3 No Finite Population Correction for Survey Sample Size.

2.4 Standard Deviation and Sample Range.

2.5 Do Not Formulate a Study Solely in Terms of Effect Size.

2.6 Overlapping Confidence Intervals Do Not Imply Nonsignificance.

2.7 Sample Size Calculation for the Poisson Distribution.

2.8 Sample Size for Poisson with Background Rate.

2.9 Sample Size Calculation for the Binomial Distribution.

2.10 When Unequal Sample Sizes Matter; When They Don’t.

2.11 Sample Size With Different Costs for the Two Samples.

2.12 The Rule of Threes for 95% Upper Bounds When There Are No Events.

2.13 Sample Size Calculations Are Determined by the Analysis.

3. Observational Studies.

3.1 The Model for an Observational Study is the Sample Survey.

3.2 Large Sample Size Does Not Guarantee Validity.

3.3 Good Observational Studies Are Designed.

3.4 To Establish Cause and Effect Requires Longitudinal Data.

3.5 Make Theories Elaborate to Establish Cause and Effect.

3.6 The Hill Guidelines Are a Useful Guide to Show Cause and Effect.

3.7 Sensitivity Analyses Assess Model Uncertainty and Missing Data.

4. Covariation.

4.1 Assessing and Describing Covariation.

4.2 Don’t Summarize Regression Sampling Schemes.

4.3 Do Not Correlate Rates or Ratios Indiscriminately.

4.4 Determining Sample Size to Estimate a Correlation.

4.5 Pairing Data is not Always Good.

4.6 Go Beyond Correlation in Drawing Conclusions.

4.7 Agreement As Accuracy, Scale Differential, and Precision.

4.8 Assess Test Reliability by Means of Agreement.

4.9 Range of the Predictor Variable and Regression.

4.10 Measuring Change: Width More Important than Numbers.

5. Environmental Studies.

5.1 Begin with the Lognormal Distribution in Environmental Studies.

5.2 Differences Are More Symmetrical.

5.3 Know the Sample Space for Statements of Risk.

5.4 Beware of Pseudoreplication.

5.5 Think Beyond Simple Random Sampling.

5.6 The Size of the Population and Small Effects.

5.7 Models of Small Effects Are Sensitive to Assumptions.

5.8 Distinguish Between Variability and Uncertainty.

5.9 Description of the Database is As Important as Its Data.

5.10 Always Assess the Statistical Basis for an Environmental Standard.

5.11 Measurement of a Standard and Policy.

5.12 Parametric Analyses Make Maximum Use of the Data.

5.13 Confidence, Prediction, and Tolerance Intervals.

5.14 Statistics and Risk Assessment.

5.15 Exposure Assessment is the Weak Link in Assessing Health Effects of Pollutants.

5.16 Assess the Errors in Calibration Due to Inverse Regression.

6. Epidemiology.

6.1 Start with the Poisson to Model Incidence or Prevalence.

6.2 The Odds Ratio Approximates the Relative Risk Assuming the Disease is Rare.

6.3 The Number of Events is Crucial in Estimating Sample Size.

6.4 Use a Logarithmic Formulation to Calculate Sample Size.

6.5 Take No More than Four or Five Controls per Case.

6.6 Obtain at Least Ten Subjects for Every Variable Investigated.

6.7 Begin with the Exponential Distribution to Model Time to Event.

6.8 Begin with Two Exponentials for Comparing Survival Times.

6.9 Be Wary of Surrogates.

6.10 Prevalence Dominates in Screening Rare Diseases.

6.11 Do Not Dichotomize Unless Absolutely Necessary.

6.12 Additive and Multiplicative Models.

7. Evidence-Based Medicine.

7.1 Strength of Evidence.

7.2 Relevance of Information: POEM vs. DOE.

7.3 Begin with Absolute Risk Reduction, then follow with Relative Risk.

7.4 The Number Needed to Treat (NNT) is Clinically Useful.

7.5 Variability in Response to Treatment Needs to be Considered.

7.6 Safety is the Weak Component of EBM.

7.7 Intent to Treat is the Default Analysis.

7.8 Use Prior Information but not Priors.

7.9 The Four Key Questions for Meta-analysis.

8. Design, Conduct, and Analysis.

8.1 Randomization Puts Systematic Effects into the Error Term.

8.2 Blocking is the Key to Reducing Variability.

8.3 Factorial Designs and Joint Effects.

8.4 High-Order Interactions Occur Rarely.

8.5 Balanced Designs Allow Easy Assessment of Joint Effects.

8.6 Analysis Follows Design.

8.7 Independence, Equal Variance, and Normality.

8.8 Plan to Graph the Results of an Analysis.

8.9 Distinguish Between Design Structure and Treatment Structure.

8.10 Make Hierarchical Analyses the Default Analysis.

8.11 Distinguish Between Nested and Crossed Designs: Not Always Easy.

8.12 Plan for Missing Data.

8.13 Address Multiple Comparisons Before Starting the Study.

8.14 Know Properties Preserved When Transforming Units.

8.15 Consider Bootstrapping for Complex Relationships.

9. Words, Tables, and Graphs.

9.1 Use Text for a Few Numbers, Tables for Many Numbers, Graphs for Complex Relationships.

9.2 Arrange Information in a Table to Drive Home the Message.

9.3 Always Graph the Data.

9.4 Always Graph Results of an Analysis of Variance.

9.5 Never Use a Pie Chart.

9.6 Bar Graphs Waste Ink; They Don’t Illuminate Complex Relationships.

9.7 Stacked Bar Graphs Are Worse Than Bar Graphs.

9.8 Three-Dimensional Bar Graphs Constitute Misdirected Artistry.

9.9 Identify Cross-sectional and Longitudinal Patterns in Longitudinal Data.

9.10 Use Rendering, Manipulation, and Linking in High-Dimensional Data.

10. Consulting.

10.1 Session Has Beginning, Middle, and End.

10.2 Ask Questions.

10.3 Make Distinctions.

10.4 Know Yourself, Know the Investigator.

10.5 Tailor Advice to the Level of the Investigator.

10.6 Use Units the Investigator is Comfortable With.

10.7 Agree on Assignment of Responsibilities.

10.8 Any Basic Statistical Computing Package Will Do.

10.9 Ethics Precedes, Guides, and Follows Consultation.

10.10 Be Proactive in Statistical Consulting.

10.11 Use the Web for Reference, Resource, and Education.

10.12 Listen to, and Heed the Advice of Experts in the Field.



Author Index.

Topic Index.
