10. Statistics

Course Description:

This course introduces students to the collection and analysis of data according to rational principles. We will discuss how to recognize sources of bias in data presented in the real world, and how to collect unbiased data ourselves. We will study how best to present and summarize data, using Microsoft Excel as our basic tool. We will study basic probability in depth, as it is the foundation of statistical reasoning. Finally, most of the second semester will be spent studying the bread and butter of statistics: confidence intervals, margins of error, hypothesis testing, and regression analysis in a wide variety of contexts.

The prerequisite for this class is Algebra 2, although it is generally recommended that students take Precalculus before taking Statistics.


The textbook is:

Barr, Diez, Dorazio, Cetinkaya-Rundel: Advanced High School Statistics, 1st Edition


The teacher is:

Peter Mannisto


Detailed Course Topic List:

    1. Data collection

      • Population vs. sample, parameter vs. statistic

      • Observational study vs. experiment

      • Sources of bias when collecting data

      • Correlation vs. causation and lurking variables

      • Control and randomization in experiments

      • Placebo effect

    2. Summarizing data

      • Intro to Microsoft Excel

      • Visual displays of data: histograms, scatterplots, dot plots, etc.

      • Shape of data: skew, modality, outliers

      • Measures of center: mean, median, and the relationship between them

      • Measures of spread: standard deviation, range, and IQR

      • Mapping data

      • Summarizing categorical data: two-way tables, bar charts

    3. Probability

      • Definition of probability

      • Law of Large Numbers

      • Addition and Multiplication rules for probability

      • Conditional probability and relation to independence

      • Visualizing probability with tree diagrams

      • Bayes' Theorem

      • Binomial Formula

      • Simulations

    4. Probability distributions

      • Definition of a random variable

      • Probability distribution of a random variable

      • Expected value, variance, standard deviation

      • Discrete vs. continuous distributions

      • Binomial distribution

      • Normal distribution

      • Central Limit Theorem

      • Geometric distribution (if time)

    5. Foundations of Statistical Inference

      • Point estimate of a parameter

      • Definition of a confidence interval and margin of error

      • Choosing a confidence level

      • Basic terminology of hypothesis testing

      • Type I and Type II errors

    6. Inference for Categorical Data

      • Conditions for applying a normal approximation to categorical data

      • Inference for a single proportion and difference of two proportions

      • Testing goodness of fit and independence with the chi-square distribution

    7. Inference for Numerical Data

      • The t-distribution: definition and use in statistics

      • Calculating confidence intervals and margins of error using the t-distribution

      • Hypothesis testing for a single mean, paired data, and difference of two means

      • Comparing many means with ANOVA (if time)

    8. Introduction to Regression analysis

      • Definition of the least-squares fit line

      • Correlation coefficient and strength of fit

      • Using the regression line for interpolation

      • Dangers of extrapolation

      • Effect of outliers on linear regression

      • Conditions for inference in linear regression

      • Interpretation of slope and intercept in context

      • Transforming non-linear data to apply linear regression