Review

Sampling distributions:
  • Know how to get a real sampling distribution -- take zillions of random samples, each of the SAME size, from the same population and graph the means (or proportions for categorical data).
  • Know why it is silly to generate a real sampling distribution in actual data collection.
  • Know how StatKey goes about generating an approximate sampling distribution and why this is a decent approximation.
  • Understand the two core powers of the Central Limit Theorem.
  • Describe what the sampling distribution of means approximately looks like at different sample sizes (state the mean, standard deviation, and shape).
    • If the starting distribution has a mean of 4.2, standard deviation of 3.2, and is very bimodal, what will the sampling distribution look like taking samples of size 14?
    • If the starting distribution is very skewed, has a mean of 120.2, and has a standard deviation of 14.4, what will the sampling distribution look like taking samples of size 75?
  • Know why we check the 3 assumptions when using the normal curve to approximate the sampling distribution of means.
  • Know why we have a slightly different assumption to check before generating a sampling distribution for proportions.
  • Understand that the standard error is just the standard deviation of the sampling distribution.
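
As a rough illustration of the ideas above, the sketch below builds an approximate sampling distribution by brute force and compares its spread to the standard error the Central Limit Theorem predicts (population standard deviation divided by the square root of the sample size). The bimodal population is invented for the demonstration, and all variable names are hypothetical. Samples are drawn with replacement.

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical bimodal population: one cluster near 1 and one near 7.
population = [random.gauss(1, 1) for _ in range(5000)] + \
             [random.gauss(7, 1) for _ in range(5000)]
pop_mean = statistics.fmean(population)
pop_sd = statistics.pstdev(population)

n = 14              # every sample has the SAME size
num_samples = 5000  # stand-in for "zillions"

# Draw many random samples (with replacement) and keep the mean of each one.
sample_means = [
    statistics.fmean(random.choices(population, k=n))
    for _ in range(num_samples)
]

simulated_center = statistics.fmean(sample_means)
simulated_se = statistics.pstdev(sample_means)  # SD of the sampling distribution
predicted_se = pop_sd / math.sqrt(n)            # CLT prediction

print(f"population mean {pop_mean:.2f}, mean of sample means {simulated_center:.2f}")
print(f"predicted SE {predicted_se:.3f}, simulated SE {simulated_se:.3f}")
```

The two standard errors should land close to each other, and the mean of the sample means should sit near the population mean. Plotting a histogram of sample_means would show the shape.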


Confidence intervals:
  • Understand that a confidence interval is a sampling distribution with the extremes chopped off (keeping the middle 95%, 99%, etc.).
  • Understand the basic formula for all confidence intervals.
    • What is the role of the sample mean or proportion in a confidence interval?
    • The margin of error includes z* and the standard error.  Why is each part necessary?
  • Understand that the minimum sample size formulas come from rearranging the margin of error part of the confidence interval formula.
  • Correctly use the minimum sample size formula for means.
    • A pilot study found a mean of 192 cm and a standard deviation of about 31.4 cm.  If you want to collect data from this population to estimate the mean within a margin of error of 3 cm, what is the minimum sample size needed?
  • Correctly use the minimum sample size formula for proportions.  Know to use p*=0.5 when there is no pilot study.
    • If you want a margin of error of only 4% at 99% confidence, at least how many people do you need in your sample?
  • Calculate a confidence interval for a mean.
    • From an SRS of 38 football players, the average weight was 214 lbs with a standard deviation of 43 lbs.  Calculate a 95% confidence interval.
  • Calculate a confidence interval for a proportion.
    • From a stratified random sample of 128 golfers, 70 said they were having fun.  Calculate a 90% confidence interval for the proportion having fun.
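
One possible way to check the two interval calculations above is sketched below. It follows the basic formula (sample statistic plus or minus z* times the standard error) and uses z* as the review describes. The function names are hypothetical, and a course that uses t* for means would get a slightly wider interval.

```python
import math
from statistics import NormalDist

def z_star(confidence):
    """Critical value that keeps the middle `confidence` share of the normal curve."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)

def ci_mean(sample_mean, sample_sd, n, confidence):
    """Confidence interval for a mean: sample mean +/- z* * (sd / sqrt(n))."""
    standard_error = sample_sd / math.sqrt(n)
    margin = z_star(confidence) * standard_error
    return sample_mean - margin, sample_mean + margin

def ci_proportion(successes, n, confidence):
    """Confidence interval for a proportion: p-hat +/- z* * sqrt(p-hat(1 - p-hat) / n)."""
    p_hat = successes / n
    standard_error = math.sqrt(p_hat * (1 - p_hat) / n)
    margin = z_star(confidence) * standard_error
    return p_hat - margin, p_hat + margin

# Football players: n = 38, mean 214 lbs, sd 43 lbs, 95% confidence.
football = ci_mean(214, 43, 38, 0.95)
print(f"football: ({football[0]:.2f}, {football[1]:.2f}) lbs")

# Golfers: 70 of 128 having fun, 90% confidence.
golf = ci_proportion(70, 128, 0.90)
print(f"golf: ({golf[0]:.4f}, {golf[1]:.4f})")
```

The center of each interval is the sample statistic, and the half-width is the margin of error.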


Hypothesis tests:
  • Explain the structure and goal of a hypothesis test (take someone's claim, assume it true, find the likelihood of finding your data, use that value to decide if the claim must be untrue and reject it).
  • Given the possibility of Type I and II errors when making a decision, think through which error might be worse.
    • When testing for a disease, if the null hypothesis is that you have the disease, would a Type I or II error be worse?
  • 0.05 is the typical alpha value.  Why?  What should you think when you get a p-value of 0.06?
  • Why is a two-tailed test generally a bad approach unless you truly have no reason to guess either direction for the alternative hypothesis?
  • A hypothesis test can show that a statistically significant difference exists between the null hypothesis and the sample data.  Sometimes it detects a difference that has no practical importance.  What do I mean when I say that?
  • Given a scenario, be able to work through this entire process:
    • What is the claim?
    • What is the researcher's thought about the claim?
    • If you were to go out and challenge the claim, what question would you ask the individuals / what would you observe in each individual?
    • What type of data (quantitative or categorical) will this question produce?  How is it summarized (mean or proportion)?
    • What is the null hypothesis / H0?  Use symbols and subscripts that describe the variable (such as μ_(at bats per game) or p_(like pizza)).
    • What is the alternative hypothesis / HA?  Again, use symbols and subscripts.
    • What is a reasonable cut-off p-value to reject the null (what is α, unless given to you)?
    • What test will you perform (mean vs. proportion)?  Left, right, or two-tailed test?
    • What assumptions do you need to make before performing the test?  Check them.
    • Find the p-value using StatKey.
    • Find the z-score for your sample mean/proportion.
    • Find the p-value using the z-score and the normal curve.
    • Describe the p-value in a sentence.
    • Decide whether or not to reject the null hypothesis.
    • Does this make your data statistically significant?
    • State your decision in context in a short sentence.
    • Imagine that you found out from a census of all of the data sometime later that you made the wrong decision.  What type of error would you have made (Type I or II)?
  • Example scenarios for the process above:
    • A factory manager claimed that production of Nerf darts resulted in darts that were an average of 2.30 inches long.  Before making a large order, you suspected that it might be wrong.  You randomly selected 40 darts off the end of the line during the next day to collect your own sample and check the manager.  You found a sample average of 2.24 inches with a standard deviation of 0.13 inches.  To be careful to not falsely accuse the manager of being untruthful or having a defective factory, you choose an alpha value of 0.01.
    • At Twitter Math Camp, a speaker said that 8% of U.S. math teachers had a Twitter account.  You think that estimate is too high and collect an SRS of 120 U.S. math teachers.  6 said they had an account.
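
The z-score and normal-curve p-value steps for the two scenarios above could be sketched as follows. The function names are hypothetical. The tail choices are one reading of the scenarios and are assumptions: two-tailed for the darts ("might be wrong" gives no direction) and left-tailed for Twitter ("too high").

```python
import math
from statistics import NormalDist

def p_value(z, tail):
    """Area under the standard normal curve at least as extreme as z."""
    area_below = NormalDist().cdf(z)
    if tail == "left":
        return area_below
    if tail == "right":
        return 1 - area_below
    if tail == "two":
        return 2 * min(area_below, 1 - area_below)
    raise ValueError("tail must be 'left', 'right', or 'two'")

def z_test_mean(sample_mean, sample_sd, n, null_mean, tail):
    """z-score and p-value for a mean, assuming the null hypothesis is true."""
    standard_error = sample_sd / math.sqrt(n)
    z = (sample_mean - null_mean) / standard_error
    return z, p_value(z, tail)

def z_test_proportion(successes, n, null_p, tail):
    """z-score and p-value for a proportion; the standard error uses the null value."""
    standard_error = math.sqrt(null_p * (1 - null_p) / n)
    z = (successes / n - null_p) / standard_error
    return z, p_value(z, tail)

# Nerf darts: H0 mean = 2.30, sample of 40 with mean 2.24 and sd 0.13 (assumed two-tailed).
z_darts, p_darts = z_test_mean(2.24, 0.13, 40, 2.30, "two")
print(f"darts: z = {z_darts:.3f}, p-value = {p_darts:.4f}")

# Twitter: H0 p = 0.08, 6 of 120 teachers had an account (assumed left-tailed).
z_twitter, p_twitter = z_test_proportion(6, 120, 0.08, "left")
print(f"twitter: z = {z_twitter:.3f}, p-value = {p_twitter:.4f}")
```

Compare each p-value to the chosen alpha (0.01 for the darts) to decide whether to reject the null hypothesis. The code does not check the assumptions, so that step still has to be done by hand.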


Practicing with wordy paragraphs:

A longtime sportscaster said that 20% of major league baseball players had used performance-enhancing drugs at some point in their careers.  Since you thought it was higher, you decided to investigate.  Major League Baseball has a total of 1200 players on the September rosters.  Of these, you sampled 60 players and asked them if they had ever used these drugs.  Of your sample, 8 admitted to using these drugs.

A group of avid players of the awesome video game Starcraft II wanted to know the average time until the first full attack in a game.  They had read that it was 6 minutes, but they thought it was longer and decided to collect their own data.  The SRS of 34 games resulted in an average of 7.1 minutes with a standard deviation of 8 minutes.

A pilot study of the weights of newborns found an average of 7.6 lbs with a standard deviation of 0.9 lbs.  If you wanted to get the margin of error of a 99% confidence interval under 0.1 lbs, at least how large of a sample would you need?
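
A minimal sketch of the minimum sample size formulas, obtained by solving the margin of error (z* times the standard error) for n and rounding up. The function names are hypothetical. The second call uses p* = 0.5 because that question has no pilot study.

```python
import math
from statistics import NormalDist

def z_star(confidence):
    """Critical value that keeps the middle `confidence` share of the normal curve."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)

def min_n_mean(pilot_sd, margin, confidence):
    """Smallest n with z* * sd / sqrt(n) <= margin, i.e. n >= (z* * sd / margin)^2."""
    return math.ceil((z_star(confidence) * pilot_sd / margin) ** 2)

def min_n_proportion(margin, confidence, p_star=0.5):
    """Smallest n with z* * sqrt(p*(1 - p*) / n) <= margin."""
    return math.ceil((z_star(confidence) / margin) ** 2 * p_star * (1 - p_star))

# Newborn weights: pilot sd 0.9 lbs, margin of error 0.1 lbs, 99% confidence.
print(min_n_mean(0.9, 0.1, 0.99))

# Proportion with a 4% margin of error at 99% confidence and no pilot study.
print(min_n_proportion(0.04, 0.99))
```

Always round up, since rounding down would leave the margin of error slightly larger than the target.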
