
2. Explaining and visualizing p-values

Task: Teacher claim video
      Continue work on your video by completing the following tasks.  Show all of your work neatly on paper so that it (at least the calculations) will be easy to record or insert as pictures in your video later without having to redo it.
      • Check your teacher claim data against the assumptions for a normal curve calculation.
      • Find the z-score for your sample data and use it to calculate the p-value.
      • Write an accurate sentence clearly describing your p-value.
      • Write a conclusion statement about your result.
      Mastery Quiz Prep

          SKIM for context: the normal curve and z-scores

          SKIM for context: area under regions of the normal curve

          Bridge from standard normal curve use to use for hypothesis testing

          Calculating a z-score and p-value with the normal curve


          Plug and chug calculation practice -- find the p-value using the normal curve.  Make sure you sketch a normal curve and fill in the appropriate areas.
          1. H0: p=0.66, HA: p<0.66, p̂=53/102
          2. H0: p=0.49, HA: p≠0.49, p̂=500/1027
          3. H0: p=72%, HA: p<72%, p̂=99/145

          Explaining P-values in a sentence


          Explain in non-technical language the meaning of a P-value:
          • If the proportion of _[your variable]_ were really _[your null]_, our sample proportion of _[your sample proportion]_ or _[less/more/more extreme]_ would occur _[100*p-value]_% of the time by chance.¹
          • Example: Imagine that a friend claims that half of your school eats cheese weekly.  You think she is wrong, so you run a test with the null hypothesis p=0.5 and the alternative p ≠ 0.5.  Your sample proportion of 0.61 resulted in a p-value of 0.13.
            • Interpretation: If the proportion of your school that eats cheese weekly were really 0.5, our sample proportion of 0.61 or more extreme would occur 13% of the time by chance.
          1: adapted from samples in the REA AP Statistics book.
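
The sentence template above can be sketched as a small Python function. This is only an illustration: the function name `describe_p_value` and the tail labels "left", "right", and "two" are made up for this sketch and are not part of the course materials.

```python
def describe_p_value(variable, null_value, sample_value, p_value, tail):
    """Fill in the p-value sentence template.

    tail is "left", "right", or "two" (hypothetical labels for this sketch).
    """
    # Pick the wording that matches the alternative hypothesis.
    direction = {"left": "less", "right": "more", "two": "more extreme"}[tail]
    # Convert the p-value to a percentage, rounded to one decimal place.
    percent = round(100 * p_value, 1)
    return (
        f"If the proportion of {variable} were really {null_value}, "
        f"our sample proportion of {sample_value} or {direction} would occur "
        f"{percent:g}% of the time by chance."
    )


# The cheese example: null p = 0.5, sample proportion 0.61, p-value 0.13.
print(describe_p_value("your school that eats cheese weekly", 0.5, 0.61, 0.13, "two"))
```

Filling in the cheese example reproduces the interpretation sentence given above.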

          P-value interpretation
          For 4-6, use problems 1-3 and the p-values you calculated to explain each of your p-values in sentences.

          Full-circle: revisiting old problems and the full cycle but now using the normal curve for p-value calculations.
          • a) What is the claim?
          • b) What is the researcher's thought about the claim?
          • c) If you were to go out and challenge the claim, what question would you ask the individuals / what would you observe in each individual?
          • d) What type of data (quantitative or categorical) will this question produce?  How is it summarized (mean or proportion)?
          • e) What is the null hypothesis / H0?  Use symbols and subscripts that describe the variable (such as μ_(at bats per game) or p_(like pizza)).
          • f) What is the alternative hypothesis / HA?  Again, use symbols and subscripts.
          • g) What is a reasonable cut-off p-value to reject the null (what is α)?
          • h) What test will you perform (mean vs. proportion)?  Left, right, or two-tailed test?
          • i) What assumptions do you need to make before performing the test?  Check them.
          • j) Find the p-value.
          • k) Decide whether or not to reject the null hypothesis.
          • l) Does this make your data statistically significant?
          • m) State your decision in context in a short sentence.
          • n) If you failed to reject, was your p-value strong enough to warrant redoing the study with a larger sample size?
          • o) If you rejected, did the difference you found between the sample mean/proportion and the claimed mean/proportion seem like a meaningful difference?
          • p) Imagine that you found out from a census of all of the data sometime later that you made the wrong decision.  What type of error would you have made (Type I or II)?
          7) A regional softball league claimed that half of its pitchers could pitch a ball over 65mph.  A competing league thought this value was too high, so they looked at a simple random sample of 28 pitchers and found only 10 who could throw this fast.

          8) Create a 90% confidence interval for the following sample data: p̂=37/87
          Now use this confidence interval to decide whether or not to reject the null hypothesis for the test H0: p=.35, HA: p>.35, when α = .05.  Explain your reasoning.

          Free Response Prep
              (A) Compare and contrast bootstrapping (via StatKey) and using a mathematical model (the normal curve).


              (B) Imagine that someone claimed that the average temperature in Byron, MN was 52 degrees.  If I challenged this by saying it was not 52 degrees, my sample average was 46 degrees, and my test obtained a p-value of 0.12, how would you describe my p-value in a clear sentence?  Do not be too general, as it must address this specific null hypothesis and this specific sample result.


              (C) Using the normal curve, explain how you could perform the equivalent of a two-sided hypothesis test (α = .05) using a confidence interval.

              Practice solutions
                  1. Test of the proportion, need a z-score for 53/102
                    z = ( (53/102) - 0.66 ) / √( (53/102) * (1-53/102) / 102 ) = -0.140 / 0.049 = -2.84
                    Want to know the area under the normal curve less than (see alternative hypothesis) z=-2.84
                    p-value = normalcdf(-99,-2.84) = 0.0022

                  2. Test of the proportion, need a z-score for 500/1027
                    z = ( (500/1027) - 0.49 ) / √( (500/1027) * (1-500/1027) / 1027 ) = -0.00315 / 0.0156 = -0.202
                    Want to know the area under the normal curve more extreme than (see alternative hypothesis) z=-0.202. One way to do this is to find the area less than this, because it is a negative z-score, and then double it.
                    p-value = 2 * normalcdf(-99,-0.202) = 2 * 0.420 = 0.840

                  3. Test of the proportion, need a z-score for 99/145
                    z = ( (99/145) - 0.72 ) / √( (99/145) * (1-99/145) / 145 ) = -0.0372 / 0.0386 = -0.964
                    Want to know the area under the normal curve less than (see alternative hypothesis) z=-0.964
                    p-value = normalcdf(-99,-0.964) = 0.168
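
The three calculations above can be checked with a short Python sketch. It is an illustration under stated assumptions: the helper names `normal_cdf` and `proportion_test` are made up here, the normal curve area comes from Python's `math.erf` rather than the calculator's normalcdf, and the standard error uses the sample proportion because that is what the worked solutions above do.

```python
from math import erf, sqrt


def normal_cdf(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1 + erf(z / sqrt(2)))


def proportion_test(successes, n, p_null, tail):
    """Return (z, p_value) for a one-proportion test.

    Assumption: the standard error is built from the sample proportion,
    matching the worked solutions. tail is "left", "right", or "two".
    """
    p_hat = successes / n
    se = sqrt(p_hat * (1 - p_hat) / n)
    z = (p_hat - p_null) / se
    if tail == "left":
        p_value = normal_cdf(z)               # area below z
    elif tail == "right":
        p_value = 1 - normal_cdf(z)           # area above z
    else:
        p_value = 2 * normal_cdf(-abs(z))     # both tails
    return z, p_value


problems = [
    ("1", 53, 102, 0.66, "left"),
    ("2", 500, 1027, 0.49, "two"),
    ("3", 99, 145, 0.72, "left"),
]
for label, successes, n, p_null, tail in problems:
    z, p_value = proportion_test(successes, n, p_null, tail)
    print(f"Problem {label}: z = {z:.3f}, p-value = {p_value:.4f}")
```

The printed values should agree with the hand calculations up to rounding.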

                  4. (#1) There is a 0.2% chance of finding a sample proportion of 53/102 or less when we assume p=0.66.

                  5. (#2) There is an 84% chance of finding a sample proportion at least as extreme as 500/1027 when we assume p=0.49.

                  6. (#3) There is a 16.8% chance of finding a sample proportion of 99/145 or less when we assume p=72%.
                  7. Softball pitchers
                    a) Half of the pitchers in the regional softball league throw over 65mph
                    b) Less than half of the pitchers throw this fast
                    c) Observe each pitcher with a radar gun to decide "do you throw over 65mph or not?"
                    d) Categorical (they do or they don't throw over 65mph), summarized by a proportion
                    e) H0: p_(throws over 65mph) = 0.5
                    f) HA: p_(throws over 65mph) < 0.5
                    g) 0.05 (the default unless there is reason for something else)
                    h) Test for Single Proportion (left tailed test)
                    i) You need to check that the data was gathered randomly, and the problem said you used an SRS.
                    j) Get your z-score first: z = -1.578
                    p-value = normalcdf(-99,-1.578) = 0.057
                    k) Fail to reject
                    l) Not statistically significant
                    m) We did not find enough evidence to prove that less than half of the softball pitchers throw over 65mph.
                    n) Yes -- 0.057 is getting very close to 0.05, and if you were really out to prove something, a larger sample size might help you lower your p-value under 0.05.
                    o) n/a
                    p) Type II error (because you failed to reject)
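
Steps j) and k) of the softball problem can be sketched in Python as a check on the hand calculation. This is a sketch, not the required method: it assumes the standard error is built from the sample proportion (which is what gives z = -1.578), and it uses `NormalDist` from Python's standard library in place of the calculator. Building the standard error from the null proportion instead would give a slightly different z-score.

```python
from math import sqrt
from statistics import NormalDist

successes, n = 10, 28   # pitchers over 65mph in the simple random sample
p_null = 0.5            # the league's claim
alpha = 0.05            # default cut-off

p_hat = successes / n
se = sqrt(p_hat * (1 - p_hat) / n)   # assumption: SE from the sample proportion
z = (p_hat - p_null) / se

# Left-tailed test: area under the normal curve below z.
p_value = NormalDist().cdf(z)

decision = "reject" if p_value < alpha else "fail to reject"
print(f"z = {z:.3f}, p-value = {p_value:.3f}, decision: {decision} the null")
```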

                    8. Confidence interval is .425 +/- .087, which as an interval is 0.338 to 0.512.  The null hypothesis, 0.35, falls INSIDE the interval.  This means that it is a plausible value for the proportion, so we do NOT reject it.  We chose a 90% confidence interval because that leaves 5% in each tail.  If the null proportion had fallen in the lower tail (below the interval), we would have said there is less than a 5% chance of getting a sample proportion this much larger than the null when we assume it is true, aka we would reject.  Ask lots of questions about this because it is complicated at first.  Also see videos for essay questions.
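
The interval in solution 8 can be reproduced with a short Python sketch. It assumes the usual normal-curve interval for a proportion (sample proportion plus or minus z* times the standard error) and gets z* from the standard library's `NormalDist` instead of a table; the variable names are chosen only for this illustration.

```python
from math import sqrt
from statistics import NormalDist

successes, n = 37, 87
confidence = 0.90
p_null = 0.35

p_hat = successes / n

# z* leaves (1 - confidence) / 2 in each tail; about 1.645 for 90%.
z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
lower, upper = p_hat - margin, p_hat + margin

# If the null value is inside the interval, it is plausible and we do not reject it.
null_is_plausible = lower <= p_null <= upper

print(f"{p_hat:.3f} +/- {margin:.3f} -> ({lower:.3f}, {upper:.3f})")
print("Null inside interval:", null_is_plausible)
```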
                  Practice quiz (we will go over answers in class as a group)
                        1. Calculate the p-value and explain it in a sentence: H0: p=72%, HA: p≠72%, p̂=99/145

                        2. An article about coaching said that 18% of teachers are involved in some form of coaching.  A group of teachers in the HVL thought that a higher percentage of teachers in their conference coached, so they ran a stratified random sample of teachers.  Of the 80 teachers they reached, 28 were coaching a sport or activity that year.  Go through the full process below.
                        • a) What is the claim?
                        • b) What is the researcher's thought about the claim?
                        • c) If you were to go out and challenge the claim, what question would you ask the individuals / what would you observe in each individual?
                        • d) What type of data (quantitative or categorical) will this question produce?  How is it summarized (mean or proportion)?
                        • e) What is the null hypothesis / H0?  Use symbols and subscripts that describe the variable (such as μ_(at bats per game) or p_(like pizza)).
                        • f) What is the alternative hypothesis / HA?  Again, use symbols and subscripts.
                        • g) What is a reasonable cut-off p-value to reject the null (what is α)?
                        • h) What test will you perform (mean vs. proportion)?  Left, right, or two-tailed test?
                        • i) What assumptions do you need to make before performing the test?  Check them.
                        • j) Find the p-value.
                        • k) Decide whether or not to reject the null hypothesis.
                        • l) Does this make your data statistically significant?
                        • m) State your decision in context in a short sentence.
                        • n) If you failed to reject, was your p-value strong enough to warrant redoing the study with a larger sample size?
                        • o) If you rejected, did the difference you found between the sample mean/proportion and the claimed mean/proportion seem like a meaningful difference?
                        • p) Imagine that you found out from a census of all of the data sometime later that you made the wrong decision.  What type of error would you have made (Type I or II)?

                        Vocabulary
                            p-value - the probability of obtaining a statistic (mean or proportion) at least as extreme as the one you did, assuming that the null hypothesis is true

                            Notes

                            New P-Value

