NHST using the normal approximation for the binomial

When the Binomial Distribution Approaches the Normal

As we learned earlier, the p-value for a hypothesis test on a discrete binomial distribution comes from a binomial table, where the probability of each relevant number of successes is computed (and usually summed). The same test can also be framed in terms of the sample proportion, whose expected value is the population proportion.

As the probability of success (p) approaches .5, the binomial distribution approaches a symmetric distribution, and if the sample size is large enough, this symmetric distribution approaches a normal distribution. Thus, the normal distribution is an appropriate model for this sampling distribution if the expected numbers of successes and failures are both at least 10. Using the symbols for the population proportion (p), the probability of failure (q = 1 − p), and the sample size (n), a normal curve is a reasonable model if the following conditions are met:

1) np ≥ 10 and

2) nq ≥ 10.
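The two conditions above can be checked with a few lines of code. This is a minimal sketch, not part of the original text: the function name `normal_approx_ok` and its `threshold` parameter are illustrative assumptions.

```python
def normal_approx_ok(n, p, threshold=10.0):
    """Return True when both expected counts, np and nq, are at least the threshold."""
    q = 1.0 - p  # probability of failure
    tolerance = 1e-9  # guards against floating-point rounding at the boundary
    return n * p >= threshold - tolerance and n * q >= threshold - tolerance


# n = 800, p = 0.80: expected counts are 640 successes and 160 failures.
print(normal_approx_ok(800, 0.80))

# Hypothetical small sample: n = 30, p = 0.90 gives only 3 expected failures.
print(normal_approx_ok(30, 0.90))
```

The first call satisfies both conditions; the second fails because the expected number of failures is below 10.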

Example 1: Health Insurance Coverage

According to the Government Accountability Office, 80% of all college students (ages 18 to 23) had health insurance in 2006. The Patient Protection and Affordable Care Act (ACA) of 2010 allowed young people under age 26 to stay on their parents’ health insurance policy. Has the proportion of college students (ages 18 to 23) who have health insurance increased since 2006? A survey of 800 randomly selected college students (ages 18 to 23) indicated that 83% of them had health insurance. Use a 0.05 level of significance.

Step 1: Determine the hypotheses.

H0: p ≤ 0.80

where p is the proportion of college students ages 18 to 23 who have health insurance now.

There was no effect of the ACA on college students having health insurance: The proportion of college students with health insurance has not increased.

Ha: p > 0.80

where p is the proportion of college students ages 18 to 23 who have health insurance now.

There was an effect of the ACA on college students having health insurance: The proportion of college students with health insurance has increased.

Step 2: State the Decision Criterion

We will use an alpha of .05.

This is a right-tailed test since the alternative hypothesis expects the proportion to increase (>).

Step 3: Collect the data

In this random sample of 800 college students, 83% have health insurance. This means that 800(0.83) = 664 college students out of the 800 had health insurance.

Step 4: Compute the test statistic

If 80% of all college students have health insurance, is this 3% difference statistically significant or due to chance? We need to find a p-value to answer this question. We must determine if we can use this data in a hypothesis test.

First note that the data are from a random sample. That is essential. Now we need to determine if a normal model is a good fit for the sampling distribution. Since we assume that the null hypothesis is true, we build the sampling distribution with the assumption that 0.80 is the population proportion. We check the following conditions, using 0.80 for p:

np = (800)(0.80) = 640 and nq = (800)(1−0.80) = 160

Because these are both more than 10, we can use the normal model to find the p-value.

Now that we know that the normal distribution is an appropriate model for the sampling distribution, our next goal is to determine the p-value. The first step is to determine the z-score for the observed sample proportion (the data).

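Since we are dealing with the sample proportion (0.83) rather than the number of successes, we can either convert the proportion to a number of successes and use the z-score formula for sample means:

$$z = \frac{M - \mu}{\sigma_M}$$

where we use the number of observed successes, X, in place of the sample mean M, and we know that the mean of the binomial distribution is np and the standard deviation of the binomial distribution is the square root of npq, giving us:

$$z = \frac{X - np}{\sqrt{npq}}$$

or we can use the z-score formula for proportions, where $\hat{p}$ is the sample proportion:

$$z = \frac{\hat{p} - p}{\sqrt{pq/n}}$$

For this example, we will compute and discuss in terms of both proportions and successes, though only one is needed, to show that the equations and the interpretations are equivalent. Thus, in our example we would have

$$z = \frac{0.83 - 0.80}{\sqrt{(0.80)(0.20)/800}} = \frac{0.03}{0.01414} \approx 2.12$$

in terms of proportions, or equivalently

$$z = \frac{664 - 640}{\sqrt{(800)(0.80)(0.20)}} = \frac{24}{11.31} \approx 2.12$$

in terms of successes.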

This z-score is called the test statistic. It tells us the sample proportion of 0.83 is about 2.12 standard errors above the population proportion given in the null hypothesis. Alternately, if you compute and focus on successes, it is telling us that the sample of 0.83*800 = 664 successes out of 800 people is about 2.12 standard errors above the population mean given in the null hypothesis.
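The equivalence of the two forms of the test statistic can be checked numerically. The sketch below is illustrative only; the function names `z_from_proportion` and `z_from_successes` are assumptions, not names from the original text.

```python
import math


def z_from_proportion(p_hat, p0, n):
    """z-score of a sample proportion under the null proportion p0."""
    standard_error = math.sqrt(p0 * (1 - p0) / n)
    return (p_hat - p0) / standard_error


def z_from_successes(x, p0, n):
    """z-score of a success count under the null proportion p0."""
    mean = n * p0                          # binomial mean, np
    sd = math.sqrt(n * p0 * (1 - p0))      # binomial standard deviation, sqrt(npq)
    return (x - mean) / sd


z_prop = z_from_proportion(0.83, 0.80, 800)
z_count = z_from_successes(664, 0.80, 800)
print(round(z_prop, 2), round(z_count, 2))  # both round to 2.12
```

Both forms divide the same deviation by the same spread, just on different scales (proportions versus counts), so they agree up to floating-point rounding.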

We then use this statistic to find the p-value. Recall that the p-value is a probability that describes the likelihood of the data if the null hypothesis is true. More specifically, the p-value is the probability that sample results are as extreme as or more extreme than the data if the null hypothesis is true. The phrase “as extreme as or more extreme than” means farther from the center of the sampling distribution in the direction of the alternative hypothesis.

Thus, in this example, the p-value is the probability that our observed proportion (or number of successes) or greater would occur if there was no effect of the ACA (i.e., if the null hypothesis is true). That means we want the area to the right of 0.83 because the alternative hypothesis is a “greater-than” statement. The p-value, in this case, is the probability of getting a sample proportion equal to or greater than 0.83. Since we are using the standard normal curve to find probabilities, the p-value is the area to the right of z = 2.12.

Looking at our unit normal table we find that the p-value is approximately 0.0170. Thus, the probability that a random sample proportion is at least as large as 0.83 is about 0.017 (if the population proportion is actually 0.80). If the null hypothesis is true, we observe sample proportions this high or higher only about 1.7% of the time. This indicates that this is a rare event.
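As a cross-check on the table lookup, the right-tail area can be computed directly. The sketch below uses Python's standard-library `statistics.NormalDist` for the normal model and, as an added comparison that is not part of the original procedure, sums the exact binomial tail with `math.comb`. The variable names are illustrative assumptions.

```python
import math
from statistics import NormalDist

n, p0, successes = 800, 0.80, 664

# Normal approximation: area to the right of the observed z-score.
z = (successes - n * p0) / math.sqrt(n * p0 * (1 - p0))
p_normal = 1.0 - NormalDist().cdf(z)

# Exact binomial tail, P(X >= 664) when p = 0.80, for comparison only.
p_exact = sum(
    math.comb(n, k) * p0**k * (1 - p0) ** (n - k)
    for k in range(successes, n + 1)
)

print(f"normal approximation: {p_normal:.4f}")
print(f"exact binomial tail:  {p_exact:.4f}")
```

Because the code uses the unrounded z-score, the normal-approximation value may differ in the fourth decimal place from the table value at z = 2.12. Both values fall well below 0.05 here.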

Step 5: State a conclusion.

To determine our conclusion, we compare the p-value to the level of significance, α = 0.05. If our data are predicted to occur by chance less than 5% of the time, we have reason to reject the null hypothesis and accept the alternative. Since our p-value of 0.017 is less than 0.05, we reject the null hypothesis. We state our conclusion in terms of the alternative hypothesis. We also state it in context.

Since our observed p-value of 0.0170 is less than our stated alpha of 0.05, we know that our statistic is in our critical region and we reject the null hypothesis, indicating that we find support for the alternative hypothesis. Thus, the data from this study provide evidence that the proportion of all college students who have health insurance is now significantly greater than 0.80. The 0.03 increase in the proportion who have health insurance since 2006 is statistically significant at the 0.05 level.

Alternatively, we can give the conclusion using the percentage rather than the decimal:

Thus, the data from this study provide evidence that the percentage of all college students who have health insurance is now significantly greater than 80%. The 3% increase in the percentage who have health insurance since 2006 is statistically significant at the 5% level.

NOTE. This procedure could also be done with the critical value method. In this case, with an alpha of .05 and a right-tailed test, we would have a critical z-score of 1.645. Then, in step 5, we would compare our observed z-score of 2.12 to the critical z-score of 1.645, find that we reject the null hypothesis, and arrive at the same conclusion as above.
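The critical value method can be sketched in code as follows. This is an illustration only; it uses the standard-library `NormalDist.inv_cdf` to recover the critical z-score instead of a table, and the variable names are assumptions.

```python
from statistics import NormalDist

alpha = 0.05
z_observed = 2.12  # test statistic from the health insurance example

# Right-tailed test: the critical value cuts off the top 5% of the standard normal curve.
z_critical = NormalDist().inv_cdf(1 - alpha)

reject_null = z_observed > z_critical
print(round(z_critical, 3), reject_null)  # 1.645 True
```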

Example 2: Internet Access

According to the Kaiser Family Foundation, 84% of U.S. children ages 8 to 18 had Internet access at home as of August 2009. Researchers wonder if this percentage has changed since then. They survey 500 randomly selected children (ages 8 to 18) and find that 430 of them have Internet access at home.

Step 1: Determine the hypotheses.

H0: p = 0.84

where p is the proportion of children ages 8 to 18 with Internet access at home now.

There was no change in the proportion of children aged 8 to 18 who had Internet access since 2009.

Ha: p ≠ 0.84

where p is the proportion of children ages 8 to 18 with Internet access at home now.

There was a change in the proportion of children aged 8 to 18 who had Internet access since 2009.

Step 2: State the Decision Criterion

We will use an alpha of .05.

This is a two-tailed test since the alternative hypothesis expects the proportion to change but does not specify an expectation of how the proportion will change (≠).

Step 3: Collect the data.

In this random sample of 500 children aged 8 to 18, 430 had Internet access at home, meaning that we found a proportion of 430/500 = 0.86.

Step 4: Compute the test statistic.

If 84% of all children aged 8 to 18 have Internet access at home, is this 2% difference statistically significant or due to chance? We need to find a p-value to answer this question. We must determine if we can use this data in a hypothesis test.

First note that the data are from a random sample. That is essential. Now we need to determine if a normal model is a good fit for the sampling distribution. Since we assume that the null hypothesis is true, we build the sampling distribution with the assumption that 0.84 is the population proportion. We check the following conditions, using 0.84 for p:

np = (500)(0.84) = 420 and nq = (500)(1−0.84) = 80

Because these are both more than 10, we can use the normal model to find the p-value.

Now that we know that the normal distribution is an appropriate model for the sampling distribution, our next goal is to determine the p-value. The first step is to determine the z-score for the observed sample proportion (the data).

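Using the z-score formula for proportions:

$$z = \frac{0.86 - 0.84}{\sqrt{(0.84)(0.16)/500}} = \frac{0.02}{0.0164} \approx 1.22$$

The sample proportion of 0.86 is about 1.22 standard errors above the population proportion given in the null hypothesis. Now we calculate the p-value. This is where the two-tailed nature of the test is important.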

The p-value is the probability of seeing a sample proportion at least as extreme as the one observed from the data if the null hypothesis is true. In the previous example, only sample proportions higher than the null proportion were evidence in favor of the alternative hypothesis. In this example, any sample proportion that differs from 0.84 is evidence in favor of the alternative. Statistically significant differences are at least as extreme as the difference we see in the data. We want to determine the probability that the difference in either direction (above or below 0.84) is at least as large as the difference seen in the data, so we include sample proportions at or above 0.86 and sample proportions at or below 0.82. For this reason, we look at the area in both tails. Because the normal distribution is symmetric, we can find the area in one tail and double it.

Looking in our unit normal table, we find that the area above the test statistic of 1.22 is about 0.11. We double this area to include the area in the left tail, below z = −1.22. This gives us a p-value of approximately 0.22.

Our sample proportion was 0.02 above the population proportion from the null hypothesis. In a sample of size 500, we would observe a sample proportion 0.02 or more away from 0.84 about 22% of the time by chance alone.
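The two-tailed calculation for this example can be reproduced with the short sketch below. It is illustrative only, uses the standard-library `statistics.NormalDist` in place of the unit normal table, and its variable names are assumptions.

```python
import math
from statistics import NormalDist

n, p0, successes = 500, 0.84, 430
p_hat = successes / n  # observed sample proportion, 0.86

# Test statistic under the null hypothesis p = 0.84.
z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)

# Area in one tail beyond |z|, then doubled because the test is two-tailed.
one_tail = 1.0 - NormalDist().cdf(abs(z))
p_two_tailed = 2 * one_tail

print(f"z = {z:.2f}, one tail = {one_tail:.2f}, two-tailed p-value = {p_two_tailed:.2f}")
```

Taking the absolute value of z before finding the tail area means the same code works whether the sample proportion lands above or below the null value.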

Step 5: State a conclusion.

Again we compare the p-value to the level of significance, α = 0.05. In this case, the p-value of 0.22 is greater than 0.05, which means we do not have enough evidence to reject the null hypothesis. A sample result that could occur 22% of the time if the null hypothesis is true is not statistically significant. Now we can state the conclusion in terms of the alternative hypothesis.

The data from this study does not provide evidence that is strong enough to conclude that the proportion of all children ages 8 to 18 who have Internet access at home has changed since 2009. The 2% change observed in the data is not statistically significant. These results can be explained by predictable variation in random samples.

A Note about two-tailed tests

Since we are using a normal distribution, we can either double the computed p-value in one tail to give us the overall p-value in both tails and then compare it to our stated alpha value, or we can compare our computed p-value in one tail to the alpha level in one tail (alpha/2). In this case we could compare the one-tail p-value of 0.11 to the alpha level in that tail, 0.025, and we would find the same result: a failure to reject the null hypothesis.
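The two comparisons always give the same decision, because doubling one side of an inequality is equivalent to halving the other. A minimal sketch, with the hypothetical helper name `two_tailed_decisions`:

```python
def two_tailed_decisions(one_tail_area, alpha=0.05):
    """Return the reject/fail-to-reject decision computed both ways."""
    reject_by_doubling = (2 * one_tail_area) < alpha   # doubled p-value vs. alpha
    reject_by_halving = one_tail_area < (alpha / 2)    # one-tail area vs. alpha/2
    return reject_by_doubling, reject_by_halving


# Internet access example: one-tail area of about 0.11.
print(two_tailed_decisions(0.11))  # (False, False)

# Hypothetical one-tail area of 0.01, for contrast.
print(two_tailed_decisions(0.01))  # (True, True)
```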

A Note about the Conclusion

In the conclusion above, we did not have enough evidence to reject the null hypothesis. Notice that failing to reject the null hypothesis does not mean the null hypothesis is true or accepted. In the case of the previous example, it is possible that the proportion of children who have Internet access at home has changed. But the data we gathered did not provide the evidence to detect that the proportion had changed significantly.

Researchers often note improvements that could be made in their research and suggest follow-up research that might be done. In our example, a second sample with a larger sample size might provide the evidence needed to reject the null hypothesis.
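To illustrate why a larger sample might change the outcome, the sketch below recomputes the test statistic for several hypothetical sample sizes. It assumes, purely for illustration, that each larger sample would show the same sample proportion of 0.86; a real follow-up sample could produce a different proportion. The sample sizes and the helper name `z_for` are assumptions, not values from the original study.

```python
import math
from statistics import NormalDist

p0, p_hat, alpha = 0.84, 0.86, 0.05

# Two-tailed critical value: alpha is split evenly between the two tails.
z_critical = NormalDist().inv_cdf(1 - alpha / 2)


def z_for(n):
    """Test statistic if a sample of size n showed the same proportion, 0.86."""
    return (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)


for n in (500, 1000, 1500, 2000):  # hypothetical sample sizes
    z = z_for(n)
    print(n, round(z, 2), abs(z) > z_critical)
```

The standard error shrinks as n grows, so the same 0.02 difference corresponds to a larger z-score in a larger sample.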

The important thing to keep in mind is that at the end of a hypothesis test, we never say that the null hypothesis is true.

References:

  1. https://courses.lumenlearning.com/wmopen-concepts-statistics/chapter/hypothesis-test-for-a-population-proportion-2-of-3/

CC LICENSED CONTENT, SHARED PREVIOUSLY