Hypothesis test for a population proportion

In the previous section, we introduced the concept of null hypothesis significance testing.

Although we will follow the five NHST steps we examined in the previous section, the specifics for each research design and distribution are different. In this section, we look at the hypothesis test for a single population proportion. When we conduct a test about a population proportion, we are working with a categorical variable, and often with the binomial distribution. Later in the course, after we have learned a variety of hypothesis tests, we will need to be able to identify which test is appropriate for which situation. Identifying the variable as continuous or discrete is an important component of choosing an appropriate hypothesis test.


__________________________________________

EXAMPLE 1

The following is an example of a question about a population proportion.

  1. According to the College Board, 62% of students graduating from a community college with an associate degree in 2007-2008 had no student loan debt. Has this figure increased since then?

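These would not be good questions for a hypothesis test about a single population proportion. The first compares two proportions, the second is about a mean, and the third asks for an estimate rather than testing a claim.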

  1. Is there a difference in the proportion of males who have student loan debt and the proportion of females who have student loan debt when they graduate from community college?

  2. According to the National Postsecondary Student Aid Study, conducted by the National Center for Education Statistics, the average student loan debt for community college students who received an associate degree was $5,879. Has the average student loan debt for these students increased since then?

  3. What proportion of students at Naugatuck Valley Community College commute more than 5 miles each way to campus?

__________________________________________

Once we know that we are dealing with a single population proportion, we can conduct the hypothesis test. Recall that the first step of a hypothesis test is to determine the hypotheses. In the previous section, our hypotheses were in words. In this section, we use symbols. Recall that the symbol for the population proportion is p. See below for some examples.

__________________________________________

EXAMPLE 2

According to the Government Accountability Office, 80% of all college students ages 18 to 23 had health insurance coverage in 2006. The Patient Protection and Affordable Care Act passed in 2010 allowed young people under age 26 to stay on their parents’ health insurance policy. Has the proportion of college students ages 18 to 23 who have health insurance increased since 2006? A survey of 800 randomly selected college students ages 18 to 23 indicated that 83% of them had health insurance coverage.

  • H0: p ≤ 0.80 (No change; the proportion of college students ages 18 to 23 who have health insurance is still 80%.)

  • Ha: p > 0.80 (The proportion of college students ages 18 to 23 who have health insurance is now greater than 80%.)

The results of the survey do not affect our hypotheses. We use the results to determine whether to reject the null hypothesis in favor of the alternative hypothesis.
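Looking ahead to step 4 of the procedure described later in this section, the survey result can be turned into a probability. The following is a minimal Python sketch, not part of the original course materials. It assumes that 83% of 800 corresponds to exactly 664 students (the example gives only the rounded percentage) and that an exact binomial model applies. The function names binom_pmf and upper_tail are illustrative.

```python
from math import exp, lgamma, log


def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p), with 0 < p < 1.

    Computed on the log scale so the binomial coefficient
    cannot overflow for large n.
    """
    log_coef = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return exp(log_coef + k * log(p) + (n - k) * log(1 - p))


def upper_tail(k_obs, n, p0):
    """P(X >= k_obs) when p0 is the true proportion."""
    return sum(binom_pmf(k, n, p0) for k in range(k_obs, n + 1))


n = 800       # sample size from the example
k_obs = 664   # assumption: 83% of 800, taken as an exact count
p0 = 0.80     # boundary value of the null hypothesis

p_value = upper_tail(k_obs, n, p0)
print(f"P(X >= {k_obs} | p = {p0}) = {p_value:.4f}")
```

The upper tail is used because the alternative hypothesis is p > 0.80. The calculation uses p = 0.80, the boundary of the null hypothesis.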

__________________________________________

EXAMPLE 3

According to the Kaiser Family Foundation, 84% of U.S. children ages 8 to 18 had Internet access at home as of August 2009. Researchers wonder if this percentage has changed since then. They survey 500 randomly selected children ages 8 to 18 and find that 430 of them have Internet access at home. The research question helps us form our hypotheses:

  • H0: p = 0.84 (No change; the proportion of children with Internet access at home is the same.)

  • Ha: p ≠ 0.84 (The proportion of children with Internet access at home has changed since 2009.)

Again, the results of the survey do not affect our hypotheses.
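Because the alternative hypothesis here is two-sided (p ≠ 0.84), the probability calculation has to consider both tails. The sketch below is one hedged way to do this in Python. It assumes an exact binomial model and uses the convention of doubling the smaller tail probability, which is only one of several conventions for two-sided binomial tests. The function names are illustrative.

```python
from math import exp, lgamma, log


def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p), with 0 < p < 1 (log-scale computation)."""
    log_coef = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return exp(log_coef + k * log(p) + (n - k) * log(1 - p))


def two_sided_p_value(k_obs, n, p0):
    """Double the smaller tail probability, capped at 1.

    Assumption: this doubling rule is one common convention,
    not necessarily the one used elsewhere in the course.
    """
    lower = sum(binom_pmf(k, n, p0) for k in range(0, k_obs + 1))
    upper = sum(binom_pmf(k, n, p0) for k in range(k_obs, n + 1))
    return min(1.0, 2 * min(lower, upper))


# Values from the example: 430 of 500 children, null value 0.84
p_value = two_sided_p_value(430, 500, 0.84)
print(f"sample proportion = {430 / 500:.2f}, two-sided p-value = {p_value:.4f}")
```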

__________________________________________

EXAMPLE 4

Jefferson Parish is a suburb of New Orleans, Louisiana. Its population is about 23% African American. Is there evidence that African Americans are underrepresented on juries in murder trials in Jefferson Parish? According to a New York Times article (June 4, 2007), there were 18 murder trials in Jefferson Parish between 1986 and 2007 in which the ethnicity of the jurors was known. Ten of the juries had no black jurors, 7 juries had 1 black juror, and 1 jury had 2 black jurors. The research question helps us to form our hypotheses:

  • H0: p ≥ 0.23 (No difference; the proportion of African Americans on juries in murder trials is the same as the proportion of African Americans in the population.)

  • Ha: p < 0.23 (The proportion of African Americans on juries in murder trials is less than the proportion of African Americans in the population.)
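For this one-sided alternative (p < 0.23), the relevant probability is in the lower tail. The following Python sketch shows how the reported jury counts could be turned into a binomial calculation. It rests on two assumptions that the example does not state: that each jury had 12 jurors, and that jurors can be treated as independent draws from the parish population.

```python
from math import comb

juries = 18
jurors_per_jury = 12              # assumption: jury size is not given in the example
n = juries * jurors_per_jury      # total juror seats under that assumption
k_obs = 10 * 0 + 7 * 1 + 1 * 2    # black jurors, from the reported jury counts
p0 = 0.23                         # boundary value of the null hypothesis

# P(X <= k_obs) for X ~ Binomial(n, p0)
p_value = sum(comb(n, k) * p0**k * (1 - p0) ** (n - k) for k in range(k_obs + 1))
print(f"{k_obs} of {n} jurors; P(X <= {k_obs} | p = {p0}) = {p_value:.3e}")
```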

__________________________________________

Summary of Hypotheses

As a reminder, the null hypothesis is always a statement of equality (= or ≤ or ≥). The alternative hypothesis is always a statement of inequality (≠ or > or <). So hypotheses take the form:

  • H0: p ≥ p0 or p ≤ p0 or p = p0

  • Ha: p < p0 or p > p0 or p ≠ p0

We use p0 to represent the proportion from the null hypothesis.

NHST with proportions with the Binomial Distribution

Step 1: Determine the hypotheses for the proportion.

The hypotheses come from the research question and are stated in terms of population parameters. If we are interested in assessing proportions, then the hypotheses should be stated as detailed in this section.

  • H0: p ≥ p0 or p ≤ p0 or p = p0

  • Ha: p < p0 or p > p0 or p ≠ p0

Step 2: State your decision criterion (α).

Because the hypothesis test is based on probability, we need to state the acceptable probability of a type I error (rejecting a null hypothesis that is actually true). By tradition, this is usually set to 5% (α = 0.05). This decision criterion is the threshold against which we compare the p-value from step 4 in order to decide whether to reject or fail to reject the null hypothesis (step 5).

Step 3: Collect the data.

Ideally, we ethically select a random sample from the population. The data come from this sample, and here we will be collecting data regarding proportions. Often this involves looking at your sample and assessing whether each person possesses some characteristic of interest (success) or not (failure), in order to find what proportion of the sample possesses that characteristic.
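As a small illustration of this step, the sample proportion is the number of successes divided by the sample size. The Python sketch below uses made-up responses purely for demonstration; the data are hypothetical and do not come from any study mentioned in this section.

```python
# Hypothetical responses: True = success, False = failure
responses = [True, False, True, True, False, True, True, False, True, True]

n = len(responses)           # sample size
successes = sum(responses)   # each True counts as 1
p_hat = successes / n        # sample proportion

print(f"{successes} successes out of {n}: sample proportion = {p_hat:.2f}")
```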

Step 4: Assess the evidence by computing statistic(s).

Assume that the null hypothesis is true. Could the data come from the population described by the null hypothesis? Use simulation or a mathematical model to examine the results from random samples selected from the population described by the null hypothesis. Figure out if results similar to the data are likely or unlikely. Note that the wording “likely or unlikely” implies that this step requires some kind of probability calculation. This probability will be your computed p-value for the data you observed in step 3: the probability of getting results at least as extreme as yours if the null hypothesis were true. When we are interested in proportions of successes, we compute it as a binomial probability. Please review chapter 10-E for how to compute binomial probabilities.
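The simulation route mentioned in this step can be sketched in a few lines of Python. The idea is to draw many random samples from the population described by the null hypothesis and count how often the result is at least as large as the observed one. This is a rough illustration for an upper-tailed test only. The function name, the number of repetitions, and the seed are arbitrary choices, and the values plugged in (664 successes out of 800, null value 0.80) assume that the 83% in Example 2 corresponds to exactly 664 students.

```python
import random


def simulate_upper_p_value(k_obs, n, p0, reps=2000, seed=1):
    """Estimate P(X >= k_obs) from `reps` simulated samples of size n,
    each drawn with success probability p0 (the null value)."""
    rng = random.Random(seed)   # fixed seed so the run is repeatable
    hits = 0
    for _ in range(reps):
        successes = sum(rng.random() < p0 for _ in range(n))
        if successes >= k_obs:
            hits += 1
    return hits / reps


estimate = simulate_upper_p_value(k_obs=664, n=800, p0=0.80)
print(f"estimated p-value = {estimate:.4f}")
```

Because the estimate comes from random samples, it will differ slightly from an exact binomial probability and will change with the seed and the number of repetitions.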

Step 5: State a conclusion.

We use what we find in the previous step to make a decision. This step requires us to think in the following way. Remember that we assume that the null hypothesis is true. Then one of two outcomes can occur:

  • One possibility is that results similar to the actual sample are extremely unlikely. This means that the data do not fit in with results from random samples selected from the population described by the null hypothesis. In this case, it is unlikely that the data came from this population, so we view this as strong evidence against the null hypothesis. Technically, if our computed p-value from step 4 is less than our stated alpha value from step 2, then we reject the null hypothesis in favor of the alternative hypothesis.

  • The other possibility is that results similar to the actual sample are fairly likely (not unusual). This means that the data fit in with typical results from random samples selected from the population described by the null hypothesis. Technically, if our computed p-value from step 4 is greater than our stated alpha value in step 2, then we fail to reject the null hypothesis. In this case, we do not have evidence against the null hypothesis, so we cannot reject it in favor of the alternative hypothesis.
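The decision rule in the two bullets above can be written as a short Python function. This is only a sketch of the comparison described in this step. The function name is illustrative, and treating a p-value exactly equal to alpha as "fail to reject" is an assumption based on the "less than" wording above.

```python
def decide(p_value, alpha=0.05):
    """Reject H0 only when the p-value is strictly less than alpha."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must be between 0 and 1")
    return "reject H0" if p_value < alpha else "fail to reject H0"


print(decide(0.01))   # reject H0
print(decide(0.30))   # fail to reject H0
```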


References:

  1. https://courses.lumenlearning.com/wmopen-concepts-statistics/chapter/hypothesis-test-for-a-population-proportion-1-of-3/

CC LICENSED CONTENT, SHARED PREVIOUSLY