NHST for sample means of a continuous random variable with p-values

Full Hypothesis Test Examples


  • In a hypothesis test problem, you may see words such as “the level of significance is 1%.” The “1%” is the preconceived or preset α.

  • The statistician setting up the hypothesis test selects the value of α to use before collecting the sample data (step 2).

  • If no level of significance is given, a common standard to use is α = 0.05.

  • When you calculate the p-value and draw the picture, the p-value is the area in the left tail, the right tail, or split evenly between the two tails. For this reason, we call the hypothesis test left-, right-, or two-tailed.

  • The alternative hypothesis, Ha, tells you if the test is left, right, or two-tailed. It is the key to conducting the appropriate test.

  • Ha never has a symbol that contains an equal sign.

  • Thinking about the meaning of the p-value: A data analyst (or anyone else) who rejects the null hypothesis should have more confidence in that decision with a smaller p-value (for example, 0.001 as opposed to 0.04), even when using α = 0.05. Similarly, an analyst who does not reject the null hypothesis should have more confidence in that decision with a large p-value such as 0.4 than with a p-value of 0.056 (both exceed α = 0.05). Treating the p-value this way requires the analyst to use judgment rather than mindlessly apply rules.
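The link between the tail named by Ha and the area that counts as the p-value can be sketched in a few lines of Python. This is only an illustration, assuming the test statistic is a z-score; the function name p_value_from_z and the example value z = -1.5 are made up for this sketch and do not come from the examples in this section.

```python
from statistics import NormalDist

def p_value_from_z(z, tail):
    """Return the p-value for a z statistic; tail is 'left', 'right', or 'two'."""
    std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1
    if tail == "left":         # Ha uses <
        return std_normal.cdf(z)
    if tail == "right":        # Ha uses >
        return 1 - std_normal.cdf(z)
    if tail == "two":          # Ha uses ≠: area split evenly between both tails
        return 2 * std_normal.cdf(-abs(z))
    raise ValueError("tail must be 'left', 'right', or 'two'")

# The same z-score gives three different p-values depending on Ha.
for tail in ("left", "right", "two"):
    print(tail, round(p_value_from_z(-1.5, tail), 4))
```

The same z-score produces a different p-value for each kind of test, which is why reading Ha correctly comes first.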

The following examples illustrate a left-, right-, and two-tailed test.

__________________________________________

EXAMPLE 1

H0: μ ≥ 5, Ha: μ < 5

This is a test of a single population mean. Ha tells you the test is left-tailed. The p-value is the area shaded in the left tail of the distribution.


__________________________________________

EXAMPLE 2

H0: p ≤ 0.2  Ha: p > 0.2

This is a test of a single population proportion. Ha tells you the test is right-tailed. The p-value is the area shaded in the right tail of the distribution.


__________________________________________

EXAMPLE 3

H0: μ = 50  Ha: μ ≠ 50

This is a test of a single population mean. Ha tells you the test is two-tailed. The p-value is the shaded area split evenly between the two tails of the distribution.


__________________________________________

Full Hypothesis Test Example

Jeffrey, as an eight-year-old, established a mean time of 16.43 seconds for swimming the 25-yard freestyle, with a standard deviation of 0.8 seconds. His dad, Frank, thought that Jeffrey could swim the 25-yard freestyle faster using goggles. Frank bought Jeffrey a new pair of expensive goggles and timed Jeffrey for 15 25-yard freestyle swims. For the 15 swims, Jeffrey’s mean time was 16 seconds. Frank thought that the goggles helped Jeffrey to swim faster than the 16.43 seconds. Conduct a hypothesis test using a preset α = 0.05. Assume that the swim times for the 25-yard freestyle are normal.

To start, let's set up the Hypothesis Test:

Since the problem is about a mean, this is a test of a single population mean.

Step 1: State the hypotheses

H0: μ ≥ 16.43  

There is no effect of the new goggles on Jeffrey's swimming time.

Ha: μ < 16.43

There is an effect of the new goggles on Jeffrey's swimming time: the new goggles help Jeffrey swim faster.

Step 2: State the decision criterion

Our decision criterion (alpha) will be .05.

The “<” tells you this is a one-tailed test (it is a left-tailed test).

Step 3: Collect the data

Jeffrey swam the 25-yard freestyle with the new goggles 15 times and had a mean time of 16 seconds.

Step 4: Compute the test statistic

Calculate the p-value using the normal distribution for a mean of 16.

First, compute the z-score for a mean of 16 with a sample of 15 swims:

z = (x̄ - μ) / (σ / √n) = (16 - 16.43) / (0.8 / √15) ≈ -2.08

Then find the p-value for a z-score of -2.08 from the unit normal table. We want to know how likely a mean of 16 or less (because our alternative hypothesis stated μ < 16.43) would be under the null distribution, so we are looking for the probability of observing a z-score of -2.08 or less. This is the area in the left tail of the standard normal curve, below z = -2.08.

The p-value for a z-score of -2.08 is 0.0187.

This means that if the goggles had no effect on Jeffrey's swimming, there would be only a 1.87% chance of his averaging 16 seconds or less over 15 swims of the 25-yard freestyle. Because a 1.87% chance is small, a mean time of 16 seconds or less is unlikely to have occurred by chance alone. It is a rare event.
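The Step 4 arithmetic can be checked with a short Python sketch. It assumes, as the problem states, that σ = 0.8 is known and swim times are normal, and it uses NormalDist from Python's standard library in place of the unit normal table. The variable names are chosen only for this illustration.

```python
from math import sqrt
from statistics import NormalDist

mu_0 = 16.43        # mean swim time under H0 (seconds)
sigma = 0.8         # known standard deviation (seconds)
n = 15              # number of timed swims
sample_mean = 16.0  # observed mean with the goggles (seconds)

# Standard error of the mean, then the z-score of the observed mean.
standard_error = sigma / sqrt(n)
z = (sample_mean - mu_0) / standard_error

# Left-tailed test: the p-value is the area at or below z.
p_value = NormalDist().cdf(z)

print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```

Rounded to the precision used in the text, this gives z = -2.08 and a p-value of 0.0187.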

Step 5: Make your conclusions and interpretations

Compare α and the p-value:

α = 0.05, p-value = 0.0187, so α > p-value

Since our computed p-value of 0.0187 is less than our stated alpha of .05, our statistic is in our critical region and we reject the null hypothesis. At the 5% significance level, we conclude that the goggles have a significant effect: Jeffrey swims faster using the new goggles. The sample data show there is sufficient evidence that Jeffrey’s mean time to swim the 25-yard freestyle is less than 16.43 seconds.
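Saying that the p-value is less than α and saying that the statistic falls in the critical region are two views of the same decision. The sketch below illustrates this for the swimming example, assuming a left-tailed z-test; the cutoff z_critical is computed with NormalDist.inv_cdf rather than read from a table, and the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

alpha = 0.05
z = (16.0 - 16.43) / (0.8 / sqrt(15))  # z-score from Step 4

# View 1: compare the p-value (left-tail area) with alpha.
p_value = NormalDist().cdf(z)
reject_by_p_value = p_value < alpha

# View 2: compare z with the cutoff that leaves area alpha in the left tail.
z_critical = NormalDist().inv_cdf(alpha)
reject_by_critical_region = z < z_critical

print(f"z = {z:.2f}, critical z = {z_critical:.3f}")
print(reject_by_p_value, reject_by_critical_region)
```

The left-tail cutoff for α = 0.05 is about -1.645, and both comparisons lead to rejecting H0 here.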

The Type I and Type II errors for this problem are as follows:

The Type I error is to conclude that Jeffrey swims the 25-yard freestyle, on average, in less than 16.43 seconds when, in fact, he actually swims the 25-yard freestyle, on average, in 16.43 seconds. (Reject the null hypothesis when the null hypothesis is true.)

The Type II error is to find insufficient evidence that Jeffrey swims the 25-yard freestyle, on average, in less than 16.43 seconds when, in fact, he does swim the 25-yard freestyle, on average, in less than 16.43 seconds. (Do not reject the null hypothesis when the null hypothesis is false.)
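One way to see what a Type I error means is to simulate many sets of 15 swims in a world where the null hypothesis is exactly true (μ = 16.43) and count how often the test rejects anyway. The simulation below is an illustrative sketch, not part of the original example: the seed, the number of trials, and the variable names are arbitrary choices.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

random.seed(1)  # arbitrary seed so the run is repeatable

mu_0, sigma, n, alpha = 16.43, 0.8, 15, 0.05
trials = 5000
std_normal = NormalDist()

false_rejections = 0
for _ in range(trials):
    # Simulate 15 swims when the goggles truly have no effect.
    swims = [random.gauss(mu_0, sigma) for _ in range(n)]
    z = (fmean(swims) - mu_0) / (sigma / sqrt(n))
    # Left-tailed test: reject H0 when the p-value is below alpha.
    if std_normal.cdf(z) < alpha:
        false_rejections += 1

type_1_rate = false_rejections / trials
print(f"Share of true-null samples rejected: {type_1_rate:.3f}")
```

The share of false rejections should land close to α, because α is the Type I error rate the analyst agreed to accept in step 2.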

Null Hypothesis Significance Test

Let's do another example. A college football coach thought that his players could bench press a mean weight of 275 pounds. It is known that the standard deviation is 55 pounds. Three of his players thought that the mean weight was more than that amount. They asked 30 of their teammates for their estimated maximum lift on the bench press exercise. The data ranged from 205 pounds to 385 pounds. The actual different weights were (frequencies are in parentheses) 205(3) 215(3) 225(1) 241(2) 252(2) 265(2) 275(2) 313(2) 316(5) 338(2) 341(1) 345(2) 368(2) 385(1).

Conduct a null hypothesis significance test.

To start, we again begin by setting up the Hypothesis Test:

Since the problem is about a mean, this is a test of a single population mean.

Step 1: State the hypotheses

H0: μ ≤ 275

The college football players do not differ in their bench press ability from the coach's expected population distribution.

Ha: μ > 275

The college football players are stronger in their bench press ability than the coach's expected population distribution.

Step 2: State the decision criterion

Our decision criterion (alpha) will be .05.

The “>” tells you this is a one-tailed test (it is a right-tailed test).

Step 3: Collect the data

They asked 30 of their teammates for their estimated maximum lift on the bench press exercise. The data ranged from 205 pounds to 385 pounds. The actual different weights were (frequencies are in parentheses) 205(3) 215(3) 225(1) 241(2) 252(2) 265(2) 275(2) 313(2) 316(5) 338(2) 341(1) 345(2) 368(2) 385(1). This gives us a mean for n = 30 of 286.2.
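The sample mean comes from weighting each reported lift by its frequency. A small Python sketch of that calculation follows; the dictionary name lift_counts is chosen only for this illustration.

```python
# Reported maximum lift (pounds) mapped to the number of players reporting it.
lift_counts = {
    205: 3, 215: 3, 225: 1, 241: 2, 252: 2, 265: 2, 275: 2,
    313: 2, 316: 5, 338: 2, 341: 1, 345: 2, 368: 2, 385: 1,
}

n = sum(lift_counts.values())  # total number of players
total_weight = sum(weight * count for weight, count in lift_counts.items())
sample_mean = total_weight / n

print(f"n = {n}, total = {total_weight}, mean = {sample_mean:.1f}")
```

The frequencies add up to 30 players and 8585 pounds in total, so the mean rounds to 286.2.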

Step 4: Compute the test statistic

Calculate the p-value using the normal distribution for a mean of 286.2 with 30 observations.

Compute the z-score for a mean of 286.2 with a sample of 30 football players:

z = (x̄ - μ) / (σ / √n) = (286.2 - 275) / (55 / √30) ≈ 1.12

Then find the p-value for a z-score of 1.12 from the unit normal table. We want to know how likely a mean of 286.2 or greater (because our alternative hypothesis stated μ > 275) would be under the null distribution, so we are looking for the probability of observing a z-score of 1.12 or greater. This is the area in the right tail of the standard normal curve, above z = 1.12.

The p-value for a z-score of 1.12 is 0.1323.

This means that if H0 is true, there is a 0.1323 probability (13.23%) that a sample of 30 players would report a mean lift of 286.2 pounds or more. Because a 13.23% chance is not small, a mean lift of 286.2 pounds or more is not a rare event.
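As with the swimming example, the Step 4 numbers can be reproduced in Python. This sketch assumes the known σ of 55 pounds and the rounded sample mean of 286.2, and uses NormalDist in place of the unit normal table; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu_0 = 275           # mean lift under H0 (pounds)
sigma = 55           # known standard deviation (pounds)
n = 30               # number of players surveyed
sample_mean = 286.2  # rounded sample mean (pounds)

z = (sample_mean - mu_0) / (sigma / sqrt(n))

# Right-tailed test: the p-value is the area at or above z.
p_value = 1 - NormalDist().cdf(z)

print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```

The result rounds to z = 1.12 and a p-value of 0.1323. Using the unrounded mean 8585/30 instead of 286.2 shifts the p-value slightly, to roughly 0.133, which does not change the decision.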

Step 5: Make your conclusions and interpretations

Compare α and the p-value:

α = 0.05, p-value = 0.1323, so α < p-value

Since our computed p-value of 0.1323 is greater than our stated alpha of .05, our statistic is not in our critical region and we fail to reject the null hypothesis. At the 5% significance level, there is not sufficient evidence to conclude that the football players can bench press more than 275 pounds on average.

Concept Review for NHST with z-scores

The hypothesis test itself has an established process. This can be summarized as follows:

Step 1: Determine H0 and Ha. Remember, they are contradictory, mutually exclusive and exhaustive.

This involves identifying the best distribution for your research: Is your outcome variable (your dependent or criterion variable) continuous and normally distributed? Is it discrete and binomial? And so on.

This tells you whether your hypotheses are about the mean of a continuous variable, a proportion from a discrete distribution, or some other parameter.

Step 2: State your decision criterion and identify if you have a left, right, or two-tailed test.

Alpha is .05 by default when no other level is given.

Based on your alternative hypothesis, determine if you have a left (<), right (>), or two-tailed test (≠).

Step 3: Collect, and identify, your data.

Step 4: Compute your test statistic.

Compute the p-value: What is the probability of observing data at least as extreme as yours (step 3) if the null hypothesis were true (step 1)?

For a normal distribution we use our z-scores for sample means and then use the unit normal table to find the p-value.

Step 5: Make your conclusions.

Compare α and the p-value. If the p-value is less than α, we reject the null hypothesis and find support for the alternative hypothesis. If the p-value is greater than α, we fail to reject the null hypothesis and do not find support for the alternative hypothesis. Interpret the result in terms of your stated hypotheses and variables.
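The five steps can be gathered into one small helper. The function below is a sketch under the assumptions used throughout this section (a known population standard deviation and a normally distributed sample mean); the name one_sample_z_test and its parameters are hypothetical and do not come from any library.

```python
from math import sqrt
from statistics import NormalDist

def one_sample_z_test(sample_mean, mu_0, sigma, n, tail, alpha=0.05):
    """Return (z, p_value, reject_h0) for a one-sample z-test of a mean."""
    # Step 4: z-score of the sample mean under H0.
    z = (sample_mean - mu_0) / (sigma / sqrt(n))
    left_area = NormalDist().cdf(z)
    # Step 2 decided the tail; it selects which area is the p-value.
    if tail == "left":
        p_value = left_area
    elif tail == "right":
        p_value = 1 - left_area
    elif tail == "two":
        p_value = 2 * min(left_area, 1 - left_area)
    else:
        raise ValueError("tail must be 'left', 'right', or 'two'")
    # Step 5: reject H0 only when the p-value is below alpha.
    return z, p_value, p_value < alpha

swim_result = one_sample_z_test(16.0, 16.43, 0.8, 15, tail="left")
lift_result = one_sample_z_test(286.2, 275, 55, 30, tail="right")
print(swim_result)
print(lift_result)
```

Run on the two worked examples, the helper rejects H0 for the swimming data and fails to reject H0 for the bench press data, matching the conclusions reached by hand.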

References:

  1. https://courses.lumenlearning.com/introstats1/chapter/additional-information-and-full-hypothesis-test-examples/
