ANOVA F Distribution

Here are some facts about the F distribution.

  1. The curve is not symmetrical but skewed to the right.

  2. There is a different curve for each pair of degrees of freedom (numerator and denominator).

  3. The F statistic is greater than or equal to zero.

  4. As the degrees of freedom for the numerator and for the denominator get larger, the curve approximates the normal.

  5. Other uses for the F distribution include comparing two variances and two-way analysis of variance. Two-way analysis of variance is beyond the scope of this chapter.
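Facts 1, 3, and 4 can be checked informally by simulation. The sketch below is only an illustration, not part of the original text: it assumes the standard construction of an F variable as the ratio of two independent chi-square variables, each divided by its degrees of freedom, and the helper name simulate_f, the seed, and the sample size are arbitrary choices.

```python
import random
import statistics

def simulate_f(d1, d2, draws, rng):
    """Draw F values as a ratio of two scaled chi-square variables."""
    values = []
    for _ in range(draws):
        # A chi-square variable with d degrees of freedom is Gamma(shape=d/2, scale=2).
        numerator = rng.gammavariate(d1 / 2, 2) / d1
        denominator = rng.gammavariate(d2 / 2, 2) / d2
        values.append(numerator / denominator)
    return values

rng = random.Random(0)  # arbitrary seed so the run is repeatable
small_df = simulate_f(4, 10, 20000, rng)
large_df = simulate_f(100, 100, 20000, rng)

for label, sample in (("F(4, 10)", small_df), ("F(100, 100)", large_df)):
    # For a right-skewed distribution the mean sits above the median.
    gap = statistics.mean(sample) - statistics.median(sample)
    print(label, "min:", round(min(sample), 4), "mean - median:", round(gap, 3))
```

Every simulated value is nonnegative, and the gap between mean and median (a rough sign of right skew) should shrink noticeably when both degrees of freedom are large.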


____________________________________________________________

EXAMPLE

MRSA, or methicillin-resistant Staphylococcus aureus, can cause serious bacterial infections in hospital patients. This table and the graph show various colony counts from different patients who may or may not have MRSA.

Test whether the mean numbers of colonies are the same or different. Construct the ANOVA table (by hand or by using a TI-83, 83+, or 84+ calculator), find the p-value, and state your conclusion. Use a 5% significance level.

While there are differences in the spreads between the groups, the differences do not appear to be big enough to cause concern.

We test for the equality of mean number of colonies:

H0: μ1 = μ2 = μ3 = μ4 = μ5

Ha: μi ≠ μj for some i ≠ j

The one-way ANOVA table results are shown below.



Distribution for the test: F4,10 (an F distribution with 4 numerator and 10 denominator degrees of freedom)

Probability statement: p-value = P(F > 0.6099) = 0.6649.
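The right-tail area in the probability statement can be checked without a calculator. The following is a minimal sketch, not the method used in the text: it integrates the standard F density numerically with the midpoint rule, and the names f_pdf and f_upper_tail and the step count are illustrative assumptions. The midpoint rule is a convenience choice that suits numerator degrees of freedom of 2 or more.

```python
import math

def f_pdf(x, d1, d2):
    """Density of the F distribution with d1 and d2 degrees of freedom, for x > 0."""
    log_beta = (math.lgamma(d1 / 2) + math.lgamma(d2 / 2)
                - math.lgamma((d1 + d2) / 2))
    log_pdf = ((d1 / 2) * math.log(d1 / d2)
               + (d1 / 2 - 1) * math.log(x)
               - ((d1 + d2) / 2) * math.log(1 + d1 * x / d2)
               - log_beta)
    return math.exp(log_pdf)

def f_upper_tail(f_stat, d1, d2, steps=20000):
    """Approximate P(F > f_stat) as 1 minus the midpoint-rule area on [0, f_stat]."""
    width = f_stat / steps
    lower_area = width * sum(f_pdf((k + 0.5) * width, d1, d2)
                             for k in range(steps))
    return 1.0 - lower_area

# Values taken from the example: F = 0.6099 with 4 and 10 degrees of freedom.
p_value = f_upper_tail(0.6099, 4, 10)
print("p-value is about", round(p_value, 3))
print("reject H0 at the 5% level:", p_value < 0.05)
```

The result should agree with the p-value in the probability statement to about three decimal places, and because it is far above 0.05 the comparison prints False.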

Compare α and the p-value: α = 0.05, p-value = 0.6649, α < p-value

Make a decision: Since α < p-value, we do not reject H0.

Conclusion: At the 5% significance level, there is insufficient evidence from these data that different levels of tryptone will cause a significant difference in the mean number of bacterial colonies formed.

____________________________________________________________


Concept Review

The graph of the F distribution is always positive and skewed right, though the shape can be mounded or exponential depending on the combination of numerator and denominator degrees of freedom. The F statistic is the ratio of a measure of the variation in the group means to a similar measure of the variation within the groups. If the null hypothesis is correct, then the numerator should be small compared to the denominator. A small F statistic will result, and the area under the F curve to the right will be large, representing a large p-value. When the null hypothesis of equal group means is incorrect, then the numerator should be large compared to the denominator, giving a large F statistic and a small area (small p-value) to the right of the statistic under the F curve.
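To make the ratio concrete, the sketch below computes the F statistic as the mean square between groups divided by the mean square within groups. It is an illustration under stated assumptions: the function name one_way_f is hypothetical, and both data sets are made up for this purpose (they are not the colony counts from the example).

```python
def one_way_f(groups):
    """Return (F, df_between, df_within) for a list of samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    # Variation of the observations around their own group mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, means) for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Made-up data: group means close together versus far apart.
similar_means = [[1, 2, 3], [2, 3, 4], [1, 3, 5]]
different_means = [[1, 2, 3], [2, 3, 4], [6, 7, 8]]

f_similar, df1, df2 = one_way_f(similar_means)
f_different, _, _ = one_way_f(different_means)
print("similar means:   F =", round(f_similar, 3), "with df", df1, "and", df2)
print("different means: F =", round(f_different, 3))
```

With these made-up numbers the first data set gives a small F (the numerator is small relative to the denominator), while the second gives a much larger F.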

When the data have unequal group sizes (unbalanced data), more general techniques based on sums of squares need to be used for hand calculations. In the case of balanced data (the groups are the same size), however, simplified calculations based on group means and variances may be used. In practice, of course, software is usually employed in the analysis. As in any analysis, graphs of various sorts should be used in conjunction with numerical techniques. Always look at your data!
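For balanced data, one common form of the simplified calculation is F = n × (sample variance of the group means) / (average of the group sample variances), where n is the common group size. The sketch below assumes that form; the function name balanced_f and the data are hypothetical and are not taken from the example.

```python
import statistics

def balanced_f(groups):
    """F statistic for equal-sized groups from group means and variances."""
    sizes = {len(g) for g in groups}
    if len(sizes) != 1:
        raise ValueError("shortcut applies only when all groups are the same size")
    n = sizes.pop()

    group_means = [statistics.mean(g) for g in groups]
    group_variances = [statistics.variance(g) for g in groups]  # sample variances

    # Numerator: n times the sample variance of the group means.
    ms_between = n * statistics.variance(group_means)
    # Denominator: pooled variance, a plain average when sizes are equal.
    ms_within = statistics.mean(group_variances)
    return ms_between / ms_within

# Made-up balanced data: three groups of three observations each.
data = [[4, 6, 8], [5, 7, 9], [9, 11, 13]]
f_stat = balanced_f(data)
print("F =", f_stat, "with df", len(data) - 1, "and", sum(map(len, data)) - len(data))
```

Here the group means are 6, 7, and 11 (variance 7) and each group variance is 4, so F = 3 × 7 / 4 = 5.25. Passing groups of unequal size raises an error instead of returning a misleading value.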

References:

  1. https://courses.lumenlearning.com/introstats1/chapter/facts-about-the-f-distribution/
