Comparing two individuals is fairly straightforward. The question "Which dog is taller?" can be answered by measuring the heights of two dogs and comparing them directly. Comparing two groups can be more challenging. What does it mean for the basketball team to generally be taller than the soccer team?
To compare two groups, we use the distribution of values for the two groups. Most importantly, a measure of center (usually mean or median) and its associated measure of variability (usually mean absolute deviation or interquartile range) can help determine the differences between groups.
For example, if the average height of the pugs in a dog show is 11 inches, and the average height of the beagles in the dog show is 15 inches, it seems that the beagles are generally taller. On the other hand, if the MAD for each breed is 3 inches, it would not be unreasonable to find a beagle that is 11 inches tall or a pug that is 14 inches tall. Because the two groups overlap this much, the heights of the two dog breeds may not be very different from one another.
interquartile range (IQR)
The interquartile range is one way to measure how spread out a data set is. We sometimes call this the IQR. To find the interquartile range, we subtract the first quartile from the third quartile. For example, if the first quartile of a data set is 30 and the third quartile is 50, the IQR is 20 because 50 - 30 = 20.
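The subtraction of quartiles described above can be sketched in Python. This is only an illustration: the data set is hypothetical, chosen so that the first quartile is 30 and the third quartile is 50, and the function names are made up for this example. It also assumes one common classroom convention, where each quartile is the median of one half of the ordered data and the middle value is left out when the count is odd. Other conventions exist and can give slightly different quartiles.

```python
def median(values):
    """Middle value of the ordered data (average of the two middle values if even)."""
    ordered = sorted(values)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]
    return (ordered[middle - 1] + ordered[middle]) / 2


def interquartile_range(values):
    """Third quartile minus first quartile, using the half-median convention (an assumption)."""
    ordered = sorted(values)
    half = len(ordered) // 2
    lower_half = ordered[:half]                  # values before the middle
    upper_half = ordered[len(ordered) - half:]   # values after the middle
    first_quartile = median(lower_half)
    third_quartile = median(upper_half)
    return third_quartile - first_quartile


# Hypothetical data set: first quartile is 30, third quartile is 50.
data = [20, 30, 35, 40, 45, 50, 60]
print(interquartile_range(data))  # 50 - 30 = 20
```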
mean
The mean is one way to measure the center of a data set. We can think of it as a balance point. For example, for the data set 7, 9, 12, 13, 14, the mean is 11.
To find the mean, add up all the numbers in the data set. Then, divide by how many numbers there are. 7 + 9 + 12 + 13 + 14 = 55 and 55 divided by 5 = 11.
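The same two steps, adding and then dividing, can be written as a short Python sketch. The function name is chosen here for illustration only.

```python
def mean(values):
    """Add up all the values, then divide by how many there are."""
    return sum(values) / len(values)


data = [7, 9, 12, 13, 14]
print(sum(data))   # 55
print(mean(data))  # 55 / 5 = 11.0
```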
mean absolute deviation (MAD)
The mean absolute deviation is one way to measure how spread out a data set is. Sometimes we call this the MAD. For example, suppose the data set 7, 9, 12, 13, 14 lists travel times in minutes. The MAD is 2.4. This tells us that these travel times are typically 2.4 minutes away from the mean, which is 11.
To find the MAD, add up the distances between each data point and the mean. Then, divide by how many numbers there are. 4 + 2 + 1 + 2 + 3 = 12 and 12 divided by 5 = 2.4.
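As a rough sketch, the MAD calculation for the data set 7, 9, 12, 13, 14 might look like this in Python. The function name is an illustrative choice, not a standard one.

```python
def mean_absolute_deviation(values):
    """Average distance between each value and the mean."""
    center = sum(values) / len(values)
    # Distance is never negative, so take the absolute value of each difference.
    distances = [abs(value - center) for value in values]
    return sum(distances) / len(distances)


data = [7, 9, 12, 13, 14]
print(mean_absolute_deviation(data))  # (4 + 2 + 1 + 2 + 3) / 5 = 2.4
```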
median
The median is one way to measure the center of a data set. It is the middle number when the data set is listed in order.
For the data set 7, 9, 12, 13, 14, the median is 12.
For the data set 3, 5, 6, 8, 11, 12, there are two numbers in the middle. The median is the average of these two numbers. 6 + 8 = 14 and 14 divided by 2 = 7.
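Both cases, an odd count with one middle number and an even count with two, can be handled in one small Python sketch. The function name is chosen for illustration.

```python
def median(values):
    """Middle number of the ordered data, or the average of the two middle numbers."""
    ordered = sorted(values)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        # Odd count: exactly one number sits in the middle.
        return ordered[middle]
    # Even count: average the two numbers in the middle.
    return (ordered[middle - 1] + ordered[middle]) / 2


print(median([7, 9, 12, 13, 14]))    # 12
print(median([3, 5, 6, 8, 11, 12]))  # (6 + 8) / 2 = 7.0
```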
What do you notice? What do you wonder?
Olympic volleyball player Kim Glass towers over tiny Olympic gymnast Shawn Johnson!
Here are three dot plots that represent the lengths, in minutes, of songs on different albums.
As a general rule, we will consider the difference between two data sets to be large if the difference in their means is more than twice the mean absolute deviation. If the mean absolute deviation is different for each group, use the larger one for this calculation.
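This rule can be checked with a few lines of Python. The sketch below uses a made-up function name. The first call uses the dog show numbers from the summary, assuming that the MAD of 3 inches applies to both breeds. The second call uses hypothetical means to show a case where the difference is large.

```python
def is_large_difference(mean_a, mean_b, mad_a, mad_b):
    """True if the difference in means is more than twice the larger MAD."""
    larger_mad = max(mad_a, mad_b)
    return abs(mean_a - mean_b) > 2 * larger_mad


# Pugs (mean 11 inches) and beagles (mean 15 inches), assuming a MAD of 3 inches for both.
print(is_large_difference(11, 15, 3, 3))  # False: 4 is not more than 2 * 3 = 6

# Hypothetical groups with means 11 and 20 and the same MAD of 3.
print(is_large_difference(11, 20, 3, 3))  # True: 9 is more than 6
```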