Often averages are used to summarise all the values in a set of data.
For example, how would you answer the question "On average, how many hours do you exercise each week?" The person who asked the question isn't looking for a list of numbers covering the last 10 weeks; they just want one number.
There are three different statistics we can use to measure an average.
These are:
Mean - this is all of the values added together, divided by the number of values there are.
Median - this is the middle number when the values are sorted in order.
Mode - this is the value that occurs most often.
To find the mean, first add all the values
5 + 7 + 2 + 4 + 4 + 6 + 0 = 28
Divide the total by the number of values
In this case there are 7 values so 28 ÷ 7 = 4
The mean is 4
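The two steps above can be sketched in Python. This is only a minimal illustration, using 5, 7, 2, 4, 4, 6 and 0 as the seven values (the data set itself is assumed here from the total of 28 and the count of 7), and the variable names are made up for the example.

```python
# Illustrative data: assumed from the total (28) and the count (7) in the text.
values = [5, 7, 2, 4, 4, 6, 0]

total = sum(values)    # step 1: add all the values
count = len(values)    # how many values there are
mean = total / count   # step 2: divide the total by the number of values

print(total, count, mean)  # 28 7 4.0
```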
To find the mode, we look for the most common value, which in this case is 4
So the mode is 4
Sometimes there can be two modes or no mode!
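A short Python sketch can show the one-mode, two-mode and no-mode cases. The helper name `modes` is hypothetical, and treating "no value repeats" as "no mode" is an assumed convention, not the only one in use. The first two lists appear in this article; the third is made up for illustration.

```python
from collections import Counter

def modes(values):
    """Return the most common values in sorted order (empty list if none)."""
    if not values:
        return []
    counts = Counter(values)          # how many times each value occurs
    highest = max(counts.values())    # the largest number of occurrences
    if highest == 1:
        return []                     # assumed convention: nothing repeats, so no mode
    return sorted(v for v, c in counts.items() if c == highest)

print(modes([2, 4, 4, 5, 6, 7, 9]))     # [4]
print(modes([2, 4, 4, 5, 6, 6, 7, 9]))  # [4, 6]
print(modes([1, 2, 3]))                 # []
```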
To find the median, first we need to sort the values in order
2, 4, 4, 5, 6, 7, 9
Next we cross one value off each end until we reach the middle
2, 4, 4, 5, 6, 7, 9
This leaves 5. Therefore the median is 5.
However, if there are two numbers left in the middle, we need to take the mean of these two numbers
2, 4, 4, 5, 6, 6, 7, 9
5 + 6 = 11
11 ÷ 2 = 5.5 (so in this case the median is 5.5)
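The same procedure can be written as a small Python function. This is a sketch rather than a definitive implementation: the function name `median` is chosen for illustration, and it jumps straight to the middle position instead of crossing values off one pair at a time. The inputs are the two lists above, given out of order to show the sorting step.

```python
def median(values):
    """Middle value of the sorted data; mean of the two middle values if the count is even."""
    ordered = sorted(values)   # sort the values in order
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]                         # one value left in the middle
    return (ordered[middle - 1] + ordered[middle]) / 2  # two left: take their mean

print(median([9, 4, 2, 6, 5, 7, 4]))     # 5
print(median([6, 2, 9, 4, 5, 7, 4, 6]))  # 5.5
```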
So we have three different ways of finding an average. How do we know which one to choose?
Each of them has its positives and negatives.
Mode
+ Really easy to find
- Does not use all of the data
Mean
+ Fairly easy to calculate, and uses all of the data to produce a value
- Is affected by extreme values/outliers
Median
+ Uses all of the data to find where the middle value lies, and is not affected much by extreme values
- Can be time-consuming to sort the values, and it is easy to make a mistake
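To see the effect of an extreme value, the sketch below compares the mean and the median before and after one value is changed. It uses Python's standard `statistics` module. The second list is hypothetical: it is the seven-value list from earlier with the 9 replaced by 90.

```python
from statistics import mean, median

data = [2, 4, 4, 5, 6, 7, 9]
with_outlier = [2, 4, 4, 5, 6, 7, 90]  # hypothetical: the 9 replaced by 90

# The mean moves a long way, while the median stays where it was.
print(round(mean(data), 2), median(data))                  # 5.29 5
print(round(mean(with_outlier), 2), median(with_outlier))  # 16.86 5
```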
Quartiles are numbers that split the data into quarters. To split the data into four parts there must be three divisions:
The median - half of the data lies above this value and the other half below
The Lower Quartile (LQ) - this is the median of the lower half
The Upper Quartile (UQ) - this is the median of the upper half
For the data above (2, 4, 4, 5, 6, 6, 7, 9):
The lower half is 2, 4, 4, 5, so the LQ is 4. The median is 5.5. The upper half is 6, 6, 7, 9, so the UQ is 6.5
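The "median of each half" idea can be sketched in Python as follows. The function name `quartiles` is hypothetical, and one assumption is made that the text does not state: when there is an odd number of values, the middle value is left out of both halves. Other conventions exist and can give slightly different quartiles.

```python
from statistics import median

def quartiles(values):
    """Return (lower quartile, median, upper quartile) using the median of each half."""
    ordered = sorted(values)
    half = len(ordered) // 2               # size of each half (needs at least 2 values)
    lower = ordered[:half]                  # the lower half of the data
    upper = ordered[len(ordered) - half:]   # the upper half of the data
    return median(lower), median(ordered), median(upper)

lq, med, uq = quartiles([2, 4, 4, 5, 6, 6, 7, 9])
print(lq, med, uq)
```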
Spread measures how far apart the values in the data are. There are two measures of spread:
Range = Maximum - Minimum
Interquartile Range = Upper Quartile - Lower Quartile
Of the two, the IQR is more reliable as it is less affected by extreme values. One extreme value can make the range really large even when the rest of the values are grouped closely together.
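The sketch below compares the two measures in Python. The helper names `data_range` and `iqr` are made up for the example, the quartiles are taken as the median of each half (leaving out the middle value when the count is odd, which is an assumption), and the second list is hypothetical: the eight-value list with the 9 replaced by 90.

```python
from statistics import median

def data_range(values):
    """Range = maximum - minimum."""
    return max(values) - min(values)

def iqr(values):
    """Interquartile range = upper quartile - lower quartile."""
    ordered = sorted(values)
    half = len(ordered) // 2
    lower_quartile = median(ordered[:half])                  # median of the lower half
    upper_quartile = median(ordered[len(ordered) - half:])   # median of the upper half
    return upper_quartile - lower_quartile

data = [2, 4, 4, 5, 6, 6, 7, 9]
with_outlier = [2, 4, 4, 5, 6, 6, 7, 90]  # hypothetical: the 9 replaced by 90

print(data_range(data), iqr(data))
print(data_range(with_outlier), iqr(with_outlier))
```

With these lists the range grows from 7 to 88 once the extreme value is introduced, while the IQR does not change.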