Organizing, representing, analysing and interpreting data and utilizing different statistical tools facilitates prediction and drawing of conclusions.
Different statistical techniques require justification and the identification of their limitations and validity.
Recognise when and how data is skewed, e.g. by comparing the mean and median.
I can explain how to find the measures of central tendency by hand and using technology
I know the three most used measures of spread and how to calculate them
I am able to use technology to calculate the standard deviation of a set of data and make inferences about the standard deviation of the population from which it may be a sample.
I can find, analytically (not using 1-VAR on your GDC), the measures of central tendency (mean, mode, median, quartiles) and the measures of dispersion (range, IQR, standard deviation and variance) of a set of discrete (grouped or ungrouped) or continuous data
I can explain how to use a cumulative frequency curve to find the median, quartiles and percentiles of a set of data of a continuous variable
I can discuss cumulative frequency; cumulative frequency graphs; and their use to find median, quartiles, percentiles, range and interquartile range (IQR).
Getting a feel for data. The first thing a statistician/scientist/economist/researcher does after collecting data is to try to get a sense or feel for that data. What does it look like? Can we graph it? What’s the mean? What’s the lowest value? What is the mode? Only once we have a feel for the data we are working with can we start making inferences and predictions.
What do we need to know about our data and what techniques can we use to successfully interpret the information it gives us?
A set of data consists of 5 positive whole numbers. For this set:
The mean is 4
The mode is 3
The median is 3
How many such data sets can you find?
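Once you have tried this by hand, a brute-force search can be used to check your list. The sketch below assumes that "the mode is 3" means 3 is the only mode, and that two data sets with the same values in a different order count as one. The function name `matching_datasets` is made up for illustration.

```python
from itertools import combinations_with_replacement
from statistics import median, multimode


def matching_datasets(size=5, target_mean=4, target_mode=3, target_median=3):
    """Return every sorted data set of positive whole numbers that fits.

    Assumption: the mode must be unique (3 is the only most frequent value).
    """
    total = size * target_mean  # a mean of 4 over 5 values means a sum of 20
    found = []
    # combinations_with_replacement yields each sorted data set exactly once
    for data in combinations_with_replacement(range(1, total + 1), size):
        if sum(data) != total:
            continue
        if median(data) != target_median:
            continue
        if multimode(data) != [target_mode]:
            continue
        found.append(data)
    return found


datasets = matching_datasets()
for data in datasets:
    print(data)
print("Number of data sets found:", len(datasets))
```

If you decide that a data set with two modes (for example 3 and 6) should also count, the `multimode` check is the line to change.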
You should pay special attention to the relation between the median and the mean. Sometimes they are the same, sometimes not. This has to do with the distribution of the data. The distribution can be shown with, for example, a histogram, a distribution function (curve) or a box plot. Symmetrically distributed data will have a mean and a median that are close or the same. If the data is skewed, the mean and median will differ:
Positively skewed data has a Mean that is greater than the Median...
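A quick way to see this is to compare the two values for a small, made-up data set in which one large value stretches the tail to the right. The numbers are hypothetical and chosen only to make the effect visible.

```python
from statistics import mean, median

# Hypothetical data: most values are small, one large value pulls the tail right
data = [2, 3, 3, 4, 4, 5, 30]

data_mean = mean(data)
data_median = median(data)

print("mean  :", round(data_mean, 2))
print("median:", data_median)

# Comparing the two gives a first hint about the direction of any skew
if data_mean > data_median:
    print("mean > median: this suggests positive (right) skew")
elif data_mean < data_median:
    print("mean < median: this suggests negative (left) skew")
else:
    print("mean = median: no skew suggested by this comparison")
```

This comparison is only a rough indicator. A histogram or box plot of the same data is still the better check.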
Simple measures of spread are the range and the interquartile range (IQR). Variance and standard deviation are more advanced measures of spread. The variance is the standard deviation squared. These are the formulae:
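Written out in LaTeX, and assuming the usual notation (the data values are $x_i$, there are $n$ of them, and $\mu$ is their mean), the population versions look like this. The sample variance $s^2$, which divides by $n-1$ and is used when estimating the spread of a larger population, is included for comparison.

```latex
\begin{align*}
\mu      &= \frac{1}{n}\sum_{i=1}^{n} x_i \\
\sigma^2 &= \frac{1}{n}\sum_{i=1}^{n} (x_i - \mu)^2 \\
\sigma   &= \sqrt{\sigma^2} \\
s^2      &= \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2
\end{align*}
```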
To work out the variance and standard deviation by hand, a table is usually the clearest way to organise the calculation:
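The same table can be sketched in a few lines of Python. The data values are hypothetical, and the calculation assumes the population formula (dividing by $n$). Each row holds a value, its deviation from the mean, and the squared deviation.

```python
from math import sqrt

# Hypothetical data set, chosen so the numbers work out neatly
data = [2, 4, 4, 4, 5, 5, 7, 9]

n = len(data)
mean = sum(data) / n

# One table row per value: x, (x - mean), (x - mean)^2
rows = [(x, x - mean, (x - mean) ** 2) for x in data]

print(f"{'x':>4} {'x - mean':>10} {'(x - mean)^2':>14}")
for x, deviation, squared in rows:
    print(f"{x:>4} {deviation:>10.2f} {squared:>14.2f}")

# Add up the last column, then divide by n (population variance)
sum_of_squares = sum(squared for _, _, squared in rows)
variance = sum_of_squares / n
standard_deviation = sqrt(variance)

print("sum of squared deviations:", sum_of_squares)
print("variance                 :", variance)
print("standard deviation       :", standard_deviation)
```

To treat the data as a sample instead, divide by `n - 1` in the variance line.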
For grouped data, or data in a frequency table, things get a bit more complicated. For grouped data you need to use the mid-values of the data intervals (e.g. the group 20-<30 has mid-value 25), then you do the following:
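As a minimal sketch of the grouped calculation: each mid-value stands in for every data point in its group, so it is weighted by the group's frequency. The intervals and frequencies are hypothetical, and the population formula (dividing by the total frequency) is assumed.

```python
from math import sqrt

# Hypothetical frequency table: ((lower bound, upper bound), frequency)
groups = [
    ((10, 20), 3),
    ((20, 30), 7),
    ((30, 40), 6),
    ((40, 50), 4),
]

# Step 1: replace each interval by its mid-value
table = [((low + high) / 2, freq) for (low, high), freq in groups]

# Step 2: total frequency and the frequency-weighted mean
n = sum(freq for _, freq in table)
mean = sum(mid * freq for mid, freq in table) / n

# Step 3: frequency-weighted squared deviations from the mean
sum_of_squares = sum(freq * (mid - mean) ** 2 for mid, freq in table)

# Step 4: variance and standard deviation (population versions)
variance = sum_of_squares / n
standard_deviation = sqrt(variance)

print("total frequency   :", n)
print("estimated mean    :", mean)
print("estimated variance:", variance)
print("estimated st. dev.:", round(standard_deviation, 3))
```

Because the mid-values only approximate the original data, the results are estimates of the true mean and standard deviation.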
As you can see, it is a lot of work finding the standard deviation by hand. You will probably appreciate this video which shows how to do it using the 1-VAR function of your calculator: