Plan
(Comparison)
What you write here depends on whether you collected the data yourself (Primary Data) or are using data someone else collected (Secondary Data).
You collected the data yourself.
List the names of all the variables that you measured or collected.
Include units.
Describe what the variable is.
egs:
Distance: This is the distance the gumboot was thrown in metres.
Height: This is the height of the student in cm.
Hand: This is the hand the student threw with [Left (L) or Right (R)].
You were given the data.
Copy the exact names of the variables you are using from the given dataset.
Include units of any quantitative variable.
Copy the description of the variable exactly from the information on the given dataset.
Write down the source of the dataset.
Write down the sample size taken from the dataset.
Write down the sampling method used.
eg:
Datasource:
Crash Statistics 2010 - 2020 by NZ Transport Agency
Variables:
Distance from home: Estimated distance away from home when crash occurred (in kilometres).
Age: Age in years at time of crash (Years).
Licence: This is the type of licence of the driver (Learners [L], Restricted [R] or Full [F]).
Sample Size:
100 per group
Sample Method:
Simple Random Sampling
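To make "100 per group" by simple random sampling concrete, the following is a minimal Python sketch, not the method used for the dataset above. It assumes the records are held as a list of dictionaries. The function name sample_per_group and the made-up records are hypothetical and only for illustration; they are not real crash data.

```python
import random
from collections import defaultdict


def sample_per_group(records, group_key, n, seed=None):
    """Draw a simple random sample of n records from each group.

    Within a group every record has the same chance of selection,
    and no record can be chosen twice (sampling without replacement).
    """
    rng = random.Random(seed)  # a fixed seed makes the sample repeatable

    # Split the records into groups using the categorical variable.
    groups = defaultdict(list)
    for record in records:
        groups[record[group_key]].append(record)

    samples = {}
    for label, members in groups.items():
        if len(members) < n:
            raise ValueError(f"group {label!r} has only {len(members)} records")
        samples[label] = rng.sample(members, n)
    return samples


# Made-up placeholder records (NOT real data): 500 per licence type.
records = [
    {"Licence": licence, "Age": 16 + i % 60}
    for licence in ("L", "R", "F")
    for i in range(500)
]

samples = sample_per_group(records, "Licence", n=100, seed=1)
for label, rows in samples.items():
    print(label, len(rows))
```

Recording the seed alongside the sample size and sampling method lets someone else reproduce exactly the same sample.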
This is an important part of your report.
Here you will describe how accurate your data is by discussing how you:
managed any sources of variation (Primary Data)
or
assume any sources of variation were managed when the data was being collected (Secondary Data).
In each case you will need to discuss two different types of variation.
You need to identify and describe at least two different sources of variation that could have affected the data collection
AND
state how you will manage these sources (or how you assume they were managed, if you are using secondary data). This is important!
What are the two variables that you are investigating?
One is a categorical (qualitative) variable and the other is a numeric (quantitative) variable.
Qualitative Variable
- used to determine the two groups you are comparing, eg male/female, blue eyes/brown eyes, etc
Quantitative Variable
- is what you are comparing between the two groups, eg height, heart rate, throwing distance, etc
What is the unit of the quantitative variable?
egs: height in cm, heart rate in beats per minute, throwing distance in m, etc
How are you going to take any measurements?
What sources of variation are likely to affect your data?
How are you going to minimize the effect of these sources?
Consider the same questions as for Primary Data, and also the following:
What is the data source of the data you are going to use?
Who collected the data?
Why did they collect that data?
When was the data collected?
How many records are in this dataset?
What sources of variation might have affected the data collection?
How might the effect of these sources have been minimized?
Could there have been sources of variation that were not identified or minimized?