In the past we have focused mainly on data from a single variable.
Using data from a group, we were able to make a rough estimate of the value for a typical member of the population by calculating an average.
Averages:
Mean = 166.8 cm
Median = 168.5 cm
Mode = 170 cm
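As a minimal sketch of how these three averages can be calculated, the Python example below uses the standard `statistics` module. The heights in the list are made up for illustration; they are not the class data behind the figures above, so the results will differ from those figures.

```python
from statistics import mean, median, mode

# Hypothetical heights in cm (made-up values, not the class data above).
heights = [158, 162, 165, 168, 170, 170, 172, 175]

avg = mean(heights)       # add up all the values and divide by how many there are
middle = median(heights)  # the middle value once the data are sorted
most = mode(heights)      # the value that occurs most often

print("Mean:", avg)
print("Median:", middle)
print("Mode:", most)
```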
In this topic we turn our attention to data where we have two pieces of information about each unit we measure. A unit may be an individual or an object.
Univariate versus Bivariate Data
Skip to 1:21 and watch to 4:20
With univariate data, we use summaries to describe the data:
Mean, median, mode
Interquartile range, standard deviation
For example, we can say that:
the typical height for this group of students is around 168 cm
half of all these students have a height between 164 and 172 cm
This information can be summarized succinctly with a BOX and WHISKER plot.
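The numbers behind a box and whisker plot (minimum, lower quartile, median, upper quartile, maximum) can be sketched as follows. The heights are hypothetical, and `statistics.quantiles` uses one of several common quartile conventions, so a calculator or other software may give slightly different quartiles.

```python
from statistics import median, quantiles

# Hypothetical heights in cm (made-up values for illustration).
heights = [158, 162, 165, 168, 170, 170, 172, 175]

# quantiles(..., n=4) returns the three cut points that split the data into quarters.
q1, q2, q3 = quantiles(heights, n=4)

print("Minimum:", min(heights))
print("Lower quartile:", q1)
print("Median:", q2)
print("Upper quartile:", q3)
print("Maximum:", max(heights))
print("Interquartile range:", q3 - q1)

# About half of the values should lie between the lower and upper quartiles.
middle_half = [h for h in heights if q1 <= h <= q3]
print("Values in the middle half:", middle_half)
```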
With bivariate data, we use models to describe the relationship between two attributes.
Types of models:
linear
polynomial
exponential, etc
Relationships are best visualized with SCATTER GRAPHS.
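To illustrate what a linear model looks like in practice, here is a minimal sketch that fits a straight line to paired data by least squares and then uses the line to make a prediction. The function names `fit_line` and `predict` and the foot length and height values are assumptions made for this example. Least squares is one common way to fit a line, and polynomial and exponential models are not covered here.

```python
def fit_line(xs, ys):
    """Return the least-squares slope and intercept of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How x and y vary together, and how x varies on its own.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x, slope, intercept):
    """Use the fitted line to estimate y for a given x."""
    return slope * x + intercept


# Hypothetical data: foot length (cm) and height (cm) for six students.
foot_lengths = [22, 23, 24, 25, 26, 27]
heights = [155, 160, 166, 169, 175, 180]

slope, intercept = fit_line(foot_lengths, heights)
print(f"height = {slope:.2f} * foot length + {intercept:.2f}")
print("Predicted height for a 25.5 cm foot:", round(predict(25.5, slope, intercept), 1))
```

A positive slope here would match the idea that students with longer feet tend to be taller.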
Variables and Attributes
The pieces of information we gather about each unit are called attributes.
These attributes are more commonly called variables.
We will be investigating the relationship between two variables.
For a person these variables may be their height, income, gender, etc.
For a tree, the variables may be its girth or type.
Research often produces data with two pieces of information about each object or individual. Data that has two variables to be studied is called bivariate data.
Examples could be:
the age of kauri trees and their diameter
house prices and population density
radioactivity in the groundwater and distance from the Fukushima nuclear power plant
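One way to picture bivariate data is as a list of pairs, with one pair per unit. The short sketch below does this for the kauri tree example. The ages and diameters are made-up values, and the variable names are only for illustration.

```python
# Hypothetical measurements for five kauri trees (made-up values for illustration).
ages_years = [50, 120, 300, 450, 800]
diameters_m = [0.3, 0.7, 1.4, 2.0, 3.1]

# Pair the two variables so that each tuple describes one tree (one unit).
trees = list(zip(ages_years, diameters_m))

for age, diameter in trees:
    print(f"age {age} years, diameter {diameter} m")
```

Each pair would become one point on a scatter graph, with age on one axis and diameter on the other.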
A scatterplot is a tool for displaying two pieces of information about each individual on one graph.
Somewhere in your past you might have investigated the relationship between height and foot length of the people in your class. You probably drew a scatter plot similar to the one below. You may have concluded that students with longer feet tend to be taller.
In this unit we will be:
looking further into relationships between variables.
using a relationship to make predictions.
using these relationships and predictions to answer questions about real problems.