Data set for points scored by basketball teams: 100, 101, 120, 124, 140, 135, 120
1. Identify which part of each number should be the stem. The stem is the leftmost digit(s). If your data set contains mostly three-digit values, the stem is the first two digits. If the values have four digits, the stem is the first three digits.
In our case, the stems are 10, 12, 13, and 14.
2. Write the stem values (without repeats) from lowest to highest in the stem column. If a stem in the sequence has no data (e.g. you have values from 10-19 and 30-39 but none in the 20s), you still have to write it (here, the "2" stem)! That is why the 11 stem appears below even though no team scored in the 110s.
Stem | Leaf
10 |
11 |
12 |
13 |
14 |
3. Write the leaf value (usually the ones digit) of each data value next to its stem. The leaves don't have to be in order. DO include repeated values.
Stem | Leaf
10 | 0 1
11 |
12 | 0 4 0
13 | 5
14 | 0
4. Include a key and title.
Key: 14|0 = 140
Title: Points Scored in Basketball Games
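The four steps above can be sketched in a few lines of Python. This is only an illustration, assuming whole-number data where the leaf is the ones digit; the function name `stem_and_leaf` is made up for this example.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Return {stem: [leaves]}, including every stem from the lowest to the highest."""
    leaves = defaultdict(list)
    for v in values:
        stem, leaf = divmod(v, 10)  # leaf = ones digit, stem = the remaining digits
        leaves[stem].append(leaf)   # leaves stay in data order; repeats are kept
    lo, hi = min(leaves), max(leaves)
    # Step 2: stems with no data (such as 11) still get a row.
    return {stem: leaves.get(stem, []) for stem in range(lo, hi + 1)}

points = [100, 101, 120, 124, 140, 135, 120]
plot = stem_and_leaf(points)

print("Points Scored in Basketball Games")
print("Stem | Leaf")
for stem, leaf_list in plot.items():
    print(f"{stem:>4} | {' '.join(str(leaf) for leaf in leaf_list)}")
print("Key: 14|0 = 140")
```

The rows it prints match the hand-built plot, including the empty 11 stem.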
Histograms resemble bar graphs, but the horizontal axis is a continuous number line and the data are numerical rather than categorical.
Histograms group the data into intervals (referred to as bins or buckets) on the x-axis (horizontal). For example, 60-65 is a bin, and it contains the data points that fall within that range.
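To make the idea of bins concrete, here is a minimal sketch that counts how many values land in each bin. The heights are made-up numbers, not the cherry tree data, and the sketch assumes each bin includes its lower edge but not its upper edge (so 65 goes in 65-70). Textbooks and software differ on that convention.

```python
def bin_counts(values, start, width, n_bins):
    """Count how many values fall in each bin [lo, lo + width)."""
    counts = [0] * n_bins
    for v in values:
        index = int((v - start) // width)  # which bin this value belongs to
        if 0 <= index < n_bins:            # ignore values outside the bins
            counts[index] += 1
    return counts

# Hypothetical heights in feet, for illustration only.
heights = [63, 64, 66, 70, 72, 75, 76, 77, 79, 80, 81, 85]
counts = bin_counts(heights, start=60, width=5, n_bins=6)

for i, count in enumerate(counts):
    lo = 60 + 5 * i
    print(f"{lo}-{lo + 5}: {'#' * count}")
```

Each printed row is one bar of the histogram, turned on its side.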
We'll use the cherry tree histogram as an example, although this can be applied to any diagram or plot, such as a stem-and-leaf plot or box plot. There are 4 components you have to describe for distributions.
Identify the median or mean.
The median of the distribution for the trees is approximately 75 feet. The mean is approximately 77 feet.
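When you have the raw values rather than just a picture, the center can be computed directly. The individual cherry tree heights are not listed here, so this sketch uses the basketball data set from the stem-and-leaf example instead, with Python's standard `statistics` module.

```python
import statistics

points = [100, 101, 120, 124, 140, 135, 120]

median_points = statistics.median(points)  # middle value once the data are sorted
mean_points = statistics.mean(points)      # sum of the values divided by the count

print("Median:", median_points)
print("Mean:", mean_points)
```

For this data set the median and the mean both come out to 120.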
These can be possible outliers or gaps.
There appear to be no possible outliers or gaps.
Always add "possible" before outliers as we don't know for sure if they are truly outliers.
How many peaks are there? Is it skewed, meaning the data set is stretched out in a certain direction? Is it symmetric?
This histogram is skewed to the right.
The histogram on the left is skewed to the right. You can tell which direction a histogram is skewed from its "tail", or how it's stretched out. If the right side is stretched out more, then it's skewed right. If the left side is stretched out more, then it's skewed left.
Our cherry tree histogram doesn't appear to be skewed. It has one peak between 75 and 80 feet. It is roughly symmetric.
This is the variability of the data. You can describe the range of the data.
The spread of the cherry tree data is from 60 to 90 feet.
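If you want a single number for the spread, the range is the maximum minus the minimum. As a small sketch, here it is for the basketball data set from the stem-and-leaf example (the cherry tree values themselves are not listed here).

```python
points = [100, 101, 120, 124, 140, 135, 120]

lowest = min(points)
highest = max(points)
data_range = highest - lowest  # range = maximum - minimum

print(f"The spread is from {lowest} to {highest} (range = {data_range}).")
```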
The y-axis represents the cumulative percentage. So in the ogive below (the middle chart), 80% of values fall between 10 and 30. This doesn't mean that 80% of values are at 30! You can think of it in terms of percentiles: the 80th percentile is 30 (thus 80% of values are at or below 30).
The middle chart is an ogive; the others aren't.
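The difference between "at or below 30" and "at 30" is easy to check numerically. The sketch below uses ten made-up values chosen so that 80% of them are at or below 30; they are not the data behind the chart, and the function name `cumulative_percent` is just for this example.

```python
def cumulative_percent(values, x):
    """Percentage of values that are less than or equal to x (the height of an ogive at x)."""
    return 100 * sum(1 for v in values if v <= x) / len(values)

# Hypothetical data, for illustration only.
values = [12, 15, 18, 20, 22, 25, 27, 30, 35, 40]

at_or_below_30 = cumulative_percent(values, 30)
exactly_30 = 100 * values.count(30) / len(values)

print(f"At or below 30: {at_or_below_30}%")
print(f"Exactly 30: {exactly_30}%")
```

Here 80% of the values are at or below 30, but only 10% are exactly 30, which is why an ogive is read as a running total.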