Sampling Variability and Measures of Dispersion

The total length of the videos in this section is approximately 44 minutes, but you will also spend time answering short questions while completing this section.

You can also view all the videos in this section at the YouTube playlist linked here.

Polls

SamplingVarAndMeasuresOfDis.1.SampleSize.mp4

Question 1: How many people should you poll in California (population approx 38 million) to obtain an estimate as precise as your estimate based on 500 people from Oklahoma (population approx 3.8 million)?

Show answer

500. This is explained in the next video.

It's like a bowl of soup

SamplingVarAndMeasuresOfDis.2.More on Sample Sizes.mp4

Question 2: Which of the following affect your ability to assess the saltiness of the soup? Check all that apply.

Show answer

The first three options, but not the fourth. 

Question 3: Which of the following affect your ability to precisely estimate the election outcome based on a poll? Check all that apply.

Show answer

The first, second, and third, but not the fourth.

In these videos, we use the word "precision" to describe what is also discussed under the names "uncertainty," "error," or "variance": a more precise estimate is one with less uncertainty, error, or variance. No matter what we call it, we are discussing how much an estimate varies over the different possible samples that you might draw from the population.
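As a rough illustration of why the size of the population barely matters, the sketch below computes the standard error of a polled proportion for the Oklahoma and California populations from Question 1. It assumes a simple random sample drawn without replacement and a true proportion of 0.5; the function name `standard_error` and these inputs are illustrative assumptions, not something taken from the videos.

```python
import math

def standard_error(p, n, population):
    """Standard error of a sample proportion for a simple random sample
    of size n drawn without replacement from a finite population."""
    # Finite population correction: very close to 1 when n is tiny
    # compared with the population.
    fpc = math.sqrt((population - n) / (population - 1))
    return math.sqrt(p * (1 - p) / n) * fpc

# Assumed true proportion of 0.5 (the least precise case).
se_oklahoma = standard_error(0.5, 500, 3_800_000)
se_california = standard_error(0.5, 500, 38_000_000)
se_california_big = standard_error(0.5, 5000, 38_000_000)

print(f"Oklahoma,   n = 500:  {se_oklahoma:.5f}")
print(f"California, n = 500:  {se_california:.5f}")
print(f"California, n = 5000: {se_california_big:.5f}")
```

Under these assumptions the two polls of 500 have nearly identical standard errors, while increasing the sample size is what shrinks the error.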

Question 4: When you read about an election poll, approximately how many people do you think were typically polled?

Show answer

It's almost always about 1000. That's not many, but it's enough if the sample really is representative. Election polls are traditionally telephone polls, and it is increasingly challenging to avoid selection bias in telephone polls, since fewer people have landlines, among other problems.
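To get a feel for why about 1000 respondents can be enough, here is a minimal sketch of the usual approximate 95% margin of error for a polled proportion. It assumes a truly representative simple random sample and a worst-case split of 50/50; the helper `margin_of_error` is a hypothetical name used only for this example.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate 95% margin of error for a sample proportion,
    assuming a simple random sample of size n."""
    return z * math.sqrt(p * (1 - p) / n)

moe_1000 = margin_of_error(1000)
print(f"Margin of error with 1000 respondents: {moe_1000:.3f}")
```

Under these assumptions the margin of error is roughly 3 percentage points. This calculation says nothing about selection bias, which no sample size can fix.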

I expect that the discussion of measures of center is a review for many. Please skim that video as appropriate, but don't skip the videos on dispersion or variance, even though this is likely not your first statistics course.

Centers

SamplingVarAndMeasuresOfDis.3.Centers of Distributions.mp4

Question 5: Suppose that we ask each student in a physical education class to run a mile, and we record the times. We will likely obtain a right-skewed distribution, with some students finishing the mile at about the same time as each other and some students taking much longer, with their times spread out on the right of the distribution. Which will be larger, the mean or the median mile time?

Show answer

The mean. Means are influenced by extreme values much more than medians. When a distribution is right-skewed, the mean is generally higher than the median.
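A small sketch with made-up mile times (in minutes) shows the effect. The numbers are hypothetical and chosen only to be right-skewed: most runners finish close together and a few take much longer.

```python
import statistics

# Hypothetical mile times in minutes: clustered low values, long right tail.
mile_times = [7.5, 8.0, 8.0, 8.5, 9.0, 9.5, 12.0, 15.0]

mean_time = statistics.mean(mile_times)
median_time = statistics.median(mile_times)

print(f"mean:   {mean_time:.2f}")
print(f"median: {median_time:.2f}")
```

The two slowest times pull the mean upward, while the median only depends on the middle of the sorted list.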

Dispersion

SamplingVarAndMeasuresOfDis.4.Measures of Dispersion.mp4

Question 6: If we have a right-skewed distribution, such as obtained from a class of students running a mile, will the 25th percentile or the 75th percentile be closer to the median?

Show answer

25th. If the lower times are clumped together, the students running between the 25th and 50th (median) percentiles may have very similar times, in a narrow range. If the longer times are spread out, as in a right-skewed distribution, the students taking longer than the median time may have a wide range of times.
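The same idea can be checked numerically. This sketch uses hypothetical right-skewed mile times (in minutes) and Python's `statistics.quantiles` with its default method; other quartile conventions give slightly different cut points, but the pattern should be the same.

```python
import statistics

# Hypothetical mile times in minutes: clustered low values, long right tail.
mile_times = [7.5, 8.0, 8.0, 8.5, 9.0, 9.5, 12.0, 15.0]

# With n=4, quantiles returns the three cut points: Q1, median, Q3.
q1, median_time, q3 = statistics.quantiles(mile_times, n=4)

lower_gap = median_time - q1   # distance from 25th percentile to median
upper_gap = q3 - median_time   # distance from median to 75th percentile
iqr = q3 - q1                  # interquartile range

print(f"Q1 = {q1}, median = {median_time}, Q3 = {q3}")
print(f"median - Q1 = {lower_gap}, Q3 - median = {upper_gap}, IQR = {iqr}")
```

For these made-up times, the 25th percentile sits much closer to the median than the 75th percentile does.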

Mean/Median Absolute Deviation

SamplingVarAndMeasuresOfDis.5.MAD Dispersion.mp4

Question 7: Calculate the median absolute deviation of the following values: 3, 5, 8

Show answer

2. The median is 5. The absolute deviations are 2, 0, and 3. The median of the absolute deviations is 2.
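The same steps can be written out in a few lines of Python. This is only a sketch of the calculation in the answer, using the standard library's `statistics.median`; the variable names are illustrative.

```python
import statistics

values = [3, 5, 8]

# Step 1: find the center (here, the median).
center = statistics.median(values)

# Step 2: absolute deviation of each value from the center.
abs_devs = [abs(v - center) for v in values]

# Step 3: the median of those absolute deviations.
mad = statistics.median(abs_devs)

print("median:", center)
print("absolute deviations:", abs_devs)
print("median absolute deviation:", mad)
```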

Variance, mean, and notation

SamplingVarAndMeasuresOfDis.6.Notation for Mean & Variance in Dispersion.mp4

Question 8: How is the variance related to the standard deviation?

Show answer

Variance = (Standard Deviation) * (Standard Deviation). Equivalently, the standard deviation is the square root of the variance.

Question 9: What is the variance of the numbers 3, 5, and 10?

Show answer

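8.67. The mean is 6, the squared deviations are 9, 1, and 16, and their average is 26 / 3, or about 8.67.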
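One thing to watch for when checking this with software: the answer 8.67 corresponds to dividing the sum of squared deviations by the number of values (often called the population variance). Many tools default to dividing by one less than the number of values (the sample variance), which gives a different result. The sketch below uses Python's standard `statistics` module to show both.

```python
import math
import statistics

values = [3, 5, 10]

# Divide by n (population variance): matches the answer of about 8.67.
pop_var = statistics.pvariance(values)

# Divide by n - 1 (sample variance): a common software default.
sample_var = statistics.variance(values)

# The standard deviation is the square root of the variance.
pop_sd = statistics.pstdev(values)

print(f"variance dividing by n:     {pop_var:.2f}")
print(f"variance dividing by n - 1: {sample_var:.2f}")
print(f"standard deviation (n):     {pop_sd:.2f}")
```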

Variance and the normal distribution

SamplingVarAndMeasuresOfDis.7.Normal Distribution in Dispersion.mp4

Question 10: If Y follows a normal distribution with mean 10 and variance 4, 95% of values lie between two numbers. What is the lower number?

Show answer

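6, because 10 - 2 * sqrt(4) = 10 - 4 = 6. The standard deviation is sqrt(4) = 2, and about 95% of values from a normal distribution lie within 2 standard deviations of the mean.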
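For comparison, here is a short sketch that computes the central 95% range with Python's `statistics.NormalDist` (available in Python 3.8 and later). It shows that the "2 standard deviations" rule is a convenient rounding of the more exact multiplier of about 1.96.

```python
import math
from statistics import NormalDist

mean = 10
variance = 4
sd = math.sqrt(variance)  # standard deviation is the square root of variance

# Rule of thumb: about 95% of values lie within 2 standard deviations.
lower_rule = mean - 2 * sd
upper_rule = mean + 2 * sd

# More exact: cut off 2.5% in each tail of the normal distribution.
y = NormalDist(mu=mean, sigma=sd)
lower_exact = y.inv_cdf(0.025)
upper_exact = y.inv_cdf(0.975)

print(f"rule of thumb: {lower_rule} to {upper_rule}")
print(f"more exact:    {lower_exact:.2f} to {upper_exact:.2f}")
```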

That's it!

During this tutorial you learned:


Terms and concepts:

Center, mean, median, mode, skewed distribution, standard deviation, variance, range, min/max, quartiles/percentiles, interquartile range (IQR), mean/median absolute deviation, distribution, and normal distribution