Intervals for regression
The total length of the videos in this section is approximately 38 minutes. You will also spend time answering short questions while completing this section.
You can also view all the videos in this section at the YouTube playlist linked here.
A simpler case
Question 1: You are a picture framer. Your job is to be prepared to frame most of the pictures that people bring into your picture framing shop. When you are ordering your supplies, which of the following would be more helpful? Pretend that all the pictures are squares, so that we don't have to worry about both length and width.
A 95% confidence interval for the mean size of a picture among your customers
A 95% prediction interval for the size of a picture among your customers
Show answer
A prediction interval would be much more helpful. A confidence interval provides information about the average picture you'll need to frame, but a prediction interval provides the range of sizes you can expect, so that you will be prepared for 95% of the customers' framing needs. Knowing that we are 95% sure that the interval (7.5 inches squared, 8.5 inches squared) includes the mean picture size is much less useful than knowing that 95% of the pictures will have sizes within (3 inches squared, 13 inches squared). You should make sure to have frames that span that size range in your shop.
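To make the contrast concrete, here is a minimal sketch that computes both intervals from simulated picture sizes. The data, the seed, and the variable names are all hypothetical, the sizes are assumed to be roughly normal, and a normal critical value stands in for the t critical value because the simulated sample is large.

```python
import math
import random
from statistics import NormalDist, mean, stdev

random.seed(1)

# Hypothetical picture sizes (simulated, not real shop data).
sizes = [random.gauss(8, 2.5) for _ in range(200)]

n = len(sizes)
center = mean(sizes)
s = stdev(sizes)

# Normal approximation to the t critical value (an assumption; fine for n = 200).
z = NormalDist().inv_cdf(0.975)

# Confidence interval for the MEAN size: shrinks as n grows.
ci_half = z * s / math.sqrt(n)
ci = (center - ci_half, center + ci_half)

# Prediction interval for ONE new picture: stays about as wide as the data.
pi_half = z * s * math.sqrt(1 + 1 / n)
pi = (center - pi_half, center + pi_half)

# Share of the simulated pictures that fall inside the prediction interval.
covered = sum(pi[0] <= size <= pi[1] for size in sizes) / n

print(f"95% confidence interval for the mean: ({ci[0]:.2f}, {ci[1]:.2f})")
print(f"95% prediction interval for a picture: ({pi[0]:.2f}, {pi[1]:.2f})")
print(f"Share of pictures inside the prediction interval: {covered:.2f}")
```

The confidence interval comes out much narrower than the prediction interval, which is exactly why it is the wrong tool for deciding which frame sizes to stock.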
Two types of confidence intervals for regression
Question 2: For what value of X_0 will the confidence interval for Mean(Y|X=X_0) be narrowest?
Show answer
The confidence interval for the mean of Y given X will be narrowest when X_0 is equal to the sample mean of X. First, a mathematical argument: if you look at the expression in the video for the width of this confidence interval, you can find "(X_0 - Xbar)^2" in a numerator. The interval is narrowest when this term is at its minimum, which is when that difference is zero. Next, an intuitive argument: we do not have a lot of data at the extreme values of X, and even a small change in slope could lead to a very different prediction for a data point with very extreme X. However, we know that a linear regression line always goes through the intersection of the sample means, so our prediction of the mean of Y when X is equal to Xbar will be equal to the sample mean of Y and does not depend on our uncertainty about the slope.
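For reference, the usual textbook form of this interval in simple linear regression is written out below. The video's notation may differ; the symbols n (sample size), s (residual standard deviation), and t* (the critical value with n - 2 degrees of freedom) are assumed here rather than taken from the video.

```latex
\hat{Y}_0 \;\pm\; t^{*}_{n-2}\, s\,
\sqrt{\frac{1}{n} + \frac{(X_0 - \bar{X})^2}{\sum_{i=1}^{n} (X_i - \bar{X})^2}}
```

Only the (X_0 - Xbar)^2 term depends on X_0, so the interval is narrowest at X_0 = Xbar, where the square root reduces to sqrt(1/n).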
Transitioning to prediction intervals
(The first several seconds are just talking, no writing.)
Question 3: What do you notice about the drawing?
Show answer
Once again, the intervals are narrower when X is near the sample mean of X and wider when X is extreme.
Prediction intervals for linear regression
Question 4: When we connect the dots between the intervals for several values of X, we call the resulting curves "bands." Why are prediction bands wider than confidence bands?
Show answer
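95% prediction bands are meant to include 95% of the actual data points, whereas 95% confidence bands are meant to be a region such that there is a 95% chance that the true regression line (the mean of Y at each X) falls within. Individual data points scatter around that line, so a band that captures them has to account for that scatter in addition to the uncertainty about the line itself. That extra variability is what makes prediction bands wider.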
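The difference between the two bands can be sketched numerically. The example below fits a line to simulated data and computes both half-widths at a few values of X using the standard simple-regression formulas. The data, the helper name `half_widths`, and the use of a normal critical value in place of the t critical value are all assumptions made for illustration.

```python
import math
import random
from statistics import NormalDist, mean

random.seed(0)

# Hypothetical data: Y = 2 + 0.5 * X + noise (all values are made up).
n = 200
xs = [random.uniform(0, 10) for _ in range(n)]
ys = [2 + 0.5 * x + random.gauss(0, 1) for x in xs]

# Least-squares fit of a straight line.
x_bar, y_bar = mean(xs), mean(ys)
sxx = sum((x - x_bar) ** 2 for x in xs)
slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
intercept = y_bar - slope * x_bar

# Residual standard deviation with n - 2 degrees of freedom.
sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
s = math.sqrt(sse / (n - 2))

# Normal approximation to the t critical value (an assumption; fine for n = 200).
z = NormalDist().inv_cdf(0.975)


def half_widths(x0):
    """Return (confidence, prediction) half-widths at x0."""
    line_term = 1 / n + (x0 - x_bar) ** 2 / sxx
    confidence = z * s * math.sqrt(line_term)
    # The extra "1 +" is the scatter of individual points around the line.
    prediction = z * s * math.sqrt(1 + line_term)
    return confidence, prediction


for x0 in (0.0, x_bar, 10.0):
    ci_half, pi_half = half_widths(x0)
    print(f"X_0 = {x0:5.2f}  confidence ±{ci_half:.3f}  prediction ±{pi_half:.3f}")
```

Both half-widths are smallest at X_0 = Xbar and grow toward the extremes, and the prediction half-width is larger at every X_0 because of the added term for point-to-point scatter.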
All done.