Connecting splines, trees, and linear regression

The total length of the videos in this section is approximately 74 minutes. Feel free to do this in multiple sittings! You will also spend time answering short questions while completing this section.

You can also view all the videos in this section at the YouTube playlist linked here.

SplinesTrees&LinearRegression.1.Intro1.mp4

Question 1: What is the remaining term in the equation at the end of the video?

See answer in the beginning of the next video.


One continuous predictor


SplinesTrees&LinearRegression.2.Intro2.mp4

Question 2: Describe a relationship between X and the mean of Y that would be best modeled with a spline, rather than a regression tree or a linear regression equation. You might try to sketch it in your notes.

An example is shown at the beginning of the next video.


One continuous predictor, again


SplinesTrees&LinearRegression.3.Intro3.mp4

Question 3: Approximately what is the mean of Y, when X is equal to zero, in this example?

The answer is shown at the beginning of the next video.


Multiple predictors


SplinesTrees&LinearRegression.4.Intro4.mp4

Question 4: What feature of this example allows the tree to model the data well?

Show answer

The tree can model these data well because the mean of Y appears to be constant over various ranges of X. For either W=0 or W=1, the relationship between X and the mean of Y is piecewise constant rather than smoothly varying: within each segment, the mean is a horizontal line.
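As a minimal sketch (the split points and means below are invented, not taken from the video), a regression tree's prediction function is just a set of if/else rules that return a constant mean within each range of X, and the rules may differ for W=0 versus W=1:

```python
def tree_predict(x, w):
    """Invented piecewise-constant rules, in the spirit of a fitted
    regression tree: the predicted mean of Y is constant within each
    range of X, and the split points differ for W=0 vs. W=1."""
    if w == 0:
        return 10.0 if x < 5 else 20.0  # two segments when W=0
    else:
        return 15.0 if x < 3 else 25.0  # a different split point when W=1

# Within a segment, the prediction does not change with X:
print(tree_predict(1, 0), tree_predict(4, 0))  # both 10.0
print(tree_predict(6, 0))                      # 20.0
```

Moving X around inside a segment leaves the prediction unchanged; only crossing a split point changes it, which is exactly the segmented pattern described above.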

Multiple predictors, again


SplinesTrees&LinearRegression.5.MultiplePredictors.mp4

Question 5: Take some time on this one. If this were a live class, you would spend several minutes working on it in a group.

Complete this equation in a way that describes the data well, using specific numbers:

Mean(Y|X, W) = 

Hint: Try writing down equations relating the mean of Y to X for East and West separately, then write down one overall equation that includes both East and West.


The answer is shown and explained in the next video.


One overall equation for multiple predictors


SplinesTrees&LinearRegression.6.MultiplePredictors2.mp4

Question 6: Do you happen to know what we call the last term in the equation shown in the video? If you have seen multiple regression before, or ANOVA with multiple predictors, I bet this is a familiar concept. It's okay if not, but ponder for a moment.

Show answer

It's an interaction! As described in the next video, an interaction term allows the relationship between the outcome (Y) and one predictor (X) to depend on another predictor (W). If the coefficient of the interaction term is zero, then we have no interaction. For that reason, linear regression with no interaction is a special case of linear regression with a non-zero interaction.
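A short sketch of this idea (the coefficients b0 through b3 are invented for illustration): the interaction term b3*X*W makes the slope of X depend on W, and setting b3 to zero recovers the no-interaction special case.

```python
def mean_y(x, w, b0=1.0, b1=2.0, b2=3.0, b3=4.0):
    """Mean(Y | X, W) = b0 + b1*X + b2*W + b3*X*W, with invented
    coefficients. The last term, b3*X*W, is the interaction."""
    return b0 + b1 * x + b2 * w + b3 * x * w

# The slope of X depends on W whenever b3 != 0:
slope_w0 = mean_y(1, 0) - mean_y(0, 0)  # b1 = 2.0
slope_w1 = mean_y(1, 1) - mean_y(0, 1)  # b1 + b3 = 6.0
print(slope_w0, slope_w1)

# With b3 = 0 the two slopes match: no interaction.
print(mean_y(1, 1, b3=0.0) - mean_y(0, 1, b3=0.0))  # 2.0
```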

Understanding that final term


SplinesTrees&LinearRegression.7.FinalTerm.mp4

Question 7: Does a regression tree allow interactions?

Show answer

Yes, always, by default! Note that after a tree splits on one variable, the remaining splits on the left side may be completely different from the remaining splits on the right. In other words, when predicting college GPA, if the tree first splits based on major, then it might further split based on high school GPA for science majors, but it might not split further based on high school GPA for social science majors. I am inventing that example and pattern. But the point is that the relationship between one predictor (HS GPA) and an outcome (college GPA) is allowed to be totally different for various values of another predictor (major). This is an advantage of a non-parametric model like a tree over a parametric model like a linear regression: running a regression tree algorithm does not require you to choose an equation that describes all possible interactions, but you do have to decide which terms to include in a parametric linear regression model. (Obviously, there are also disadvantages to using a tree: for example, if there really is a linear relationship between X and Y, the tree will try to approximate it with a bunch of horizontal line segments.)

And that's all.

During this tutorial you learned:


Terms and concepts:

spline, regression tree, linear, model for the mean, β0 (‘beta-naught’), β1, indicator variable, reparameterization, discretizing continuous variables, interaction term


Conditional operator:

(Y | X) is read as “Y given X” or “Y conditional on X”
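For a concrete reading of this notation (the data below are invented for illustration), the conditional mean Mean(Y | X = x) is just the average of the Y values in the rows where X equals x:

```python
# Invented (x, y) pairs for this sketch.
data = [(0, 2.0), (0, 4.0), (1, 5.0), (1, 7.0)]

def conditional_mean_y(x):
    """Average of Y over the rows where X == x, i.e., Mean(Y | X = x)."""
    ys = [y for (xi, y) in data if xi == x]
    return sum(ys) / len(ys)

print(conditional_mean_y(0))  # 3.0
print(conditional_mean_y(1))  # 6.0
```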