Non-Parametric Models: Splines

The total length of the videos in this section is approximately 6 minutes. You will also spend time answering short questions while completing this section.

You can also view all the videos in this section at the YouTube playlist linked here.

This lecture marks a transition. We have been focusing on methods for comparing two groups (randomization/permutation tests, t-tests). Now we are moving on to methods for examining the relationships between multiple variables, with one or more predictors and an outcome variable. As before, there will be non-parametric methods and parametric methods. Splines are non-parametric.

Introduction to splines

NonParametricModels.1.IntroSplines.mp4

Question 1: Suppose that you connect all of the dots in the video's example and use the resulting connected line as your model for predicting Y given X. So, for example, if your original data set included the point (10,12), then the model predicts that any data point with X=10 will have Y exactly equal to 12. Is this a good model?

Answer:

No. Perhaps (10,12) was an outlier or didn't fit the overall pattern of the data. Even if (10,12) does fit in with the rest of the data, you should take advantage of the information in the overall pattern of the data. By connecting all the dots, we're "overfitting," which means that we're assuming future data points will be exactly like previous ones, rather than using the overall pattern of the data to make predictions.
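The overfitting argument above can be checked numerically. The following is a minimal sketch, not the method used in the video: the data are simulated (a sine curve plus noise, chosen only for illustration), the "spline" is a cubic regression spline with three hand-picked knots fit by least squares, and every name in it (true_mean, spline_basis, predict_connect, predict_spline) is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def true_mean(x):
    # Assumed "overall pattern" of the data, for illustration only.
    return np.sin(x)

# Original data: 100 evenly spaced X values with noisy Y values.
x_train = np.linspace(0, 10, 100)
y_train = true_mean(x_train) + rng.normal(0, 0.5, x_train.size)

# Future data: new units drawn from the same pattern with fresh noise.
x_new = rng.uniform(0, 10, 2000)
y_new = true_mean(x_new) + rng.normal(0, 0.5, x_new.size)

def spline_basis(x, knots):
    # Truncated power basis for a cubic regression spline:
    # 1, x, x^2, x^3, plus (x - k)^3 for x > k at each knot k.
    cols = [np.ones_like(x), x, x**2, x**3]
    cols += [np.clip(x - k, 0, None) ** 3 for k in knots]
    return np.column_stack(cols)

knots = [2.5, 5.0, 7.5]  # hand-picked for this sketch
coef, *_ = np.linalg.lstsq(spline_basis(x_train, knots), y_train, rcond=None)

def predict_connect(x):
    # "Connect the dots": straight lines between neighboring data points.
    return np.interp(x, x_train, y_train)

def predict_spline(x):
    # Smooth curve that follows the overall pattern instead of each point.
    return spline_basis(x, knots) @ coef

def mse(pred, y):
    # Mean squared prediction error.
    return float(np.mean((pred - y) ** 2))

print("original data, connect the dots:", mse(predict_connect(x_train), y_train))
print("original data, spline:          ", mse(predict_spline(x_train), y_train))
print("future data, connect the dots:  ", mse(predict_connect(x_new), y_new))
print("future data, spline:            ", mse(predict_spline(x_new), y_new))
```

Under these assumptions, connecting the dots reproduces the original data perfectly but should predict the future data worse than the smoother curve does, which is the pattern the answer above describes.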

Details

NonParametricModels.2.SplineDetails.mp4

Question 2: When might you use a spline? Check all that apply.

Answer:

Use a spline when you want to make predictions. Splines don't generate p-values or estimate parameters, so they are not the right tool when you need either of those.

Conclusion

NonParametricModels.3.Splines2.mp4

Question 3: Try to explain the phrase "model for the mean."

Answer:

Units with the same value of X will have various values of Y. A model for the mean predicts the mean of the values of Y for units that share a value of X. Regression/ANOVA is a parametric model for the mean, as we will see.
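As a small illustration of "model for the mean," the sketch below groups units by their value of X and predicts the mean of Y within each group. The data values and the names data, mean_by_x, and predict are made up for this example. A spline extends the same idea to values of X that have few or no observed units by smoothing across neighboring values.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (X, Y) pairs: several units share each value of X.
data = [(1, 2.0), (1, 4.0), (2, 5.0), (2, 7.0), (2, 9.0), (3, 10.0)]

# Collect the Y values observed at each value of X.
y_values_by_x = defaultdict(list)
for x, y in data:
    y_values_by_x[x].append(y)

# The model for the mean: one predicted Y (the group mean) per value of X.
mean_by_x = {x: mean(ys) for x, ys in y_values_by_x.items()}

def predict(x):
    # Predict the mean Y among observed units that share this value of X.
    return mean_by_x[x]

print(mean_by_x)  # {1: 3.0, 2: 7.0, 3: 10.0}
```

Note that the units with X=2 have Y values of 5, 7, and 9, yet the model predicts a single number, 7, for all of them.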

Short and sweet.

During this tutorial you learned:


Terms and concepts:

Spline, overfitting, model for the mean