R Part 5
The total length of the videos in this section is approximately 43 minutes, but you will also spend time running code while completing this section.
You can also view all the videos in this section at the YouTube playlist linked here.
Please download the two code files below in order to follow along with the videos.
Graphically checking whether logs would help
Question 1: Which assumptions can be made more true by logging one or both variables?
Answer:
linearity, equal variance, possibly normality
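As a rough illustration of the graphical check, the sketch below plots the same data with and without logging Y. The data are simulated here purely for demonstration (they are not the data from the videos), and the names `fit_raw` and `fit_log` are made up for this example.

```r
# Simulated (hypothetical) data: y grows exponentially in x with multiplicative noise
set.seed(1)
x <- runif(100, 1, 10)
y <- exp(0.5 + 0.3 * x + rnorm(100, sd = 0.3))

# Side-by-side scatterplots: raw outcome vs. logged outcome
par(mfrow = c(1, 2))
plot(x, y, main = "Raw Y")          # curved, and the spread widens as x grows
plot(x, log(y), main = "log(Y)")    # roughly linear with roughly constant spread
par(mfrow = c(1, 1))

# Fit both versions so the residuals can be compared afterwards
fit_raw <- lm(y ~ x)
fit_log <- lm(log(y) ~ x)
coef(fit_log)                       # slope should be close to the simulated 0.3
```

A curved, fan-shaped cloud in the left panel that straightens out in the right panel is the kind of pattern that suggests logging Y would help.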
Voltage example
Question 2: What happens if you have a bunch of dots along a curve, and you connect them in the wrong order?
Answer:
https://stackoverflow.com/questions/33700186/line-connecting-the-points-in-the-plot-function-in-r
(scroll down to "noise-less data")
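The effect is easy to reproduce. The sketch below uses made-up, noise-free points on a curve (not the voltage data) and shows how order() fixes the line; the variable names are chosen only for this example.

```r
# Hypothetical noise-free data on a curve, stored in scrambled x order
set.seed(2)
x <- sample(seq(0, 10, by = 0.5))
y <- x^2

par(mfrow = c(1, 2))
# type = "l" joins points in the order they are stored, so the line zigzags
plot(x, y, type = "l", main = "Unsorted")

# order(x) gives the row indices that sort x from smallest to largest
o <- order(x)
plot(x[o], y[o], type = "l", main = "Sorted by x")
par(mfrow = c(1, 1))
```

Note that the same index vector `o` is applied to both x and y, so each point keeps its own pair of coordinates.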
Bands, and how to make and interpret residual v. fitted value plots
Question 3: When are residual v. fitted value plots most useful?
Answer:
When you have multiple predictors and can't make one scatterplot to check the regression assumptions. No matter how many predictors you have, every data point will have one residual and one fitted value.
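A minimal sketch of such a plot with two predictors follows. It assumes R's built-in mtcars data set, which is used here only for convenience and is not the data set from the videos.

```r
# Regression with two predictors, so no single scatterplot shows the whole fit
fit <- lm(mpg ~ wt + hp, data = mtcars)

# Every observation still has exactly one fitted value and one residual
plot(fitted(fit), resid(fit),
     xlab = "Fitted values", ylab = "Residuals",
     main = "Residuals v. fitted values")
abline(h = 0, lty = 2)   # reference line: residuals should scatter evenly around 0
```

A curved trend in this plot points to a linearity problem, and a funnel shape points to unequal variance.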
predict function for linear regression
Question 4: Given a linear regression, if you don't specify a new data set to make predictions for, or if there is an error in your specification of a new data set, what does the predict function do?
Answer:
The predict function will output the fitted values for the data points used to fit the model.
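The sketch below shows this behavior, assuming R's built-in cars data set and a model fit to vectors named `x` and `y` (names chosen only for this illustration). The mismatched column name in `bad` is the kind of specification error the question refers to.

```r
# Fit the model on plain vectors rather than a data frame
x <- cars$speed
y <- cars$dist
fit <- lm(y ~ x)

# No new data: one fitted value per original data point
p_fitted <- predict(fit)
length(p_fitted)                 # 50, the number of rows in cars

# Correct new data: the column name matches the predictor name "x"
good <- data.frame(x = c(10, 20))
p_new <- predict(fit, newdata = good)
length(p_new)                    # 2

# Mismatched column name: R cannot find "x" in the new data, falls back to
# the original x, warns about the row counts, and returns the fitted values
bad <- data.frame(speed = c(10, 20))
p_bad <- predict(fit, newdata = bad)
length(p_bad)                    # 50, not 2
```

Checking the length of the output against the number of rows in the new data set is a quick way to catch this problem.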
predict function for trees and splines
Question 5: If you have a spline instead of a tree or lm, what is the "new" argument called?
Answer:
It is called "x".
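As a small sketch of the difference, the code below fits a smoothing spline with smooth.spline() and predicts at two new values. It assumes the built-in cars data set and a smoothing spline specifically; other spline functions may behave differently.

```r
# Fit a smoothing spline of stopping distance on speed
sp <- smooth.spline(cars$speed, cars$dist)

# New values go in the argument x, as a plain numeric vector
p_sp <- predict(sp, x = c(10, 20))

# The result is a list with components x (inputs) and y (predictions)
p_sp$x
p_sp$y

# For comparison, lm uses newdata and expects a data frame
fit <- lm(dist ~ speed, data = cars)
p_lm <- predict(fit, newdata = data.frame(speed = c(10, 20)))
```

Because the spline version returns a list rather than a vector, the predictions are in `p_sp$y`.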
That's it.
During this tutorial you learned:
What a scatterplot of the data looks like when logging the outcome variable (Y) and/or the predictor variable (X) would lead to a data set that better meets the assumptions of a linear regression, including linearity, equal variance, and/or normality
How to plot the fitted values for each value of x with points() and abline()
Two ways to plot the confidence intervals and prediction intervals for a linear regression model: by direct calculation or with the predict() function
How to plot residuals versus fitted values and how to use that plot to check the assumptions of a linear regression model
How to use the predict() function and tips to troubleshoot common problems
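As a recap of the two ways to get interval bands listed above, the sketch below computes 95% bands both with predict() and by hand with qt(). It assumes the built-in cars data set and a single predictor; the variable names are chosen only for this example, and the hand calculation uses the standard simple linear regression formulas.

```r
fit <- lm(dist ~ speed, data = cars)
grid <- data.frame(speed = seq(min(cars$speed), max(cars$speed), length.out = 50))

# Way 1: let predict() compute the bands
conf_int <- predict(fit, newdata = grid, interval = "confidence", level = 0.95)
pred_int <- predict(fit, newdata = grid, interval = "prediction", level = 0.95)

# Way 2: calculate the bands by hand
n     <- nrow(cars)
s     <- summary(fit)$sigma                  # residual standard error
xbar  <- mean(cars$speed)
sxx   <- sum((cars$speed - xbar)^2)
tcrit <- qt(0.975, df = n - 2)               # t multiplier for a 95% interval
yhat  <- coef(fit)[1] + coef(fit)[2] * grid$speed
se_mean <- s * sqrt(1 / n + (grid$speed - xbar)^2 / sxx)      # for the mean of Y
se_pred <- s * sqrt(1 + 1 / n + (grid$speed - xbar)^2 / sxx)  # for a new Y
ci_lwr <- yhat - tcrit * se_mean
ci_upr <- yhat + tcrit * se_mean
pi_lwr <- yhat - tcrit * se_pred
pi_upr <- yhat + tcrit * se_pred

# Plot the data, the fitted line, and both sets of bands
plot(dist ~ speed, data = cars)
abline(fit)
lines(grid$speed, ci_lwr, lty = 2); lines(grid$speed, ci_upr, lty = 2)
lines(grid$speed, pi_lwr, lty = 3); lines(grid$speed, pi_upr, lty = 3)
```

The prediction bands are always wider than the confidence bands because they also account for the scatter of individual points around the line.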
Terms and concepts:
Log transformation, confidence interval, prediction interval, residuals versus fitted values plot
Functions in review:
exp(), log(), qt(), order(), predict()