Module 2: Regression
The Guru is Brahma, the Guru is Vishnu, the Guru is the god Maheshwara; the Guru is verily the supreme Brahman. Salutations to that revered Guru!
Question 1 What are the requirements for independent and dependent variables in regression?
Independent variables can be either categorical or continuous. Dependent variables must be continuous.
Independent and dependent variables must be continuous.
Independent variables must be continuous. Dependent variables can be either categorical or continuous.
Independent and dependent variables can be either categorical or continuous.
Question 2 The key difference between simple and multiple regression is:
Multiple linear regression introduces polynomial features.
Simple linear regression compresses multidimensional space into one dimension.
To estimate a single dependent variable, simple regression uses one independent variable whereas multiple regression uses multiple.
Simple regression assumes a linear relationship between variables, whereas this assumption is not necessary for multiple regression.
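The distinction in Question 2 can be made concrete with a small sketch that fits the same target once with one independent variable and once with two. Everything here is an illustrative assumption rather than course material: the data are made up, and `fit_linear` and `solve_linear_system` are hypothetical helpers that solve the least-squares normal equations in plain Python.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are not modified.
    m = [row[:] + [b_i] for row, b_i in zip(a, b)]
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_linear(rows, y):
    """Least-squares fit of y = b0 + b1*x1 + ... via the normal equations."""
    design = [[1.0] + list(r) for r in rows]  # leading 1.0 is the intercept
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * t for d, t in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Made-up data generated exactly from y = 1 + 2*x1 + 3*x2.
x1 = [0, 1, 2, 3, 4]
x2 = [1, 0, 2, 1, 3]
y = [1 + 2 * a + 3 * b for a, b in zip(x1, x2)]

# Simple regression: one independent variable (x1 only).
simple_coefs = fit_linear([(a,) for a in x1], y)

# Multiple regression: two independent variables (x1 and x2).
multiple_coefs = fit_linear(list(zip(x1, x2)), y)

print("simple   [b0, b1]:    ", simple_coefs)
print("multiple [b0, b1, b2]:", multiple_coefs)
```

Both fits estimate a single dependent variable and both are linear in their inputs; the only difference is how many independent variables enter the model. With these made-up data, the two-variable fit recovers the generating coefficients, while the one-variable fit cannot, because it has to absorb the effect of `x2` into its slope and intercept.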
Question 3 Recall that we tried to predict CO2 emission from car information. Suppose the relationship can now be described as:
CO2_emission = 130 - 2.4*cylinders + 8.3*fuel_consumption
What is TRUE of this relationship?
When “cylinders” decreases by 1 while fuel_consumption remains constant, CO2_emission increases by 2.4 units.
When “cylinders” increases by 1 while fuel_consumption remains constant, CO2_emission increases by 2.4 units.
Since the coefficient for “fuel_consumption” is greater than that for “cylinders”, “fuel_consumption” has a lower impact on CO2_emission.
When both “cylinders” and “fuel_consumption” increase by 1 unit, CO2_emission decreases.
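One way to check statements like those in Question 3 is to evaluate the equation directly and compare predictions before and after changing one input. The sketch below uses the equation exactly as given; the baseline values (4 cylinders, fuel consumption of 8.0) are arbitrary assumptions chosen only for illustration.

```python
def co2_emission(cylinders, fuel_consumption):
    """Evaluate the relationship given in Question 3."""
    return 130 - 2.4 * cylinders + 8.3 * fuel_consumption


# Arbitrary baseline car, assumed for illustration only.
base = co2_emission(4, 8.0)

# Change one input at a time while holding the other constant.
delta_cylinders_up = co2_emission(5, 8.0) - base
delta_cylinders_down = co2_emission(3, 8.0) - base

# Increase both inputs by one unit.
delta_both_up = co2_emission(5, 9.0) - base

print("cylinders +1:", round(delta_cylinders_up, 2))
print("cylinders -1:", round(delta_cylinders_down, 2))
print("both +1:     ", round(delta_both_up, 2))
```

Because the model is linear, these differences do not depend on the baseline chosen: each one is determined by the coefficients alone.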
Question 4 What could be the cause of a model yielding high training accuracy and low out-of-sample accuracy?
The model is training on the entire dataset, so it is underfitting.
The model is training on a small training set, so it is underfitting.
The model is training on a small training set, so it is overfitting.
When we perform multiple train/test splits using the same dataset, it will cause overfitting.
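The gap between training and out-of-sample performance in Question 4 can be illustrated with a small sketch. It is a toy setup, not the course's method: the data are made up, the "flexible" model is a polynomial that passes through every training point, and mean squared error stands in for accuracy (lower error corresponds to higher accuracy). The helper names are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares straight line through the points (closed form)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


def fit_interpolating_polynomial(xs, ys):
    """Lagrange polynomial that passes exactly through every training point."""
    def predict(x):
        total = 0.0
        for i, (xi, yi) in enumerate(zip(xs, ys)):
            term = yi
            for j, xj in enumerate(xs):
                if j != i:
                    term *= (x - xj) / (xi - xj)
            total += term
        return total
    return predict


def mse(model, xs, ys):
    """Mean squared error of the model on the given points."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Small, noisy training set drawn around the assumed true line y = 2x + 1.
train_x = [0.0, 1.0, 2.0, 3.0]
noise = [0.5, -0.5, 0.5, -0.5]
train_y = [2 * x + 1 + e for x, e in zip(train_x, noise)]

# Unseen points that follow the true line without noise.
test_x = [0.5, 1.5, 2.5, 3.5]
test_y = [2 * x + 1 for x in test_x]

line = fit_line(train_x, train_y)
flexible = fit_interpolating_polynomial(train_x, train_y)

for name, model in [("line", line), ("interpolating polynomial", flexible)]:
    print(name,
          "| train MSE:", round(mse(model, train_x, train_y), 4),
          "| test MSE:", round(mse(model, test_x, test_y), 4))
```

In this toy setup the interpolating polynomial reproduces the four training points exactly, noise included, yet does much worse than the straight line on the unseen points. That is the pattern the question describes: a model that tracks a small training set too closely and fails to generalize.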
Question 5 Multiple Linear Regression is appropriate for:
Predicting the sales amount based on month.
Predicting tomorrow’s rainfall amount based on the wind speed and temperature.
Predicting whether a drug is effective for a patient based on her characteristics.