L05 EDA
Looking at Data: Exploratory Data Analysis (EDA)
This lecture shows why it is essential to look at the data before doing regression analysis. Exactly the same regression output can come from very different data sets and have entirely different meanings. In particular, the regression output alone does not reveal problems with:
FUNCTIONAL FORM (linear or not)
OUTLIERS
CLUSTERS OF DATA (combining them leads to misleading results)
MISSING IMPORTANT OR RELEVANT VARIABLES
All of these problems can lead to HIGHLY MISLEADING regression results.
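As a small illustration of the CLUSTERS point, the sketch below fits a least-squares line to two made-up clusters separately and then to the combined data. The numbers and the helper name ols_slope_intercept are hypothetical and chosen only for illustration; they are not taken from the lecture or from the thesis it discusses.

```python
def ols_slope_intercept(xs, ys):
    """Return (slope, intercept) of the least-squares line y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Made-up data (assumption): two clusters, each with a downward trend.
cluster_a = ([1, 2, 3], [3, 2, 1])
cluster_b = ([6, 7, 8], [8, 7, 6])

slope_a, _ = ols_slope_intercept(*cluster_a)
slope_b, _ = ols_slope_intercept(*cluster_b)

# Pooling the clusters ignores the group structure.
pooled_x = cluster_a[0] + cluster_b[0]
pooled_y = cluster_a[1] + cluster_b[1]
slope_pooled, _ = ols_slope_intercept(pooled_x, pooled_y)

print("slope within cluster A:", slope_a)
print("slope within cluster B:", slope_b)
print("slope of pooled data:  ", slope_pooled)
```

Within each cluster the fitted slope is negative, but the pooled regression reports a positive slope because it mostly picks up the gap between the two clusters. A scatter plot of the data would show this immediately, while the regression output would not.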
The lecture ends by examining structural change in the INDY 500 Winning Speed DATA. The HOMEWORK is to look for linearity and structural change in the GNP series. If OUTLIERS can be found, they should be checked as well. In connection with structural change, the following two articles are very useful to look at:
Tests for Structural Change, Aggregation, and Homogeneity
Changing Point and Parameter Instability with Heteroskedastic Models
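One common way to check for structural change at a known date is a Chow-type F test, which compares a single regression over the whole sample with separate regressions before and after the candidate break. The sketch below is only an illustration under assumptions: the lecture and the two articles may use different procedures, the data are made up (they are not the Indy 500 or GNP series), and the names ols_rss and chow_f are hypothetical.

```python
def ols_rss(xs, ys):
    """Residual sum of squares from fitting y = a + b*x by least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return sum((y - intercept - slope * x) ** 2 for x, y in zip(xs, ys))


def chow_f(xs, ys, break_index, k=2):
    """Chow-type F statistic for a break at break_index.

    k is the number of parameters per regression (intercept and slope).
    """
    n = len(xs)
    rss_pooled = ols_rss(xs, ys)
    rss_split = (ols_rss(xs[:break_index], ys[:break_index])
                 + ols_rss(xs[break_index:], ys[break_index:]))
    return ((rss_pooled - rss_split) / k) / (rss_split / (n - 2 * k))


# Made-up series (assumption): a deterministic wiggle stands in for noise.
t = list(range(20))
noise = [0.5 if i % 2 == 0 else -0.5 for i in t]

# One line throughout, versus a change of level and slope at t = 10.
y_no_break = [10 + 2 * i + e for i, e in zip(t, noise)]
y_break = [(10 + 2 * i if i < 10 else 60 + 0.5 * i) + e
           for i, e in zip(t, noise)]

f_no_break = chow_f(t, y_no_break, break_index=10)
f_break = chow_f(t, y_break, break_index=10)

print("F without a break:", f_no_break)
print("F with a break:   ", f_break)
```

The statistic is small for the series with a single trend and very large for the series with a break. In practice the break date is usually not known in advance, so plotting the series first is what suggests where to look.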
Video Lecture (my website) - Lecture on the need for graphical/visual understanding of data
L3 Necessity of EDA prior to Regression - Discusses the thesis of Uzma Bashir, Determinants of Corporate Philanthropy. Shows how clusters mislead regressions.