Autocorrelation

MISUNDERSTANDING: Autocorrelation is a PROBLEM, and therefore:

1. We need to remove it -- the Cochrane-Orcutt method is one such technique.

2. If we cannot remove it, we should use GLS instead of OLS, since OLS is inefficient.

BOTH OF THESE ARE WRONG approaches to autocorrelation.

Consider the model Y(t) = a + b X(t) + e(t), and suppose e(t) = r e(t-1) + u(t), where u(t) is not autocorrelated.

Note that e(t-1) = Y(t-1) - a - b X(t-1). It follows that e(t) = r [ Y(t-1) - a - b X(t-1) ] + u(t). Substitute into the original equation to get:

Y(t) = a + b X(t) + r [ Y(t-1) - a - b X(t-1) ] + u(t) -- this equation does not have autocorrelated errors since u(t) is not autocorrelated.

REWRITE to get:

Y(t) = a(1-r) + b X(t) + r Y(t-1) - br X(t-1) + u(t) = a* + b X(t) + c* Y(t-1) + d* X(t-1) + u(t), where a* = a(1-r), c* = r and d* = -br.

By adding lags of ALL variables, we have removed the autocorrelation.
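The derivation above can be checked with a small simulation. What follows is only a sketch: the parameter values, the seed, the sample size and the helper names (ols_residuals, lag1_autocorr) are my own illustrative assumptions, and X(t) is simply drawn as independent noise. It fits the static model and the model with one lag of all variables by OLS, and compares the first-order autocorrelation of the two sets of residuals.

```python
import numpy as np

# Hypothetical values chosen only for illustration.
rng = np.random.default_rng(0)
n, a, b, r = 2000, 1.0, 2.0, 0.7

# Simulate Y(t) = a + b X(t) + e(t) with e(t) = r e(t-1) + u(t).
x = rng.normal(size=n)
u = rng.normal(size=n)
e = np.zeros(n)
e[0] = u[0]
for t in range(1, n):
    e[t] = r * e[t - 1] + u[t]
y = a + b * x + e


def ols_residuals(X, y):
    """Return OLS coefficients and residuals for the regression of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta, y - X @ beta


def lag1_autocorr(res):
    """Sample first-order autocorrelation of a residual series."""
    res = res - res.mean()
    return float(res[1:] @ res[:-1] / (res @ res))


# STATIC model: regress Y(t) on a constant and X(t).
X_static = np.column_stack([np.ones(n), x])
_, res_static = ols_residuals(X_static, y)

# DYNAMIC model: add Y(t-1) and X(t-1); the first observation is lost.
X_dyn = np.column_stack([np.ones(n - 1), x[1:], y[:-1], x[:-1]])
beta_dyn, res_dyn = ols_residuals(X_dyn, y[1:])

rho_static = lag1_autocorr(res_static)
rho_dyn = lag1_autocorr(res_dyn)

print("lag-1 residual autocorrelation, static model :", round(rho_static, 3))
print("lag-1 residual autocorrelation, dynamic model:", round(rho_dyn, 3))
print("dynamic coefficients (a*, b, c*, d*):", np.round(beta_dyn, 3))
```

Under these assumptions the static residuals should show an autocorrelation close to r, while the residuals of the model with lags should show almost none, and the estimated coefficients should be close to a(1-r), b, r and -br.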

THE ABOVE is the technical and mathematical explanation, but INTUITIVE UNDERSTANDING of this issue is equally important.

STATIC MODEL: Every period is independent of every other period. Equilibrium occurs within one period and each period is self-contained: what happens in one period does not affect what happens in the next.

DYNAMIC MODEL: Things which happen in period T affect what happens in period T+1.

AUTOCORRELATION is a MESSAGE from the data. The data is telling you that the errors in the last period HAD an effect on the current period. That means that your idea that the model is STATIC is wrong. You have a DYNAMIC MODEL. You have MIS-SPECIFIED the model, and mistakenly assumed it to be static when it was dynamic.

A model with AUTOCORRELATED errors is a VERY SPECIAL type of dynamic model.

GENERAL DYNAMIC MODEL: Y(t) = a + b X(t) + c Y(t-1) + d X(t-1) + u(t) -- the general model has c,d unrestricted.

SPECIAL CASE: If d = -bc, then we can set c = r and RE-WRITE this as a model with an AR(1) error.

ONCE we know that there are dynamic effects, WE SHOULD NOT assume that d = -bc (this is called a COMMON FACTOR restriction, for complicated reasons I cannot explain here). Instead we should estimate the GENERAL model with no restrictions. This is an application of the principle of General to Simple modelling, which says that we should start with General models and Simplify them if needed. That is, we can TEST the common factor restriction, and if it holds, then we can simplify the model to one with an AR(1) error. If it does not hold, then we should keep the general model.
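One possible way to test the common factor restriction is a Wald test based on the delta method. This is a sketch under my own assumptions, not necessarily the procedure intended here: the function names (simulate, common_factor_wald), the parameter values and the seed are hypothetical. The statistic measures how far the estimate of d + bc is from zero, and is compared with 3.84, the 5% critical value of the chi-square distribution with one degree of freedom. The result is asymptotic only.

```python
import numpy as np


def simulate(n, a, b, c, d, rng):
    """Simulate Y(t) = a + b X(t) + c Y(t-1) + d X(t-1) + u(t)."""
    x = rng.normal(size=n)
    u = rng.normal(size=n)
    y = np.zeros(n)
    y[0] = a + b * x[0] + u[0]
    for t in range(1, n):
        y[t] = a + b * x[t] + c * y[t - 1] + d * x[t - 1] + u[t]
    return y, x


def common_factor_wald(y, x):
    """Estimate the general model by OLS and test d + b*c = 0.

    Returns the estimate of g = d + b*c and the Wald statistic,
    which is approximately chi-square(1) if the restriction holds.
    """
    Y = y[1:]
    X = np.column_stack([np.ones(len(Y)), x[1:], y[:-1], x[:-1]])
    beta, *_ = np.linalg.lstsq(X, Y, rcond=None)
    resid = Y - X @ beta
    s2 = resid @ resid / (len(Y) - X.shape[1])   # error variance estimate
    V = s2 * np.linalg.inv(X.T @ X)              # covariance of beta
    _, b_hat, c_hat, d_hat = beta
    g = d_hat + b_hat * c_hat
    grad = np.array([0.0, c_hat, b_hat, 1.0])    # gradient of g w.r.t. beta
    wald = g ** 2 / (grad @ V @ grad)
    return float(g), float(wald)


rng = np.random.default_rng(1)  # arbitrary seed

# Case 1: restriction holds (d = -b*c), so the errors are really AR(1).
y1, x1 = simulate(2000, a=0.3, b=2.0, c=0.7, d=-1.4, rng=rng)
g_ar1, wald_ar1 = common_factor_wald(y1, x1)

# Case 2: restriction fails (d = 0.5), a genuinely general dynamic model.
y2, x2 = simulate(2000, a=0.3, b=2.0, c=0.7, d=0.5, rng=rng)
g_gen, wald_gen = common_factor_wald(y2, x2)

print("restriction true : g =", round(g_ar1, 3), " Wald =", round(wald_ar1, 2))
print("restriction false: g =", round(g_gen, 3), " Wald =", round(wald_gen, 2))
```

A Wald statistic above 3.84 is evidence against the restriction, in which case the general model is kept. A small statistic means the data do not contradict the simplification to an AR(1) error.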

PRACTICAL CONCLUSION: One should not test for autocorrelation. If we are worried about the possibility of a DYNAMIC model, then we should put ONE lag of ALL variables (dependent AND independent) into the regressors, and test ALL of these lagged variables for joint significance using the standard F test. If we cannot reject the null hypothesis that ALL of the coefficients on the lagged variables are zero, this means that the model is NOT DYNAMIC -- or at least that the dynamic effects are NOT SIGNIFICANT. Now we can drop all of the lagged variables and go back to the STATIC model with confidence that there will be no autocorrelation, because autocorrelation is a SPECIAL type of dynamic model and we have tested for ALL possible dynamic effects (of first order).
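The joint F test described above can be sketched as follows, for one regressor and one lag. The function names (rss, f_test_lags), the simulated data and the seed are illustrative assumptions of mine. Both the static and the dynamic regression are run on the same sample, which drops the first observation, and the F statistic compares their residual sums of squares.

```python
import numpy as np


def rss(X, y):
    """Residual sum of squares from the OLS regression of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)


def f_test_lags(y, x):
    """F statistic for H0: coefficients on Y(t-1) and X(t-1) are both zero.

    Returns (F, q, df), where q is the number of restrictions and df the
    residual degrees of freedom of the unrestricted (dynamic) model.
    """
    Y = y[1:]
    T = len(Y)
    X_static = np.column_stack([np.ones(T), x[1:]])        # restricted
    X_dynamic = np.column_stack([X_static, y[:-1], x[:-1]])  # unrestricted
    q = X_dynamic.shape[1] - X_static.shape[1]
    df = T - X_dynamic.shape[1]
    rss_r, rss_u = rss(X_static, Y), rss(X_dynamic, Y)
    F = ((rss_r - rss_u) / q) / (rss_u / df)
    return F, q, df


rng = np.random.default_rng(2)  # arbitrary seed
n = 500
x = rng.normal(size=n)
u = rng.normal(size=n)

# Truly static data: Y(t) = 1 + 2 X(t) + u(t).
y_static = 1.0 + 2.0 * x + u

# Data with an AR(1) error, e(t) = 0.7 e(t-1) + u(t).
e = np.zeros(n)
e[0] = u[0]
for t in range(1, n):
    e[t] = 0.7 * e[t - 1] + u[t]
y_dynamic = 1.0 + 2.0 * x + e

F_static, q, df = f_test_lags(y_static, x)
F_dynamic, _, _ = f_test_lags(y_dynamic, x)

print("F statistic, static data :", round(F_static, 2), "with", (q, df), "d.f.")
print("F statistic, dynamic data:", round(F_dynamic, 2), "with", (q, df), "d.f.")
```

The statistic is compared with the critical value of the F distribution with (q, df) degrees of freedom; if SciPy is available, scipy.stats.f.sf(F, q, df) gives the p-value. A large F means the lagged variables matter and the static model should not be used.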

FOR SIMPLICITY, I have discussed the situation with ONE regressor and ONE lag. The same principles hold for MULTIPLE regressors and MULTIPLE lags. With more than one regressor, we have to include lags of ALL regressors. Similarly, for second order DYNAMIC effects, we should include 2nd order lags. With quarterly data, one should put in at least five lags of all variables when testing for dynamic effects. With monthly data, 13 lags are recommended. If, due to data shortage or some other reason, it is not possible to put in so many lagged variables, there are some tricks and short cuts which can be used, but these depend very much on the particular situation at hand, and cannot be described in general.
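With several regressors and several lags, building the list of regressors by hand is error-prone. The helper below is a minimal sketch of one way to do it; the function name build_dynamic_design and the demonstration data are hypothetical. It returns the dependent variable and a design matrix containing a constant, the current regressors, and lags 1 to n_lags of the dependent variable and of every regressor.

```python
import numpy as np


def build_dynamic_design(y, X, n_lags):
    """Build the regressors for a dynamic model with n_lags lags of ALL variables.

    y is a series of length T; X has shape (T, k), or (T,) for one regressor.
    Column order: constant, X(t), then for each lag j = 1..n_lags:
    Y(t-j) followed by X(t-j). The first n_lags observations are dropped.
    """
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    if X.ndim == 1:
        X = X[:, None]
    T = len(y)
    cols = [np.ones(T - n_lags), X[n_lags:]]
    for j in range(1, n_lags + 1):
        cols.append(y[n_lags - j:T - j])   # Y(t-j)
        cols.append(X[n_lags - j:T - j])   # X(t-j), all k regressors
    return y[n_lags:], np.column_stack(cols)


# Illustration: 100 quarterly observations, 2 regressors, 5 lags.
rng = np.random.default_rng(3)  # arbitrary seed
y_demo = rng.normal(size=100)
X_demo = rng.normal(size=(100, 2))

dep, design = build_dynamic_design(y_demo, X_demo, n_lags=5)
print(design.shape)  # (95, 18): 1 constant + 2 current + 5 * (1 + 2) lagged
```

The number of columns grows quickly (here 18 from only two regressors), which is why a data shortage can make the full set of lags infeasible.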
