FDI-Growth

SORRY -- link given on blog is wrong -- the SECOND EXAMPLE is IPR-GROWTH

This page provides an exercise in applying the concepts discussed in Section 1: USE the concepts of the first section to provide a CRITIQUE of the SECOND model (not the VAR model).

The paper explores the relation between Economic Growth (EG) and FDI. After doing the usual integration and cointegration analysis, the author finds that EG and FDI are both stationary, and that they are also co-integrated, so they have a long-term relationship. Then he runs a Vector Error Correction model following the Engle-Granger two-step procedure: the first step is to save the errors from the cointegration equation, and the second step is to run current EG and FDI on lagged errors. The results are displayed in the extract from the paper, which can be downloaded from the bottom of the page.
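To make the two steps concrete, here is a minimal sketch on SIMULATED data (not the paper's data). The series, the coefficient values and the helper `ols` are all hypothetical. The sketch assumes the usual textbook form of the procedure: both series are integrated (a random walk drives them), and the second step regresses the CHANGE in each variable on the lagged error. The paper's own specification may differ.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 500

# Hypothetical data (NOT the paper's): log FDI is a random walk, and log EG
# is tied to it in the long run, so the two series are cointegrated.
fdi = np.cumsum(rng.normal(size=T))
eg = 1.0 + 0.5 * fdi + rng.normal(scale=0.5, size=T)


def ols(y, X):
    """Return OLS coefficients, t-statistics and residuals."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (X.shape[0] - X.shape[1])
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
    return beta, beta / se, resid


# Step 1: cointegrating regression in levels; save the errors.
beta_lr, _, ect = ols(eg, np.column_stack([np.ones(T), fdi]))

# Step 2: regress the change in each variable on the lagged error.
X2 = np.column_stack([np.ones(T - 1), ect[:-1]])
b_eg, t_eg, _ = ols(np.diff(eg), X2)
b_fdi, t_fdi, _ = ols(np.diff(fdi), X2)

print("long-run slope:", beta_lr[1])
print("adjustment coefficient, EG equation:", b_eg[1], "t =", t_eg[1])
print("adjustment coefficient, FDI equation:", b_fdi[1], "t =", t_fdi[1])
```

A negative and significant coefficient on the lagged error in the EG equation means that EG moves back towards the long-run relationship after a deviation. Note that the t-statistics of the first-step regression do not have the standard distribution, so only the second-step ones are printed.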

The SECOND equation estimated is the following:

Log(EG) = a1 + a2 Log(FDI) + a3 LIT + a4 Log(FDI * LIT) + a5 Log(Exports) + Error     (EQUATION)

Here LIT is the literacy rate and FDI * LIT is an interaction term.

After running this regression, which gets significant coefficients on each term, the author provides an interpretation in the usual way.

Please CRITIQUE this paper in the light of your understanding of the introduction of the DHSY paper -- as explained in detail

Questions and Answers are being posted on the BLOG

CRITIQUE:

Consider the paper "I Just Ran Two Million Regressions" by Sala-i-Martin. In this paper, he uses certain robustness techniques to assess which regressors are significant in a regression of GDP growth rates on a cluster of 59 variables. The basic idea is to run a lot of different regressions: if a variable remains significant regardless of which combination of other variables is used, then it is assessed to be significant. This is a VARIANT of the EXTREME BOUNDS ANALYSIS first proposed by Edward Leamer. In Extreme Bounds, a variable must be SIGNIFICANT in ALL regressions with ALL combinations of variables. Sala-i-Martin relaxes this: roughly speaking, he looks at variables which are significant in 95% of the regressions (his actual criterion is based on the distribution of the estimated coefficient across regressions). From this analysis, he concludes that 22 variables out of the 59 are significant. Clearly this is a lot more variables than are fitted in the EQUATION above, so if Sala-i-Martin is right then the above equation is MIS-SPECIFIED and suffers from many MISSING VARIABLES, which belong in the equation but have not been put in. We know from standard misspecification analysis that this will cause BIAS in the estimates whenever the missing variables are correlated with the included ones.
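The "run a lot of regressions" idea can be sketched in a few lines. This is only an illustration of the simplified 95% rule described above, on SIMULATED data: the six candidate regressors, the coefficients, and the function names `t_stats` and `share_significant` are hypothetical, and this is not Sala-i-Martin's actual procedure or data.

```python
import itertools
import numpy as np

rng = np.random.default_rng(1)
n, k = 200, 6

# Hypothetical candidate regressors; growth truly depends on 0 and 1 only.
Z = rng.normal(size=(n, k))
growth = 1.0 + 1.0 * Z[:, 0] + 0.8 * Z[:, 1] + rng.normal(size=n)


def t_stats(y, X):
    """OLS t-statistics for every column of X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (X.shape[0] - X.shape[1])
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
    return beta / se


def share_significant(j, n_controls=2, crit=1.96):
    """Share of regressions in which candidate j is significant,
    over all choices of n_controls other candidates as controls."""
    others = [c for c in range(k) if c != j]
    hits = total = 0
    for combo in itertools.combinations(others, n_controls):
        X = np.column_stack([np.ones(n), Z[:, j], Z[:, list(combo)]])
        hits += int(abs(t_stats(growth, X)[1]) > crit)
        total += 1
    return hits / total


shares = {j: share_significant(j) for j in range(k)}
# Extreme bounds would demand a share of 1.0; the relaxed rule asks for 0.95.
robust = [j for j, s in shares.items() if s >= 0.95]
print(shares)
print("robust candidates:", robust)
```

With 59 candidates instead of 6, the number of combinations explodes, which is where the "two million regressions" of the title come from.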

Similarly, Hendry-Krolzig start with a GUM -- General Unrestricted Model. This is the biggest model, containing all possible regressors available. Then the general-to-specific strategy is used to simplify the model down to the simplest possible model compatible with the data, by eliminating all variables, or combinations of variables, which are not significant. NOTE THAT this step is NOT clearly specified -- there are many possible ways to reduce -- and this is a WEAKNESS in the Hendry methodology. ANYWAY, ignoring this problem for a moment, Hendry-Krolzig come up with the following model. Three of the KEY variables are:

Number of years as an open economy, Equipment Investment, and Fraction of the population that follows the Confucian religion.

But the following variables also come out significant: the fraction of Protestants, and a variable for political stability.
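To see what a reduction from a GUM looks like, here is a minimal sketch of ONE possible reduction path: drop the least significant regressor, re-estimate, and repeat until everything left is significant. This is NOT the Hendry-Krolzig algorithm, which searches many paths and applies diagnostic tests. It is only a simple backward elimination on SIMULATED data, and the names `backward_eliminate` and `t_stats` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
n, k = 300, 8

# Hypothetical GUM with 8 regressors; only columns 0 and 3 matter.
X_all = rng.normal(size=(n, k))
y = 2.0 + 1.0 * X_all[:, 0] - 0.8 * X_all[:, 3] + rng.normal(size=n)


def t_stats(y, X):
    """OLS t-statistics for every column of X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (X.shape[0] - X.shape[1])
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
    return beta / se


def backward_eliminate(y, X_all, crit=2.0):
    """Start from the full model and drop the least significant
    regressor one at a time until all remaining |t| >= crit."""
    keep = list(range(X_all.shape[1]))
    while keep:
        X = np.column_stack([np.ones(len(y)), X_all[:, keep]])
        t = np.abs(t_stats(y, X)[1:])  # skip the intercept
        worst = int(np.argmin(t))
        if t[worst] >= crit:
            break
        keep.pop(worst)
    return keep


selected = backward_eliminate(y, X_all)
print("variables kept:", selected)
```

A different rule (for example, dropping blocks of variables with an F-test, or starting the search from a different variable) can end at a different final model. This is exactly the ambiguity in the reduction step noted above.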

On the basis of these alternative models, we note that IF WE DON'T USE A COMPARATIVE METHODOLOGY then there will be a PROLIFERATION of models: everyone can come up with a new model, since he/she is not required to prove that his/her model is the best.

So Encompassing requires that we show our model is better than those which have already been proposed. Given that earlier authors have already examined a lot of models and selected the best one from among millions, it seems very likely that both the Sala-i-Martin and the Hendry-Krolzig models are better than the one being tested and estimated by the author. At the least, since the author is writing AFTER these two, it is his responsibility to show what the defect in the previous models is, and to explain how his model remedies it.

BUT MORE IMPORTANT is the MIS-SPECIFICATION problem. If one important variable is omitted from an equation, then other variables act as a PROXY for the omitted variable.

For example, if CP = a + b YP is the consumption function for Pakistan, suppose we take out YP and replace it by YB, the GNP of Brazil. In the equation CP = a + b YB, the Brazil GNP will come out highly significant. Thus the person who runs this regression will conclude that YB, the GNP of Brazil, is an important determinant of Consumption in Pakistan. The reason for this mistake is that the right variable YP has been OMITTED. Now YB gets significance because it PROXIES for the missing variable, taking its place. The estimated coefficient is really picking up the strength of the correlation between YP and YB, and NOT any effect of YB on consumption in Pakistan.
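The proxy effect is easy to reproduce in a small simulation. In the sketch below all numbers are made up: YP and YB are two series which share nothing except an upward trend, and CP is generated from YP alone. The helper `t_stats` is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)
T = 100
trend = np.arange(T)

# Hypothetical series: both incomes trend upwards, but are otherwise unrelated.
yp = 100 + 2.0 * trend + rng.normal(scale=5, size=T)   # income, Pakistan
yb = 300 + 5.0 * trend + rng.normal(scale=10, size=T)  # GNP, Brazil
cp = 10 + 0.8 * yp + rng.normal(scale=3, size=T)       # depends on YP ONLY


def t_stats(y, X):
    """OLS t-statistics for every column of X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (X.shape[0] - X.shape[1])
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
    return beta / se


const = np.ones(T)

# Wrong model: YP omitted, YB used in its place.
t_wrong = t_stats(cp, np.column_stack([const, yb]))[1]

# Model with both: YB no longer has anything to proxy for.
t_both = t_stats(cp, np.column_stack([const, yp, yb]))

print("t-stat of YB when YP is omitted: ", t_wrong)
print("t-stat of YP when both included: ", t_both[1])
print("t-stat of YB when both included: ", t_both[2])
```

When YP is left out, YB gets a very large t-statistic even though it plays no role in generating CP. Once YP is put back, the t-statistic of YB falls dramatically, because YB was only standing in for the omitted variable.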

In exactly the same way, suppose that Hendry-Krolzig have estimated the right equation. THEN ALL of the variables in the paper being critiqued are WRONG: NONE of them belong in the equation, exactly like YB does not belong in the consumption function of Pakistan. HOWEVER, because the author has EXCLUDED all of the RIGHT variables (whether these are the Hendry-Krolzig variables or the Sala-i-Martin variables), the wrong variables act as PROXIES for the right ones. They are picking up the significance of the right variables as substitutes. Their coefficients and significance mean nothing, because they are due to the severe bias induced by the strong misspecification of having omitted ALL of the right variables from the regression.