A Consistent Variance Estimator for 2SLS When Instruments Identify Different LATEs

Journal of Business and Economic Statistics (2018) 36:3, 400-410.

Under treatment effect heterogeneity, an instrument identifies the instrument-specific local average treatment effect (LATE). With multiple instruments, the two-stage least squares (2SLS) estimand is a weighted average of different LATEs. What is often overlooked in the literature is that the postulated moment condition evaluated at the 2SLS estimand does not hold unless those LATEs are the same. When the moment condition fails, the conventional heteroskedasticity-robust variance estimator is inconsistent, and 2SLS standard errors based on it are incorrect. I derive the correct asymptotic distribution and propose a consistent asymptotic variance estimator by using the result of Hall and Inoue (2003, Journal of Econometrics) on misspecified moment condition models. The proposed estimator yields correct standard errors regardless of whether there is more than one LATE.
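The failure of the moment condition can be seen in a small simulation. The sketch below is not taken from the paper or its replication files; the data-generating process (two independent binary instruments, two disjoint complier groups, and the function names simulate and tsls_and_moments) is a hypothetical setup chosen only for illustration. It computes the 2SLS estimate and the sample moments Z'(y - Xb)/n, once with instrument-specific effects of 1 and 3 and once with a common effect of 2.

```python
import numpy as np


def simulate(n, effect_z1, effect_z2, rng):
    """Hypothetical design: each unit complies with Z1 only, Z2 only, or never."""
    z1 = rng.integers(0, 2, size=n)
    z2 = rng.integers(0, 2, size=n)
    # 0: complier for Z1, 1: complier for Z2, 2: never-taker
    group = rng.choice(3, size=n, p=[0.3, 0.3, 0.4])
    d = (((group == 0) & (z1 == 1)) | ((group == 1) & (z2 == 1))).astype(float)
    # Never-takers are never treated, so the effect assigned to them is irrelevant.
    effect = np.where(group == 0, effect_z1, effect_z2)
    y = effect * d + rng.standard_normal(n)
    X = np.column_stack([np.ones(n), d])
    Z = np.column_stack([np.ones(n), z1, z2]).astype(float)
    return y, X, Z


def tsls_and_moments(y, X, Z):
    """Return the 2SLS estimate b and the sample moments Z'(y - Xb)/n."""
    ZtX = Z.T @ X
    A = ZtX.T @ np.linalg.inv(Z.T @ Z)   # X'Z (Z'Z)^{-1}
    b = np.linalg.solve(A @ ZtX, A @ (Z.T @ y))
    g = Z.T @ (y - X @ b) / len(y)
    return b, g


rng = np.random.default_rng(0)
b_het, g_het = tsls_and_moments(*simulate(200_000, 1.0, 3.0, rng))
b_hom, g_hom = tsls_and_moments(*simulate(200_000, 2.0, 2.0, rng))

print("heterogeneous LATEs: slope", b_het[1], "moments", g_het)
print("homogeneous LATEs:   slope", b_hom[1], "moments", g_hom)
```

In this design the 2SLS slope is close to 2 in both cases, but with different LATEs the moments involving Z1 and Z2 stay away from zero in large samples (about -0.075 and 0.075 in the population), whereas with a common effect all three moments shrink toward zero. 2SLS only sets a linear combination of the moments to zero, which is why the over-identified moment condition can fail at the estimand.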


  • pdf, arXiv

  • Stata package (instructions) downloadable from the SSC Archive: ssc install mlr2sls

  • Replication of Tables IV, V, and VI of Angrist and Krueger (1991, QJE) using the correct standard error formula for 2SLS [data / Matlab code (updated on Nov 17, 2015)]

  • Replication of Table 7 Columns 4-6 of Angrist and Evans (1998, AER) using the correct standard error formula for 2SLS [data / Matlab code (updated on Nov 17, 2015)]

  • Replication of Table 7 of Thornton (2008, AER) using the correct standard error formula (also cluster-robust) for 2SLS [data / Matlab code (updated on Apr 15, 2016)]

  • Simulation code based on random subsamples of Angrist and Evans (1998) dataset [Matlab code (updated on Apr 15, 2016)]

  • Figure 1 from the paper shows scatterplots of the p-value of the over-identifying restrictions test (J test) and of the F statistics against the percentage difference between the "Multiple-LATE robust" and "conventional non-robust" standard errors. Even when the J test marginally fails to reject the null hypothesis (p-value just above 0.05 or 0.1), the difference can be as large as 40%, so a user relying on the conventional standard errors may find false significance. The result is not sensitive to the strength of the instrument. BOTTOM LINE: When you use 2SLS with more instruments than endogenous variables and interpret your point estimates as the Local Average Treatment Effect, you should use the multiple-LATE robust standard error formula given in the paper.
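To see how a 40% gap in standard errors can produce false significance, consider the arithmetic sketch below. The point estimate and the conventional standard error are hypothetical numbers, not results from the paper, and the percentage difference is assumed here to be measured relative to the conventional standard error. The p-values use the standard normal approximation.

```python
import math


def two_sided_p(t_stat):
    """Two-sided p-value under the standard normal approximation."""
    return math.erfc(abs(t_stat) / math.sqrt(2))


estimate = 0.10                        # hypothetical 2SLS point estimate
se_conventional = 0.05                 # hypothetical conventional robust SE
se_robust = 1.40 * se_conventional     # assumed to be 40% larger

p_conventional = two_sided_p(estimate / se_conventional)
p_robust = two_sided_p(estimate / se_robust)

print("conventional: t =", estimate / se_conventional, "p =", p_conventional)
print("robust:       t =", estimate / se_robust, "p =", p_robust)
```

With these numbers the conventional t statistic is 2.0 and the estimate looks significant at the 5% level, while the larger standard error gives a t statistic of about 1.43, which is not significant even at the 10% level.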