Working papers

(Honourable Mention for "Best PhD Paper" Award, IAAE Conference 2022)

This paper studies the problem of estimating individualized treatment rules when treatment effects are partially identified, as is often the case with observational data. We first study the population problem of assigning treatment under partial identification and derive the population optimal policies using classic optimality criteria for decision-making under ambiguity. We then propose an algorithm for computing the estimated optimal treatment policy and provide statistical guarantees for its convergence to the population counterpart. Our estimation procedure leverages recent advances in the orthogonal machine learning literature, while our theoretical results account for the presence of non-differentiabilities in the problem. The proposed methods are illustrated using data from the Job Training Partnership Act study.


This paper proposes a new instrumental variable (IV) estimator for non-linear models with endogenous covariates. We choose the estimate of the regression coefficients on the endogenous variables based on the following criterion: if the IVs are added as “auxiliary regressors” to the model, then we want their estimated coefficients (obtained as maximum likelihood estimates, while keeping the coefficients on the endogenous variables fixed) to be equal to zero or close to zero. This method is quite intuitive: it formalizes the idea that the IVs should be “excluded variables” that do not have any direct explanatory power for the outcome. After formally introducing this “auxiliary IV” estimator, we explore its properties through asymptotic theory and Monte Carlo simulations, and we apply it in two empirical illustrations. Most of the paper focuses on binary choice models, but our results also extend to other non-linear models.


It is common practice in empirical work to employ cluster-robust standard errors when using the linear regression model to estimate a structural or causal effect of interest. Researchers also often include a large set of regressors in their model specification in order to control for observed and unobserved confounders. In this paper we develop inference methods for linear regression models with many controls and clustering. We show that inference based on the usual cluster-robust standard errors of Liang and Zeger (1986) is invalid in general when the number of controls is a nonvanishing fraction of the sample size. We then propose a new clustered standard error formula that is robust to the inclusion of many controls and allows valid inference in a variety of high-dimensional linear regression models, including fixed effects panel data models and the semiparametric partially linear model. Monte Carlo evidence supports our theoretical results and shows that our proposed variance estimator performs well in finite samples. The proposed method is also illustrated with an empirical application that revisits Donohue III and Levitt’s (2001) study of the impact of abortion on crime.

Work in progress