Working Papers:

“Nonparametric Identification in Models for Dynamic Treatment Effects” [Draft coming soon!]

“Multiple Treatments with Strategic Interaction” (Latest Version: January 3, 2018.)


    We develop an empirical framework in which we identify and estimate the effects of treatments on outcomes of interest when the treatments are results of strategic interaction (e.g., bargaining, oligopolistic entry, decisions in the presence of peer effects). We consider a model where agents play a discrete game with complete information whose equilibrium actions (i.e., binary treatments) determine a post-game outcome in a nonseparable model with endogeneity. Due to the simultaneity in the first stage, the model as a whole is incomplete and the selection process fails to exhibit the conventional monotonicity. Without imposing parametric restrictions or large support assumptions, this poses challenges in recovering treatment parameters. To address these challenges, we first analytically characterize regions that predict equilibria in the first-stage game with possibly more than two players, whereby we find a certain monotonic pattern of these regions. Based on this finding, we derive bounds on the average treatment effects (ATE's) under nonparametric shape restrictions and the existence of excluded variables. We also introduce and point identify a multi-treatment version of local average treatment effects (LATE's).

“Estimation and Inference with a (Nearly) Singular Jacobian”* with Adam McCloskey (Latest Version: June 1, 2017. Revise and Resubmit, Quantitative Economics)

    This paper develops extremum estimation and inference results for nonlinear models with very general forms of potential identification failure when the source of this identification failure is known. We examine models that may have a general deficient rank Jacobian in certain parts of the parameter space. When identification fails in one of these models, it becomes under-identified and the identification status of individual parameters is not generally straightforward to characterize.  We provide a systematic reparameterization procedure that leads to a reparameterized model with straightforward identification status.  Using this reparameterization, we determine the asymptotic behavior of standard extremum estimators and Wald statistics under a comprehensive class of parameter sequences characterizing the strength of identification of the model parameters, ranging from non-identification to strong identification. Using the asymptotic results, we propose hypothesis testing methods that make use of a standard Wald statistic and data-dependent critical values, leading to tests with correct asymptotic size regardless of identification strength and good power properties. Importantly, this allows one to directly conduct uniform inference on low-dimensional functions of the model parameters, including one-dimensional subvectors. The paper illustrates these results in three examples: a sample selection model, a triangular threshold crossing model and a collective model for household expenditures.

* This paper is motivated by my earlier working paper titled “Identification and Inference in a Bivariate Probit Model With Weak Instruments” (2009). (Slides for that paper are available upon request.)

“Sensitivity Analysis in Triangular Systems of Equations with Binary Endogenous Variables” with Sungwon Lee (Latest Version: November 19, 2017.)

    This paper considers parametric/semiparametric estimation and inference in a class of bivariate threshold crossing models with dummy endogenous variables. We investigate the consequences of common practices employed by empirical researchers using this class of models, such as specifying the joint distribution of the unobservables as bivariate normal, which results in a bivariate probit model. To address the problem of misspecification, we propose a semiparametric estimation framework with a parametric copula and nonparametric marginal distributions. This specification is an attempt to ensure robustness while achieving point identification and efficient estimation. We establish asymptotic theory, including root-n normality, for the sieve maximum likelihood estimators that can be used to conduct inference on the individual structural parameters and the average treatment effects. Numerical studies suggest that estimates are sensitive to the parametric specification, whereas semiparametric estimation is robust. This paper also shows that the absence of excluded instruments may result in the failure of identification, contrary to what some practitioners believe.

“Nonparametric Estimation of Triangular Simultaneous Equations Models under Weak Identification” (Latest Version: September 10, 2017. Submitted)

        - Supplemental Materials (Latest Version: September 10, 2017.)

        - Matlab Codes (Latest Version: February 2017.)

    This paper analyzes the problem of weak instruments on identification, estimation, and inference in a simple nonparametric model of a triangular system. The paper derives a necessary and sufficient rank condition for identification, based on which weak identification is established. Then nonparametric weak instruments are defined as a sequence of reduced form functions where the associated rank shrinks to zero. The problem of weak instruments is characterized as similar to the ill-posed inverse problem, which motivates the introduction of a regularization scheme. The paper proposes a penalized series estimation method to alleviate the effects of weak instruments. The rate of convergence of the resulting estimator is given, and it is shown that weak instruments slow down the rate and penalization yields a faster rate. Consistency and asymptotic normality results are also derived. Monte Carlo results are presented, and an empirical example is given, where the effect of class size on test scores is estimated nonparametrically.

“Censored quantile instrumental variable estimation with Stata” with Victor Chernozhukov, Ivan Fernandez-Val, and Amanda Kowalski (Latest Version: January 2018. Submitted)
        - Stata Code Download: CQIV Stata ado file, CQIV Stata help file (RePEc link)

    Many applications involve a censored dependent variable and an endogenous independent variable. Chernozhukov et al. (2015) introduced a censored quantile instrumental variable estimator (CQIV) for use in those applications, which has been applied by Kowalski (2016), among others. In this article, we introduce a Stata command, cqiv, that simplifies application of the CQIV estimator in Stata. We summarize the CQIV estimator and algorithm, we describe the use of the cqiv command, and we provide empirical examples.


Publications:

“Identification in a Generalization of Bivariate Probit Models with Dummy Endogenous Regressors” with Edward Vytlacil, The Journal of Econometrics (2017), Volume 199, pp. 63-73.

    This paper provides identification results for a class of models specified by a triangular system of two equations with binary endogenous variables. The joint distribution of the latent error terms is specified through a parametric copula structure that satisfies a particular dependence ordering, while the marginal distributions are allowed to be arbitrary but known. This class of models is broad and includes bivariate probit models as a special case. The paper demonstrates that having an exclusion restriction is necessary and sufficient for global identification in a model without common exogenous covariates, where the excluded variable is allowed to be binary. Having an exclusion restriction is sufficient in models with common exogenous covariates that are present in both equations. The paper then extends the identification analyses to a model where the marginal distributions of the error terms are unknown.
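    For concreteness, a triangular system of the kind described above can be written as follows. The notation is an illustrative assumption rather than a quotation from the paper: $Y$ is the binary outcome, $D$ the binary endogenous variable, $X$ the common exogenous covariates (absent in the model without common covariates), $Z$ the excluded variable, and $C(\cdot,\cdot;\rho)$ a parametric copula with dependence parameter $\rho$.

```latex
\begin{align*}
Y &= \mathbf{1}\{X'\beta + \delta D - \varepsilon \ge 0\},\\
D &= \mathbf{1}\{X'\alpha + Z'\gamma - \nu \ge 0\},\\
F_{\varepsilon\nu}(e, v) &= C\bigl(F_{\varepsilon}(e),\, F_{\nu}(v);\, \rho\bigr).
\end{align*}
```

    In this notation, the bivariate probit model is the special case in which $C$ is the Gaussian copula and $F_{\varepsilon}$ and $F_{\nu}$ are both standard normal distribution functions.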

“How Would Information Disclosure Influence Organizations' Outbound Spam Volume? Evidence from a Field Experiment” with Shu He, Gene Moo Lee, and Andy Whinston, Journal of Cybersecurity (2016), Volume 2, pp. 99-118.

    Cyber-insecurity is a serious threat in the digital world. In the present paper, we argue that a suboptimal cybersecurity environment is partly due to organizations’ underinvestment in security and a lack of suitable policies. The motivation for this paper stems from a related policy question: how to design policies for governments and other organizations that can ensure a sufficient level of cybersecurity. We address the question by exploring a policy devised to alleviate information asymmetry and to achieve transparency in cybersecurity information sharing practice. We propose a cybersecurity evaluation agency along with regulations on information disclosure. To empirically evaluate the effectiveness of such an institution, we conduct a large-scale randomized field experiment on 7919 US organizations. Specifically, we generate organizations’ security reports based on their outbound spam relative to the industry peers, then share the reports with the subjects in either private or public ways. Using models for heterogeneous treatment effects and machine learning techniques, we find evidence from this experiment that the security information sharing combined with publicity treatment has significant effects on spam reduction for original large spammers. Moreover, significant peer effects are observed among industry peers after the experiment.

“Invalidity of the Bootstrap and the m out of n Bootstrap for Confidence Interval Endpoints Defined by Moment Inequalities” with Donald Andrews, Econometrics Journal (2009), Volume 12, pp. S172–S199.

    This paper analyses the finite-sample and asymptotic properties of several bootstrap and m out of n bootstrap methods for constructing confidence interval (CI) endpoints in models defined by moment inequalities. In particular, we consider using these methods directly to construct CI endpoints. By considering two very simple models, the paper shows that neither the bootstrap nor the m out of n bootstrap is valid in finite samples or in a uniform asymptotic sense in general when applied directly to construct CI endpoints.
    In contrast, other results in the literature show that other ways of applying the bootstrap, m out of n bootstrap, and subsampling do lead to uniformly asymptotically valid confidence sets in moment inequality models. Thus, the uniform asymptotic validity of resampling methods in moment inequality models depends on the way in which the resampling methods are employed.
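    As a point of reference, the m out of n bootstrap resamples m < n observations with replacement and uses the resulting draws to approximate the sampling distribution of an estimator. The sketch below is a minimal illustration for a sample mean only; it is not the moment inequality setting analyzed in the paper, and the function name `m_out_of_n_ci`, its defaults, and the choice of a root-based interval are assumptions made for illustration.

```python
import random
import statistics


def m_out_of_n_ci(data, m, reps=2000, alpha=0.05, seed=0):
    """Illustrative m out of n bootstrap CI for a mean (hypothetical helper)."""
    rng = random.Random(seed)          # fixed seed so the sketch is reproducible
    n = len(data)
    theta_hat = statistics.fmean(data)

    # Bootstrap draws of the root sqrt(m) * (theta*_m - theta_hat),
    # where theta*_m is the mean of a size-m resample drawn with replacement.
    roots = []
    for _ in range(reps):
        resample = rng.choices(data, k=m)
        roots.append(m ** 0.5 * (statistics.fmean(resample) - theta_hat))
    roots.sort()

    # Empirical quantiles of the bootstrap root.
    lo_q = roots[int((alpha / 2) * reps)]
    hi_q = roots[min(reps - 1, int((1 - alpha / 2) * reps))]

    # Invert the root at the full-sample scale sqrt(n).
    return theta_hat - hi_q / n ** 0.5, theta_hat - lo_q / n ** 0.5
```

    For example, `m_out_of_n_ci(data, m=25)` on a sample of 100 observations returns a lower and an upper endpoint. The paper's point is that applying this kind of procedure directly to CI endpoints in moment inequality models need not be valid, so the sketch shows the resampling mechanics rather than a recommended procedure for that setting.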