Research

Working Papers:


“Estimation and Inference with a (Nearly) Singular Jacobian”* with Adam McCloskey (Latest Version: June 18, 2016. Submitted)


    This paper develops extremum estimation and inference results for nonlinear models with very general forms of potential identification failure when the source of this identification failure is known. We examine models that may have a general deficient rank Jacobian in certain parts of the parameter space. When identification fails in one of these models, it becomes under-identified and the identification status of individual parameters is not generally straightforward to characterize.  We provide a systematic reparameterization procedure that leads to a reparameterized model with straightforward identification status.  Using this reparameterization, we determine the asymptotic behavior of standard extremum estimators and Wald statistics under a comprehensive class of parameter sequences characterizing the strength of identification of the model parameters, ranging from non-identification to strong identification. Using the asymptotic results, we propose hypothesis testing methods that make use of a standard Wald statistic and data-dependent critical values, leading to tests with correct asymptotic size regardless of identification strength and good power properties. Importantly, this allows one to directly conduct uniform inference on low-dimensional functions of the model parameters, including one-dimensional subvectors. The paper illustrates these results in three examples: a sample selection model, a triangular threshold crossing model and a collective model for household expenditures.

* This paper is motivated by my earlier working paper titled “Identification and Inference in a Bivariate Probit Model With Weak Instruments” (2009). (Slides for the latter paper are available upon request.)




“Multiple Treatments with Strategic Interaction” [Draft coming soon!]

        

    We develop an empirical framework in which we identify and estimate the effects of treatments on a particular outcome when the treatments result from strategic interaction. We consider a model where agents play a discrete game with complete information whose equilibrium actions (i.e., binary treatments) determine an outcome of interest in a nonseparable model with endogeneity. Due to the multiplicity of equilibria in the first stage, the model as a whole is incomplete. Without imposing parametric restrictions or large support assumptions, we partially identify the average treatment effects (ATEs). Excluded variables and nonparametric shape restrictions on the outcome function and payoff functions enable us to derive tight bounds. With an additional assumption that excluded variables have a rectangular support, we derive sharp bounds. Point identification is achieved when excluded instruments have full support.



“Nonparametric Estimation of Triangular Simultaneous Equations Models under Weak Identification” (Latest Version: October 6, 2015. Revise and Resubmit, The Journal of Econometrics)

        - Matlab Code Download: ZIP file (Latest Version: March 2014.)


    This paper analyzes the problem of weak instruments on identification, estimation, and inference in a simple nonparametric model of a triangular system. The paper derives a necessary and sufficient rank condition for identification, based on which weak identification is established. Then nonparametric weak instruments are defined as a sequence of reduced form functions where the associated rank shrinks to zero. The problem of weak instruments is characterized as similar to the ill-posed inverse problem, which motivates the introduction of a regularization scheme. The paper proposes a penalized series estimation method to alleviate the effects of weak instruments. The rate of convergence of the resulting estimator is given, and it is shown that weak instruments slow down the rate while penalization yields a faster rate. Consistency and asymptotic normality results are also derived. Monte Carlo results are presented, and an empirical example is given, where the effect of class size on test scores is estimated nonparametrically.




“Identification in a Generalization of Bivariate Probit Models with Dummy Endogenous Regressors” with Edward Vytlacil (Latest Version: August 25, 2016. Revise and Resubmit, The Journal of Econometrics)

    This paper provides identification results for a class of models specified by a triangular system of two equations with binary endogenous variables. The joint distribution of the latent error terms is specified through a parametric copula structure that satisfies a particular dependence ordering, while the marginal distributions are allowed to be arbitrary but known. This class of models is broad and includes bivariate probit models as a special case. The paper demonstrates that having an exclusion restriction is necessary and sufficient for global identification in a model without common exogenous covariates, where the excluded variable is allowed to be binary. Having an exclusion restriction is sufficient in models with common exogenous covariates that are present in both equations. The paper then extends the identification analyses to a model where the marginal distributions of the error terms are unknown.



“CQIV: Stata Module to Perform Censored Quantile Instrumental Variable Regression” with Victor Chernozhukov, Ivan Fernandez-Val, and Amanda Kowalski (Latest Version: June 2012.)
        - Stata Code Download: CQIV Stata ado file, CQIV Stata help file



“How Would Information Disclosure Influence Organizations' Outbound Spam Volume? Evidence from a Field Experiment” with Shu He, Gene Moo Lee, and Andy Whinston (Latest Version: July 1, 2016. Forthcoming, Journal of Cybersecurity)


    Cyber-insecurity is a serious threat in the digital world. In the present paper, we argue that a suboptimal cybersecurity environment is partly due to organizations' underinvestment and a lack of suitable policies. The motivation for this paper stems from a related policy question: how to design policies for governments and other organizations that can ensure a sufficient level of cybersecurity. We address the question by exploring a policy devised to alleviate information asymmetry and to achieve transparency in cybersecurity information sharing practice. We introduce a cybersecurity evaluation agency along with regulations on information disclosure. To empirically evaluate the effectiveness of such an institution, we conduct a large-scale randomized field experiment on 7,919 U.S. organizations. Specifically, we generate organizations' security reports based on their outbound spam relative to the industry peers, then share the reports with the subjects in either private or public ways. Using models for heterogeneous treatment effects, we find evidence that the security information sharing combined with publicity treatment has significant effects on spam reduction for original large spammers. Moreover, significant peer effects are observed among industry peers after the experiment.



Work in Progress:


“Sensitivity Analysis in Triangular Systems of Equations with Binary Endogenous Variables” with Sungwon Lee




Publications:


“Invalidity of the Bootstrap and the m out of n Bootstrap for Confidence Interval Endpoints Defined by Moment Inequalities” with Donald Andrews, Econometrics Journal (2009), Volume 12, pp. S172–S199.


    This paper analyses the finite-sample and asymptotic properties of several bootstrap and m out of n bootstrap methods for constructing confidence interval (CI) endpoints in models defined by moment inequalities. In particular, we consider using these methods directly to construct CI endpoints. By considering two very simple models, the paper shows that neither the bootstrap nor the m out of n bootstrap is valid in finite samples or in a uniform asymptotic sense in general when applied directly to construct CI endpoints.
    In contrast, other results in the literature show that other ways of applying the bootstrap, m out of n bootstrap, and subsampling do lead to uniformly asymptotically valid confidence sets in moment inequality models. Thus, the uniform asymptotic validity of resampling methods in moment inequality models depends on the way in which the resampling methods are employed.