1. "Inference under Covariate-Adaptive Randomization with Imperfect Compliance," with Federico Bugni (2023), Journal of Econometrics, vol. 237 (1). [Paper] [Arxiv]
2. "Uniform Nonparametric Inference for Time Series using Stata," with Jia Li and Zhipeng Liao (2020), The Stata Journal, vol. 20, pp. 706-720. [Paper]
1. "Endogenous Interference in Randomized Experiments." [Link] [Arxiv]
Abstract: This paper studies identification of and inference on treatment effects in randomized controlled trials with social interactions. Two key network features introduce endogeneity: (1) latent variables influencing both network formation and outcomes, and (2) treatment-induced changes to the network structure that mediate treatment effects. I first define parameters in a posttreatment network framework, distinguishing direct effects from indirect effects mediated by network changes, and provide a causal interpretation of coefficients in a linear outcome model. To address endogeneity, I propose a shift-share instrumental variable (SSIV) strategy and establish consistency and asymptotic normality of the IV estimator in relatively sparse networks. For denser networks, I introduce a denoised SSIV estimator based on eigendecomposition to restore consistency. Finally, I revisit Prina (2015) as an empirical illustration, showing that treatment can influence outcomes both directly and through network structure changes.
2. "Causal Inference in Network Experiments: Regression-based Analysis and Design-based Properties," with Peng Ding. Revise & Resubmit at Journal of Econometrics. [Arxiv]
Abstract: Network experiments are powerful tools for studying spillover effects because they avoid endogeneity by randomly assigning treatments to units over networks. However, it is non-trivial to analyze network experiments properly without imposing strong modeling assumptions. We show that regression-based point estimators and standard errors can have strong theoretical guarantees if the regression functions and robust standard errors are carefully specified to accommodate the interference patterns under network experiments. We first recall a well-known result that the Hájek estimator is numerically identical to the coefficient from the weighted-least-squares fit based on the inverse probability of the exposure mapping. Moreover, we demonstrate that the regression-based approach offers three notable advantages: its ease of implementation, the ability to derive standard errors through the same regression fit, and the potential to integrate covariates into the analysis to improve efficiency. Recognizing that the regression-based network-robust covariance estimator can be anti-conservative under nonconstant effects, we propose an adjusted covariance estimator to improve the empirical coverage rates.
3. "Identification and Inference on Treatment Effects under Covariate-Adaptive Randomization and Imperfect Compliance," with Federico Bugni, Filip Obradovic, and Amilcar Velez. Submitted. [Arxiv]
Abstract: Randomized controlled trials (RCTs) frequently utilize covariate-adaptive randomization (CAR) (e.g., stratified block randomization) and commonly suffer from imperfect compliance. This paper studies identification and inference for the average treatment effect (ATE) and the average treatment effect on the treated (ATT) in such RCTs with a binary treatment.
We first develop characterizations of the identified sets for both estimands. Since data are generally not i.i.d. under CAR, these characterizations do not follow from existing results. We then provide consistent estimators of the identified sets and asymptotically valid confidence intervals for the parameters. Our asymptotic analysis leads to concrete practical recommendations regarding how to estimate the treatment assignment probabilities that enter the estimated bounds. For the ATE bounds, using sample analog assignment frequencies is more efficient than relying on the true assignment probabilities. For the ATT bounds, the most efficient approach is to use the true assignment probability for the probabilities in the numerator and the sample analog for those in the denominator.
4. "On the Power Properties of Inference for Parameters with Interval Identified Sets," with Federico Bugni, Filip Obradovic, and Amilcar Velez. Revise & Resubmit at Econometric Theory. [Arxiv]
Abstract: This paper studies the power properties of confidence intervals (CIs) for a partially-identified parameter of interest with an interval identified set. We assume the researcher has bounds estimators to construct the CIs proposed by Stoye (2009), referred to as CI1, CI2, and CI3. We also assume that these estimators are "ordered": the lower bound estimator is less than or equal to the upper bound estimator.
Under these conditions, we establish two results. First, we show that CI1 and CI2 are equally powerful, and both dominate CI3. Second, we consider a favorable situation in which there are two possible bounds estimators to construct these CIs, and one is more efficient than the other. One would expect that the more efficient bounds estimator yields more powerful inference. We prove that this desirable result holds for CI1 and CI2, but not necessarily for CI3.
1. "Misspecified Regressions with Mixed Regressors," with Peng Ding.