Working Papers
Nonlinear Synthetic Control and Unconfoundness Approach [pdf] [appendix] [slides]
(2025.09.16 Updated)
Job market paper
Abstract: In this manuscript, we extend the synthetic control method to a nonlinear setting, where weights are chosen by matching both the outcome levels and their nonlinear transformations. We motivate the method by introducing a linear factor model in which the outcome level can depend directly and nonlinearly on its lagged values, which we refer to as a dynamic linear factor model. In this context, we derive an error bound for estimating counterfactual potential outcomes and demonstrate that this bound can be significantly reduced by matching nonlinear transformations of the pre-treatment outcome variables. Additionally, we link the synthetic control method to the machine learning literature by reinterpreting it as a maximum mean discrepancy (MMD) minimization problem, in which the feature map serves as the nonlinear transformation in our proposed method. As part of our theoretical investigation, we establish a dual relationship between the synthetic control problem and an unconfoundedness problem. Specifically, we show the asymptotic normality of the synthetic control estimate, even when the number of features increases with the sample size. We illustrate the performance of our proposed nonlinear synthetic control method in both simulation studies and an empirical example.
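As a rough illustration of matching on both levels and nonlinear transformations, the sketch below handles the simplest possible case: one treated unit and two donors, so the weight on the simplex has a closed form. The squared-outcome transformation, the function names, and the toy data are all assumptions made for illustration; they are not taken from the paper.

```python
def features(pre_outcomes):
    """Stack outcome levels with a nonlinear transformation.

    Squares are used here purely as an illustrative choice of transformation.
    """
    return list(pre_outcomes) + [y * y for y in pre_outcomes]


def two_donor_weight(treated, donor_a, donor_b):
    """Return w in [0, 1], the weight on donor_a (donor_b gets 1 - w).

    w minimizes the squared distance between the treated unit's features
    and the weighted average of the two donors' features.
    """
    t, a, b = features(treated), features(donor_a), features(donor_b)
    num = sum((ti - bi) * (ai - bi) for ti, ai, bi in zip(t, a, b))
    den = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    if den == 0.0:
        return 0.5  # donors coincide in feature space; every weight fits equally
    # Clipping the unconstrained minimizer is optimal for a 1-D convex quadratic.
    return min(1.0, max(0.0, num / den))


# Hypothetical pre-treatment outcome paths.
treated = [1.0, 2.0, 3.0]
donor_a = [0.0, 1.0, 2.0]
donor_b = [2.0, 3.0, 4.0]
w = two_donor_weight(treated, donor_a, donor_b)
```

With more than two donors, the same objective would typically be handled by a constrained quadratic programming solver rather than a closed form.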
Functional Synthetic Control [pdf] [appendix] (2025.11.04 Updated)
Abstract: This work generalizes the synthetic control method to settings where observations are distribution functions instead of scalars. We propose the functional synthetic control (FSC) method, which matches pretreatment quantile functions by minimizing distances in the functional space. We show that when the posttreatment outcome is a scalar, the functional synthetic control procedure has a dual functional predictor regression under ridge regression and OLS. We demonstrate the estimation procedure both using the whole distribution function and using Gaussian quadrature to approximate the model. When the true data-generating process is a functional predictor regression model, we provide an estimation error bound for the FSC estimator. Additionally, we outline a diagnostic permutation test for inference. Finally, we apply our approach to study the distributional effect of a minimum wage raise in Alaska in 2003.
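To give a sense of how Gaussian quadrature can stand in for a distance between quantile functions, here is a minimal sketch. It assumes the distance is the squared L2 distance between two quantile functions on [0, 1] and uses a three-point Gauss-Legendre rule; the paper's actual distance and quadrature order may differ, and all names below are hypothetical.

```python
import math

# Three-point Gauss-Legendre rule on [-1, 1]; exact for polynomials up to degree 5.
_GL_NODES = (-math.sqrt(3.0 / 5.0), 0.0, math.sqrt(3.0 / 5.0))
_GL_WEIGHTS = (5.0 / 9.0, 8.0 / 9.0, 5.0 / 9.0)


def quantile_l2_sq(q_a, q_b):
    """Approximate the integral over [0, 1] of (q_a(u) - q_b(u))**2."""
    total = 0.0
    for x, w in zip(_GL_NODES, _GL_WEIGHTS):
        u = 0.5 * (x + 1.0)  # map the node from [-1, 1] to [0, 1]
        total += 0.5 * w * (q_a(u) - q_b(u)) ** 2  # 0.5 is the Jacobian of the map
    return total


def uniform_quantile(low, high):
    """Quantile function of the uniform distribution on [low, high]."""
    return lambda u: low + (high - low) * u


# Illustrative inputs: quantile functions of Uniform(0, 1) and Uniform(0, 2).
# The integrand is u**2, so the rule reproduces the exact value 1/3
# up to floating-point rounding.
dist = quantile_l2_sq(uniform_quantile(0.0, 1.0), uniform_quantile(0.0, 2.0))
```

For quantile functions that are not low-degree polynomials, more nodes would be needed; `numpy.polynomial.legendre.leggauss` returns nodes and weights for higher-order rules.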
Optimal Subsidy Rule with Multiple Outcomes [pdf] [appendix] (2025.09.18 Updated)
Abstract: This paper considers the policy maker's problem of designing an optimal subsidy scheme to maximize the expected social welfare when multiple discrete choices are available in the market. Subsidies are assumed to affect social welfare by altering both the choice and the potential outcomes. The structural model consists of a discrete choice model for consumers' choices and a potential outcome framework for the realized outcome. A welfare analysis shows that the optimal subsidy scheme is weakly better than a direct assignment policy under a generalization of the "all-complier" assumption in the binary choice setting. A simulation shows the validity of our proposed method. Finally, the procedure is applied to decide the optimal scheme of a fertilizer subsidy program in Malawi.
In progress
Inference-optimal Local Polynomial Order Selection for Regression Discontinuity Designs [pdf]
Abstract: This manuscript proposes a new procedure for selecting the order of local polynomial fitting in regression discontinuity designs. The proposed method is designed to ensure that the resulting confidence interval has a coverage probability closest to the desired level. We review the procedures for bandwidth and order selection that minimize the mean squared error and the coverage probability error of the associated confidence interval, respectively. In particular, the bandwidth and polynomial order chosen for these two purposes may differ from each other. In a simulation, we demonstrate that our proposed method selects the polynomial order associated with confidence intervals that have proper coverage probability, compared to arbitrarily choosing the order to be linear or quadratic. This illustrates the optimality of our procedure in terms of inference.