Presented in a lightning round session at the 2025 ASSA Annual Meeting
We study external validity in the context of instrumental variable estimation. The key assumption we impose for external validity is conditional external unconfoundedness among compliers: conditional on covariates, the treatment effect and target selection are independent among compliers. We examine this assumption in a case study of the impact of solid fuel usage on women’s average cooking time. Among the six countries examined, we find no statistical evidence that the assumptions for external validity are violated for four countries (Ethiopia, Honduras, Kenya, and Zambia). In contrast, we find low external validity for Cambodia and Nepal, which provides suggestive evidence that the assumptions for external validity are violated for these two countries.
External validity, Instrumental variable, Generalization, Prediction, Solid fuel impact, Time usage, Developing countries
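As a reading aid, the assumption described in the abstract above can be written in potential-outcomes notation. The symbols here are our own illustrative choices and may differ from the paper's: Y(1) and Y(0) are potential outcomes, S is a target-selection indicator, X denotes covariates, and D(z) is the treatment taken when the instrument equals z, so that compliers are units with D(1) > D(0).

```latex
\[
  \bigl( Y(1) - Y(0) \bigr) \;\perp\!\!\!\perp\; S \;\bigm|\; X,\; D(1) > D(0)
\]
```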
Using data from the Multi-Tier Framework Survey (MTF) conducted in Nepal, we explore how geographical factors, specifically land slope and elevation, affect the adoption of LPG stoves. We generate slope and elevation data and employ a logit model to analyze the factors influencing households’ LPG stove adoption. Overall, we find that the estimates for average slope are robust and statistically significant, whereas those for average elevation are sensitive to model specification and smaller than those for slope. We find that the Kathmandu region is important in the analysis, and that slope and elevation have nonlinear effects. Additionally, we show that geographical factors are similarly important across household expenditure quintile groups, except for the lowest.
Nepal, Geographical factors, Slope and elevation, Household fuel choice, LPG stoves, Logistic regression
Presented at Applied Economics Seminar at Graduate Center, CUNY (Nov 2023)
We demonstrate that machine learning substantially improves predictions of individual decisions about retirement and Social Security (SS) claims. When predicting the number of people receiving SS, we achieve an error of less than 1%, while the benchmark model employed by the Social Security Administration (SSA) results in an error greater than 4%. In forecasting SS claiming decisions, we attain an error of 0.2%, while the benchmark error exceeds 2%. Based on averages, we show that the 3% difference in prediction amounts to 39.6 billion dollars annually. The set of important variables selected by our model differs significantly from that of the SSA model. We use Shapley values to evaluate the non-linear contributions of the selected variables to predictive outcomes.
Retirement and Social Security, Machine learning, Prediction, Gradient boosted tree, Shapley values
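To illustrate the idea behind the Shapley-value attribution mentioned in the abstract above, here is a minimal sketch that computes exact Shapley values by enumerating feature subsets. It is not the paper's pipeline: the toy predictor, its feature names (`age`, `wealth`, `health`), and the all-zero baseline are hypothetical, and exact enumeration is only feasible for a handful of features, whereas tree models are typically explained with specialized approximations.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict at x, relative to a baseline input.

    Features outside a coalition are set to their baseline value.
    """
    n = len(x)
    features = range(n)

    def value(subset):
        # Evaluate the model with only the features in `subset` "switched on".
        z = [x[i] if i in subset else baseline[i] for i in features]
        return predict(z)

    phi = []
    for i in features:
        others = [j for j in features if j != i]
        total = 0.0
        for k in range(n):  # coalition sizes 0 .. n-1
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for s in combinations(others, k):
                s = set(s)
                # Marginal contribution of feature i to coalition s.
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_predict(z):
    # Hypothetical non-linear predictor: age and health interact.
    age, wealth, health = z
    return 0.5 * age + 0.2 * wealth + 0.3 * age * health


phi = shapley_values(toy_predict, [1.0, 2.0, 1.0], [0.0, 0.0, 0.0])
print(phi)
```

The attributions sum to the difference between the prediction at `x` and at the baseline, and the interaction term is split equally between the two interacting features.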
In this paper, I approximate the solution of a discrete-continuous dynamic choice model using deep neural networks. The model of interest is an overlapping generations (OLG) model in which the individual makes two decisions: how much to consume and when to retire. Overall, the deep neural networks designed in this paper approximate the analytic solutions for consumption and the retirement probability relatively well. However, the approximation is less accurate at the kink and discontinuity points, especially for the individual of age 1.
Machine learning, Deep learning, Neural network, Overlapping generations model, Discrete and continuous choice
I estimate the job-switching cost of older workers using a dynamic discrete choice framework. I find that holding either a DB or DC pension significantly increases the switching cost relative to workers without pension coverage. Using the estimates of the structural parameters, I calculate that a DC pension increases the switching cost by $98 in hourly wages, while a DB pension increases it by approximately $120. Given that the mean hourly wage in the worker sample is $14.5 and the maximum in the total sample is around $229, these increases are substantial and create barriers to older workers’ job mobility.
Accepted for a poster session at the 2025 ASSA Annual Meeting
This study empirically analyzes the impact of policy loan programs on the productivity of small and medium-sized enterprises (SMEs). Using firm-level panel data from South Korea between 2013 and 2022, we estimate the treatment effects of subsidies on two distinct productivity measures: total factor productivity (TFP) estimated via the Wooldridge (2009) method and its persistent component. To address methodological challenges arising from staggered treatment adoption, we employ the robust Difference-in-Differences estimator proposed by Callaway and Sant'Anna (2021). Furthermore, we use a Generalized Random Forest model to uncover heterogeneous treatment effects. We find a positive but delayed and increasing effect of subsidies on firm productivity. In contrast, we do not find a significant effect of subsidies on the persistent component of productivity, even though the point estimates are positive and increasing. These results suggest caution in interpreting subsidies as having a positive effect on sustainable long-run growth. In addition, we identify significant heterogeneity in treatment effects. The effects on TFP are greatest for financially constrained firms, which are often younger, smaller firms with "thin" information profiles, highlighting the need for policies targeted at such firms. The heterogeneity in persistent productivity is even more pronounced, implying that a "one-size-fits-all" approach is insufficient for long-term impact. Instead, targeted and selective support, potentially combining financial aid with business consulting, is necessary to foster sustainable SME growth.
Firm-level productivity, Financial constraints, Difference-in-differences with multiple treatment periods, Generalized Random Forest, SME
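To make the staggered-adoption setting in the abstract above more concrete, the sketch below computes a simplified group-time average treatment effect, ATT(g, t), by comparing the outcome change of a treatment cohort since period g-1 with the change among never-treated units. This is only the unconditional building block of the Callaway and Sant'Anna (2021) approach, without covariates, inference, or aggregation; the toy panel, firm labels, and numbers are invented for illustration and are not the paper's data or results.

```python
from statistics import mean

# Hypothetical toy panel: firm -> (first treated year or None, {year: productivity}).
# None marks a never-treated firm. All numbers are made up.
panel = {
    "A": (2015, {2014: 1.0, 2015: 1.1, 2016: 1.4}),
    "B": (2015, {2014: 2.0, 2015: 2.1, 2016: 2.6}),
    "C": (None, {2014: 1.5, 2015: 1.6, 2016: 1.7}),
    "D": (None, {2014: 0.5, 2015: 0.6, 2016: 0.7}),
}


def att_gt(panel, g, t):
    """Simplified ATT(g, t): cohort g versus never-treated units, base period g-1."""
    base = g - 1
    # Outcome change since the last pre-treatment period, for the treated cohort.
    treated = [y[t] - y[base] for cohort, y in panel.values() if cohort == g]
    # Same change for never-treated firms, used as the counterfactual trend.
    control = [y[t] - y[base] for cohort, y in panel.values() if cohort is None]
    return mean(treated) - mean(control)


for year in (2015, 2016):
    print(year, round(att_gt(panel, 2015, year), 3))
```

Computing one effect per cohort and period avoids using already-treated firms as controls, which is the main concern with staggered adoption.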