I estimate wage‑equivalent job‑switching costs for older U.S. workers and show that pension coverage substantially increases these frictions. Using panel data from the Health and Retirement Study for individuals aged 50–75 and a dynamic discrete‑choice model with unobserved heterogeneity, I find that, relative to workers without pensions, defined‑contribution (DC) coverage raises switching costs by 8% of the average hourly wage and defined‑benefit (DB) coverage by 20%. During the COVID-19 period, DB holders incurred an additional $1,800 per year in switching costs, while DC costs were unchanged, implying lower responsiveness to outside offers among DB holders. Partial‑equilibrium counterfactuals show that cutting pension‑related frictions by 50% raises job‑to‑job mobility by about 85% for DB holders and 28% for DC holders and modestly delays retirement. Aligning DB frictions to DC levels produces similar gains for DB holders. Implied wage‑equivalent compensating variation averages $50–$240 per worker‑year across reforms. Results are robust to alternative state‑space and control specifications and to alternative sample periods.
Job mobility, Switching costs, Pensions, Dynamic discrete choice, Older workers.
Presented at Applied Economics Seminar at Graduate Center, CUNY (Nov 2023)
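As a stylized illustration of how a wage-equivalent switching cost dampens job-to-job mobility, the sketch below uses a static binary logit. This is a deliberate simplification and not the dynamic discrete-choice model with unobserved heterogeneity described in the abstract; the function name `switch_probability` and every number are hypothetical.

```python
import math

def switch_probability(wage_gain, switching_cost, scale=1.0):
    """Static binary-logit probability of accepting an outside offer.

    wage_gain and switching_cost share the same wage-equivalent units;
    scale is the dispersion of the taste shocks. Illustrative only.
    """
    net_gain = (wage_gain - switching_cost) / scale
    return 1.0 / (1.0 + math.exp(-net_gain))

# Hypothetical inputs, not estimates from the paper.
wage_gain = 1.0
base_cost = 3.0      # switching cost faced by a worker without a pension
pension_cost = 1.0   # extra friction attributed to pension coverage

# Mobility under the full pension friction and under a 50% cut in it.
p_full = switch_probability(wage_gain, base_cost + pension_cost)
p_half = switch_probability(wage_gain, base_cost + 0.5 * pension_cost)

print(f"mobility with full pension friction: {p_full:.4f}")
print(f"mobility with friction cut by 50%:   {p_half:.4f}")
print(f"relative change in mobility: {100 * (p_half / p_full - 1):.1f}%")
```

Because switching is a low-probability event in this toy setup, a modest cut in the cost produces a large proportional rise in mobility, which is the qualitative mechanism behind the counterfactuals.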
We demonstrate that machine learning substantially improves predictions of individual decisions about retirement and Social Security (SS) claims. When predicting the number of people receiving SS, we achieve an error of less than 1%, while the benchmark model employed by the Social Security Administration (SSA) has an error greater than 4%. In forecasting SS claiming decisions, we attain an error of 0.2%, while the benchmark error exceeds 2%. Based on averages, we show that the 3% difference in prediction amounts to 39.6 billion dollars annually. The set of important variables selected by our model differs significantly from that of the SSA model. We use Shapley values to evaluate the non-linear contributions of the selected variables to predictive outcomes.
Retirement and Social Security, Machine learning, Prediction, Gradient boosted tree, Shapley values
Presented in a lightning round session at the 2025 ASSA Annual Meeting
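To make the Shapley-value idea concrete, here is a minimal sketch that computes exact Shapley values by enumerating all feature orderings for a toy nonlinear function. It is an assumption-laden illustration: the paper works with gradient boosted trees, for which tree-specific approximations are normally used, and `toy_model`, its coefficients, and the feature names are hypothetical.

```python
from itertools import permutations

def shapley_values(value_fn, x, baseline):
    """Exact Shapley values: average marginal contribution over all orderings.

    Features that are not yet "switched on" are held at their baseline values.
    The cost grows factorially in the number of features, so this is only
    practical for toy examples.
    """
    n = len(x)
    phi = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(baseline)
        prev = value_fn(current)
        for j in order:
            current[j] = x[j]          # switch feature j to its observed value
            new = value_fn(current)
            phi[j] += new - prev       # marginal contribution of feature j
            prev = new
    return [p / len(orderings) for p in phi]

# Hypothetical nonlinear score; not the model estimated in the paper.
def toy_model(z):
    age, health, wealth = z
    return 0.5 * age + 0.3 * health * wealth + 0.1 * wealth ** 2

x = [2.0, 1.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)

for name, value in zip(["age", "health", "wealth"], phi):
    print(f"{name}: {value:.3f}")
```

The contributions sum to the gap between the prediction at `x` and the prediction at the baseline, and the interaction term is split evenly between the two features involved.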
We study external validity within the context of instrumental variable estimation. The key assumption we impose for external validity is conditional external unconfoundedness among compliers, which means that the treatment effect and target selection are independent among compliers conditional on covariates. We study this assumption by using a case study about the impact of solid fuel usage on women’s average cooking time. Among the six countries examined, we find no statistical evidence that the assumptions for external validity are violated for four countries (Ethiopia, Honduras, Kenya, and Zambia). Conversely, in Cambodia and Nepal, we find low external validity. These results provide suggestive evidence that the assumptions for external validity are violated for these two countries.
External validity, Instrumental variable, Generalization, Prediction, Solid fuel impact, Time usage, Developing countries
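For readers less familiar with what an IV estimate identifies, the following sketch computes the Wald estimator of the local average treatment effect for a binary instrument and binary treatment. The data are fabricated purely for illustration (they are not from the six-country case study), and the function name `wald_late` is hypothetical.

```python
from statistics import mean

def wald_late(y, d, z):
    """Wald estimator: reduced-form effect divided by first-stage effect."""
    y1 = [yi for yi, zi in zip(y, z) if zi == 1]
    y0 = [yi for yi, zi in zip(y, z) if zi == 0]
    d1 = [di for di, zi in zip(d, z) if zi == 1]
    d0 = [di for di, zi in zip(d, z) if zi == 0]
    first_stage = mean(d1) - mean(d0)
    if first_stage == 0:
        raise ValueError("instrument does not shift treatment")
    return (mean(y1) - mean(y0)) / first_stage

# Illustrative toy data: z is an unspecified binary instrument,
# d indicates solid fuel use, y is cooking time (arbitrary units).
z = [0, 0, 0, 0, 1, 1, 1, 1]
d = [0, 0, 0, 1, 1, 1, 0, 1]
y = [10, 10, 10, 12, 12, 12, 10, 12]

late = wald_late(y, d, z)
print(f"estimated LATE: {late:.2f}")
```

The estimate is an average effect for compliers only, which is why carrying it over to a target population requires an assumption such as conditional external unconfoundedness among compliers.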
Using data from the Multi-Tier Framework (MTF) Survey conducted in Nepal, we explore how geographical factors, specifically land slope and elevation, affect the adoption of LPG stoves. We generate slope and elevation data and employ a logit model to analyze the factors influencing households’ LPG stove adoption. Overall, we find that the estimates for average slope are robust and statistically significant, whereas those for average elevation are sensitive to model specification and smaller in magnitude than those for slope. We find that the Kathmandu region is important in the analysis and that slope and elevation have nonlinear effects. Additionally, we show that geographical factors are similarly important across household expenditure quintile groups, except for the lowest.
Nepal, Geographical factors, Slope and elevation, Household fuel choice, LPG stoves, Logistic regression
Dynamic models with discrete choices are widely used in economics but are often hard to solve and estimate in empirically realistic settings. Discrete actions make the problem non-convex and induce non-smooth decision rules and value functions, while high-dimensional state vectors make traditional grid-based dynamic programming infeasible. This review surveys computational advances for dynamic discrete and discrete–continuous choice models, organized around the key bottlenecks in solving and estimating these models. On the solution side, we introduce modern machine learning approaches that help solution methods scale to high-dimensional state spaces. On the estimation side, we summarize benchmark full-solution and two-step conditional-choice-probability methods and emphasize post-2010 developments that reduce the computational burden, including constraint-based and Euler-type formulations, continuous-time reformulations, approximation/aggregation strategies, and machine-learning-assisted approaches.
Dynamic discrete choice, Discrete–continuous dynamic programming, Machine learning, Computational econometrics
Accepted for a poster session at the 2025 ASSA Annual Meeting
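As a small illustration of the grid-based dynamic programming that the review takes as its starting point, the sketch below solves a toy keep-or-replace problem with logit (type-1 extreme value) shocks by iterating on the integrated value function. The model, the parameter values, and the function name `solve_ddc` are hypothetical and are not taken from the review.

```python
import math

def solve_ddc(n_states=10, theta=0.3, replace_cost=4.0, beta=0.9,
              p_up=0.6, tol=1e-10, max_iter=10_000):
    """Value iteration for a toy keep/replace model with logit shocks.

    State s is a wear level. Keeping costs theta * s and wear rises by one
    with probability p_up (capped at the top state); replacing costs
    replace_cost and resets the next-period state to 0. The constant mean of
    the extreme value shocks is dropped, which shifts all values equally and
    leaves choice probabilities unchanged.
    """
    V = [0.0] * n_states
    for _ in range(max_iter):
        V_new, replace_probs = [], []
        for s in range(n_states):
            s_up = min(s + 1, n_states - 1)
            v_keep = -theta * s + beta * (p_up * V[s_up] + (1 - p_up) * V[s])
            v_replace = -replace_cost + beta * V[0]
            # Numerically stable log-sum-exp of the two choice-specific values.
            m = max(v_keep, v_replace)
            lse = m + math.log(math.exp(v_keep - m) + math.exp(v_replace - m))
            V_new.append(lse)
            replace_probs.append(math.exp(v_replace - lse))
        if max(abs(a - b) for a, b in zip(V_new, V)) < tol:
            return V_new, replace_probs
        V = V_new
    raise RuntimeError("value iteration did not converge")

V, replace_probs = solve_ddc()
for s, p in enumerate(replace_probs):
    print(f"state {s}: P(replace) = {p:.3f}")
```

With a single state variable the loop is cheap, but the number of grid points grows exponentially with the dimension of the state vector, which is the bottleneck that the approximation and machine-learning methods surveyed above are meant to relieve.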
We study the impact of policy loan programs on the productivity of small and medium enterprises (SMEs) in South Korea from 2013 to 2022. Using a robust Difference-in-Differences estimator for average treatment effects (ATE) and a Generalized Random Forest for conditional ATE, we focus on two productivity measures: total factor productivity (TFP) and its persistent component. We find that subsidies have a positive and increasing, but delayed, effect on TFP. We also find a positive and increasing, but statistically insignificant, effect on the persistent component of productivity, which warrants a cautious interpretation of the policy's ability to encourage sustainable long-run growth. Finally, we find significant heterogeneity: TFP gains are greatest for financially constrained firms, which are usually younger and smaller. The heterogeneity in persistent productivity is even more pronounced, suggesting that a "one-size-fits-all" approach is insufficient for long-term impact. Instead, targeted support is necessary to promote sustainable growth among SMEs.
Firm-level productivity, Financial constraints, Difference-in-differences with multiple treatment periods, Generalized Random Forest, SME
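The basic logic of a difference-in-differences comparison can be sketched with the canonical two-group, two-period case. This is a simplification: the paper uses a robust estimator for multiple treatment periods, and the log-TFP numbers and the function name `did_2x2` below are hypothetical.

```python
from statistics import mean

def did_2x2(treated_pre, treated_post, control_pre, control_post):
    """Canonical 2x2 difference-in-differences estimate.

    Change in the treated group's mean outcome minus the change in the
    control group's mean outcome.
    """
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change

# Hypothetical log-TFP values, not data from the paper.
treated_pre = [1.00, 1.10, 0.90]
treated_post = [1.30, 1.40, 1.20]
control_pre = [1.00, 0.95, 1.05]
control_post = [1.10, 1.05, 1.15]

effect = did_2x2(treated_pre, treated_post, control_pre, control_post)
print(f"DiD estimate of the effect on log TFP: {effect:.2f}")
```

The estimate has a causal interpretation only under parallel trends, that is, if the treated firms would have followed the control group's change in the absence of the loan program.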
In this paper, I approximate the solution of a discrete-continuous dynamic choice model using deep neural networks. The model of interest is an overlapping generations (OLG) model in which the individual makes two decisions: how much to consume and when to retire. Overall, the deep neural networks designed in this paper approximate the analytic solutions for consumption and the retirement probability relatively well. However, the approximation is relatively far from the analytic solution at the kink and discontinuity points, especially for the individual of age 1.
Machine learning, Deep learning, Neural Network, Overlapping generation model, Discrete and continuous choice
Accepted for a poster session at the 2026 ASSA Annual Meeting
Empirical evidence from instrumental variable (IV) studies often guides policy decisions beyond the original study setting. Yet IV estimates identify local average treatment effects (LATEs) that may not generalize when the composition of compliers differs across populations. This paper examines how machine learning methods can improve the external validity of IV estimates. Using an empirical application on the effect of solid fuel use on cooking time across six developing countries and a series of simulation experiments, we compare the benchmark interacted two-stage least squares estimator with fixed effects (2SLS-IF) to a Double/Debiased Machine Learning (DML) approach. The DML estimator delivers more accurate out-of-sample predictions of LATEs when treatment effect heterogeneity and selection are driven by observable characteristics, outperforming 2SLS-IF under model misspecification. We also propose an algorithmic procedure for hyperparameter tuning (MLtune) that enhances the stability and generalization of DML predictions. These findings show that flexible machine learning estimators can meaningfully strengthen external validity in IV analyses, though their success depends on the nature of selection across settings.