Working Papers
Robust Econometrics for Growth at Risk. (Joint with Tobias Adrian and Yuya Sasaki.) arXiv.
The Growth-at-Risk (GaR) framework has garnered attention in recent econometric literature, yet current approaches implicitly assume a constant Pareto exponent. We introduce novel, robust econometric methods for estimating the tails of GaR, grounded in a rigorous theoretical framework, and establish their validity and effectiveness. In simulations, the proposed methods consistently outperform existing alternatives in predictive accuracy. We perform a long-term GaR analysis that provides accurate and insightful predictions and captures financial anomalies more effectively than current methods.
Genuinely Robust Inference for Clustered Data. (Joint with Harold D. Chiang and Yuya Sasaki.) arXiv.
This paper supersedes the manuscripts previously circulated under the titles “On the Inconsistency of Cluster-Robust Inference and How Subsampling Can Fix It” and “Non-Robustness of the Cluster-Robust Inference: with a Proposal of a New Robust Method.”
Conventional methods for cluster-robust inference are inconsistent when clusters of unignorably large size are present. We formalize this issue by deriving a necessary and sufficient condition for consistency, a condition frequently violated in empirical studies. Specifically, 77% of empirical research articles published in American Economic Review and Econometrica during 2020–2021 do not satisfy this condition. To address this limitation, we propose two alternative approaches: (i) score subsampling and (ii) size-adjusted reweighting. Both methods ensure uniform size control across broad classes of data-generating processes where conventional methods fail. The first approach (i) has the advantage of ensuring robustness while retaining the original estimator. The second approach (ii) modifies the estimator but is readily implementable by practitioners using statistical software such as Stata and remains uniformly valid even when the cluster size distribution follows Zipf’s law. Extensive simulation studies support our findings, demonstrating the reliability and effectiveness of the proposed approaches.
Binary Outcome Models with Extreme Covariates: Estimation and Prediction. (Joint with Laura Liu.) Supplementary Material. arXiv.
This paper presents a novel semiparametric method to study the effects of extreme events on binary outcomes and subsequently forecast future outcomes. Our approach, based on Bayes’ theorem and regularly varying (RV) functions, facilitates a Pareto approximation in the tail without imposing parametric assumptions beyond the tail. We analyze cross-sectional as well as static and dynamic panel data models, incorporate additional covariates, and accommodate the unobserved unit-specific tail thickness and RV functions in panel data. We establish consistency and asymptotic normality of our tail estimator, and show that our objective function converges to that of a panel Logit regression on tail observations with the log extreme covariate as a regressor, thereby simplifying implementation. The empirical application assesses whether small banks become riskier when local housing prices sharply decline, a crucial channel in the 2007–2008 financial crisis.
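As a rough illustration of the last implementation point, the sketch below fits a Logit regression on tail observations only, with the log of the extreme covariate as the regressor. It is a simplified cross-sectional stand-in, not the paper's panel estimator: the simulated data-generating process, the 10% tail fraction, the coefficient values, and the helper name `fit_logit` are all assumptions made for this example.

```python
import math
import random

def fit_logit(xs, ys, iters=25):
    """Newton-Raphson for a Logit model with an intercept and one regressor."""
    a, b = 0.0, 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(a + b * x)))
            w = p * (1.0 - p)
            g0 += y - p            # score with respect to the intercept
            g1 += (y - p) * x      # score with respect to the slope
            h00 += w               # entries of the information matrix X'WX
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b

random.seed(1)
n = 50000
true_a, true_b = -3.0, 1.0  # hypothetical coefficients for the simulation

# Assumed DGP: Pareto covariate (exponent 2) and a Logit link in log(x).
data = []
for _ in range(n):
    x = (1.0 - random.random()) ** (-1.0 / 2.0)
    p = 1.0 / (1.0 + math.exp(-(true_a + true_b * math.log(x))))
    data.append((x, 1 if random.random() < p else 0))

# Keep only the tail: the 10% of observations with the largest covariate.
tail = sorted(data, key=lambda row: row[0], reverse=True)[: n // 10]
log_x = [math.log(x) for x, _ in tail]
y = [label for _, label in tail]

a_hat, b_hat = fit_logit(log_x, y)
print(f"tail-sample slope on log(x): {b_hat:.3f}")
```

In this simulated design the Logit form holds everywhere, so the tail-sample slope should land near the assumed value of 1; with real data the approximation is only expected to hold far enough into the tail.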
Estimating Export-productivity Cutoff Contours with Profit Data: A Novel Threshold Estimation Approach. (Joint with Peter H. Egger.) arXiv.
This paper develops a novel method to estimate firm-specific market-entry thresholds in international economics, allowing fixed costs to vary across firms alongside productivity. Our framework models market entry as an interaction between productivity and observable fixed-cost measures, extending traditional single-threshold models to ones with set-valued thresholds. Applying this approach to Chinese firm data, we estimate export-market entry thresholds as functions of domestic sales and surrogate variables for fixed costs. The results reveal substantial heterogeneity and threshold contours, challenging conventional single-threshold-point assumptions. These findings offer new insights into firm behavior and provide a foundation for further theoretical and empirical advancements in trade research.
High-Dimensional Tail Index Regression: with An Application to Text Analyses of Viral Posts in Social Media. (Joint with Yuya Sasaki and Jing Tao.) arXiv.
Motivated by the empirical observation of power-law distributions in the credits (e.g., “likes”) of viral social media posts, we introduce a high-dimensional tail index regression model and propose methods for estimation and inference of its parameters. First, we present a regularized estimator, establish its consistency, and derive its convergence rate. Second, we introduce a debiasing technique for the regularized estimator to facilitate inference and prove its asymptotic normality. Third, we extend our approach to handle large-scale online streaming data using stochastic gradient descent. Simulation studies corroborate our theoretical findings. We apply these methods to the text analysis of viral posts on X (formerly Twitter) related to LGBTQ+ topics.
Inference in Auctions with Many Bidders Using Transaction Prices. (Joint with Federico A. Bugni.) Supplementary Material. arXiv. Slides.
This paper studies inference in first- and second-price sealed-bid auctions with many bidders, using an asymptotic framework where the number of bidders increases while the number of auctions remains fixed. Relevant applications include online, treasury, spectrum, and art auctions. Our approach enables asymptotically exact inference on key features such as the winner’s expected utility, seller’s expected revenue, and the tail of the valuation distribution using only transaction price data. Our simulations demonstrate the accuracy of the methods in finite samples. We apply our methods to Hong Kong vehicle license auctions, focusing on high-priced, single-letter plates.
Fixed-k Tail Regression: New Evidence on Tax and Wealth Inequality from Forbes 400. (Joint with Ji Hyung Lee, Yuya Sasaki, and Alexis A. Toda.) arXiv.
We develop a new tail regression method to estimate the tail index (reciprocal of the Pareto exponent) of a size distribution as a function of macroeconomic state variables. Our method is motivated by the unique feature of the Forbes 400 data, which is a repeated cross-section of wealth truncated from below at the 400th largest order statistic. Applying this method, we find that higher capital income tax rates are associated with higher wealth Pareto exponents (lower top tail inequality). We present a simple economic model that explains these findings and discuss the welfare implication of capital taxation.
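To make the relation between the tail index and the Pareto exponent concrete, the sketch below applies the classical Hill estimator to simulated Pareto data. This is a standard textbook estimator, not the fixed-k tail regression developed in the paper; the simulated sample, the exponent of 1.5, and the choice of k = 400 are assumptions for illustration only.

```python
import math
import random

def hill_tail_index(sample, k):
    """Hill estimator of the tail index (1 / Pareto exponent) from the k largest values."""
    if not 1 <= k < len(sample):
        raise ValueError("k must satisfy 1 <= k < len(sample)")
    ordered = sorted(sample, reverse=True)
    threshold = ordered[k]  # the (k+1)-th largest observation
    return sum(math.log(x / threshold) for x in ordered[:k]) / k

random.seed(0)
alpha = 1.5  # assumed true Pareto exponent for the simulation

# Inverse-CDF draws from a Pareto distribution with minimum 1 and exponent alpha.
wealth = [(1.0 - random.random()) ** (-1.0 / alpha) for _ in range(20000)]

xi_hat = hill_tail_index(wealth, k=400)  # estimated tail index
alpha_hat = 1.0 / xi_hat                 # implied Pareto exponent
print(f"tail index: {xi_hat:.3f}, Pareto exponent: {alpha_hat:.3f}")
```

A larger Pareto exponent corresponds to a smaller tail index and hence a thinner top tail, which is the sense in which higher exponents mean lower top tail inequality.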
Capital and Labor Income Pareto Exponents in the United States, 1916-2019. (Joint with Ji Hyung Lee, Yuya Sasaki, and Alexis A. Toda.) arXiv.
Accurately estimating income Pareto exponents is challenging due to limitations in data availability and the applicability of statistical methods. Using tabulated summaries of incomes from tax authorities and a recent estimation method, we estimate income Pareto exponents in the U.S. for 1916–2019. We find that during the past three decades, the capital and labor income Pareto exponents have been stable at around 1.2 and 2, respectively. Our findings suggest that top tail income and wealth inequality is higher than previously thought and that wealthy agents have twice as large an impact on the aggregate economy as previously thought, but there is no clear trend post-1985.