Yulong Wang 王宇龙

Associate Professor in Economics

Senior Research Associate in Center for Policy Research

Maxwell School of Citizenship and Public Affairs at Syracuse University

Research Interests: Econometrics and Applied Econometrics: Time Series, Extreme Value Theory, and Rare Events

Email: ywang402@syr.edu

Personal Information: CV  Google Scholar  Some Fun Travel Photos 

Education

Sep. 2012-Jun. 2017     Ph.D., Economics, Princeton University

Sep. 2012-Jun. 2014     M.A., Economics, Princeton University

Sep. 2010-Jun. 2012     M.A., Economics, University of California at Los Angeles

Sep. 2006-Jun. 2010     B.A., Finance, Tsinghua University

Published and Forthcoming Papers

Working Papers

We develop a new tail regression method to estimate the tail index (reciprocal of the Pareto exponent) of a size distribution as a function of macroeconomic state variables. Our method is motivated by the unique feature of the Forbes 400 data, which is a repeated cross-section of wealth truncated from below at the 400th largest order statistic. Applying this method, we find that higher capital income tax rates are associated with higher wealth Pareto exponents (lower top tail inequality). We present a simple economic model that explains these findings and discuss the welfare implication of capital taxation. (This paper was formerly circulated under the title "Censored Tail Regression: New Evidence on Tax and Wealth Inequality from Forbes 400.")

The literature often employs moment-based earnings risk measures like variance, skewness, and kurtosis. However, under heavy-tailed distributions, these moments may not exist in the population. Our empirical analysis reveals that population kurtosis, skewness, and variance often do not exist for the conditional distribution of earnings growth. This challenges moment-based analyses. We propose robust conditional Pareto exponents as novel earnings risk measures, developing estimation and inference methods. Using the UK New Earnings Survey Panel Dataset (NESPD) and US Panel Study of Income Dynamics (PSID), we find: 1) Moments often fail to exist; 2) Earnings risk increases over the life cycle; 3) Job stayers face higher earnings risk; 4) These patterns persist during the 2007–2008 recession and the 2015–2016 positive growth period. (This paper was formerly circulated under the title "How Unequally Heavy Are the Tails of the Distributions of Income Growth?")
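As a rough illustration of why a Pareto exponent can stand in for moment-based measures, the sketch below applies the classical Hill estimator to simulated data. It is not the conditional estimator developed in the paper; the function name, the simulated sample, and the choice of k are assumptions made for this example only. For a Pareto-type tail with exponent alpha, the moment of order m is finite only if m < alpha, so an estimate between 1 and 2 points to a finite mean but an infinite variance.

```python
import numpy as np


def hill_pareto_exponent(x, k):
    """Classical Hill estimate of the Pareto exponent from the k largest values.

    Hypothetical helper for illustration; not the paper's conditional estimator.
    """
    x = np.sort(np.asarray(x, dtype=float))
    if not 0 < k < x.size:
        raise ValueError("k must satisfy 0 < k < len(x)")
    threshold = x[-k - 1]  # the (k+1)-th largest observation
    if threshold <= 0:
        raise ValueError("the threshold must be positive")
    # Mean log-excess over the threshold estimates the tail index 1/alpha.
    tail_index = np.mean(np.log(x[-k:] / threshold))
    return 1.0 / tail_index


# Simulated data (an assumption): classical Pareto with minimum 1 and alpha = 1.5.
rng = np.random.default_rng(0)
alpha_true = 1.5
sample = rng.pareto(alpha_true, size=100_000) + 1.0

alpha_hat = hill_pareto_exponent(sample, k=2_000)

# Moment orders that are finite according to the estimated exponent.
finite_moments = [m for m in (1, 2, 3, 4) if m < alpha_hat]
print(f"estimated Pareto exponent: {alpha_hat:.2f}")
print(f"finite moment orders among 1-4: {finite_moments}")
```

In practice the estimate is sensitive to k, so it is usually inspected over a range of values rather than at a single choice.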

Accurately estimating income Pareto exponents is challenging due to limitations in data availability and the applicability of statistical methods. Using tabulated summaries of incomes from tax authorities and a recent estimation method, we estimate income Pareto exponents in the U.S. for 1916-2019. We find that during the past three decades, the capital and labor income Pareto exponents have been stable at around 1.2 and 2, respectively. Our findings suggest that top tail income and wealth inequality is higher, and that wealthy agents have twice as large an impact on the aggregate economy, than previously thought, but there is no clear trend post-1985.


This paper presents two results concerning uniform confidence intervals for the tail index and the extreme quantile. First, we show that there exists a lower bound on the length of confidence intervals that achieve correct uniform coverage over a nonparametric family of tail distributions. Second, in light of this impossibility result, we construct honest confidence intervals that are uniformly valid by incorporating the worst-case bias in the nonparametric family. The proposed method is applied to simulated data and to real financial time series data.


The conventional cluster-robust (CR) standard errors may not be robust. They are vulnerable to data that contain a small number of large clusters. When a researcher uses the 51 U.S. states (counting the District of Columbia) as clusters, the largest cluster (California) accounts for about 10% of the total sample. Such a case in fact violates the assumptions under which the widely used CR methods are guaranteed to work. We formally show that the conventional CR methods fail if the distribution of cluster sizes follows a power law with exponent less than two. Besides the example of 51 state clusters, some examples are drawn from a list of recent original research articles published in a top journal. In light of these negative results about the existing CR methods, we propose a weighted CR (WCR) method as a simple fix. Simulation studies support our arguments that the WCR method is robust while the conventional CR methods are not.

Health expenditure data almost always include extreme values, implying that the underlying distribution has heavy tails. This may result in infinite variances and infinite higher-order moments, and may bias the commonly used least squares methods. To accommodate extreme values, we propose an estimation method that recovers the right tail of health expenditure distributions. It extends the popular two-part model to develop a novel three-part model. We apply the proposed method to claims data from one of the biggest German private health insurers. Our findings show that the estimated age gradient in health care spending differs substantially from that obtained by the standard least squares method.

This paper considers inference in first-price and second-price sealed-bid auctions in empirical settings where we observe auctions with a large number of bidders. Relevant applications include online auctions, treasury auctions, spectrum auctions, art auctions, and IPO auctions, among others. Given the abundance of bidders in each auction, we propose an asymptotic framework in which the number of bidders diverges while the number of auctions remains fixed. This framework allows us to perform asymptotically exact inference on key model features using only transaction price data. Specifically, we examine inference on the expected utility of the auction winner, the expected revenue of the seller, and the tail properties of the valuation distribution. Simulations confirm the accuracy of our inference methods in finite samples. Finally, we also apply them to Hong Kong car license auction data.

Conventional methods of cluster-robust inference are inconsistent in the presence of unignorably large clusters. We formalize this claim by establishing a necessary and sufficient condition for the consistency of the conventional methods. We find that this consistency condition is rejected for a majority of empirical research papers. In this light, we propose a novel score subsampling method that remains robust even when the condition required by the conventional methods fails. Simulation studies support these claims. Using real data from an empirical paper, we show that the conventional methods conclude significance while our proposed method concludes insignificance.

Motivated by the empirical power law of the distributions of credits (e.g., the number of “likes”) of viral posts in social media, we introduce the high-dimensional tail index regression and methods of estimation and inference for its parameters. We propose a regularized estimator, establish its consistency, and derive its convergence rate. To conduct inference, we propose to debias the regularized estimate and establish the asymptotic normality of the debiased estimator. Simulation studies support our theory. These methods are applied to text analyses of viral posts on X (formerly Twitter) concerning LGBTQ+.