Beyond the Numbers: Soft Information in Conditional Asset Pricing, with Jian Feng, Shiyang Huang, and Ran Shi
Abstract: We investigate the asset pricing implications of soft information extracted from earnings conference calls. Augmenting standard firm characteristics with text-based signals significantly enhances mean-variance efficiency. Our evidence suggests a covariance channel: earnings calls provide incremental information about return covariances beyond that captured by firm characteristics alone. Text helps explain roughly one-third of the common return variation in individual stocks, with an even larger role for high-growth and intangible-intensive firms, where soft information is more important. Textual information accounts for a growing share of the mean-variance-efficient portfolio, surpassing firm characteristics in recent years.
Factor Identity, with Christian Julliard and Ran Shi
Abstract: The empirical dominance of dense factor models for pricing the cross-section of expected returns has come at a cost: the economic narrative of priced risk has disappeared. We introduce factor identities — groups of factors conveying the same pricing information despite modest correlations — and a Bayesian method to recover them, restoring legibility to the dense SDF. Only a handful emerge: a macroeconomic identity revealed through long-horizon growth in output, consumption, and industrial production; a market identity pinning down the level of risk premia; and a few characteristic-based identities that are fully subsumed by latent factors. The factor zoo is dense in animals but sparse in species.
Macro Strikes Back: Term Structure of Risk Premia, with Svetlana Bryzgalova and Christian Julliard
Abstract: We provide a novel priced Wold representation that, using the pricing restrictions of a large cross-section of asset returns, sharply identifies shocks common to financial markets and the macroeconomy, and their propagation. These shocks slowly propagate through major macro aggregates, account for 20-47% of their variation and most of their predictability, and trace their business cycle, disciplining all equilibrium models. This propagation, not the overall persistence of macro quantities, yields short-run macro risk premia that are negligible, yet match the equity premium at business cycle horizons. Validating the method, the model-implied prices of dividend strips match observed out-of-sample forward equity yields and their term structure, both conditionally and unconditionally. By identification through elimination, we rule out productivity, investment-specific technology, and pure preference shocks as likely origins of the business cycle. The data point to demand/belief shocks and cost-push shocks propagating through real, nominal, and informational rigidities.
Data Uncertainty in Financial Information, with Serhiy Kozak
Abstract: We study three fundamental data challenges in empirical asset pricing: missing observations, infrequent measurements, and inherent noise in financial information. These challenges make firm characteristics uncertain inputs rather than fixed conditioning variables. We develop a Bayesian tensor model that treats characteristics as latent, exploits cross-sectional, characteristic-level, and time-series dependence, and generates posterior panels for missing and stale values. In global equities, accounting for characteristic uncertainty leaves systematic factor portfolios nearly unchanged, but substantially reduces the number of statistically significant residual alphas and attenuates arbitrage-portfolio Sharpe ratios. Averaging arbitrage portfolio weights across imputations yields more stable performance, especially internationally.
Frequency-Dependent Risks in the Factor Zoo
Abstract: I dissect the factor zoo through the lens of frequency-dependent risks. Empirically, several low-frequency principal components constitute a proper benchmark stochastic discount factor (SDF) that achieves near-optimal out-of-sample performance. It effectively explains the cross-section of average anomaly returns not only at the monthly but also at business-cycle frequencies. Moreover, I decompose the SDF into two orthogonal pricing components. The first component is composed of high-frequency principal components. It is serially uncorrelated and relates to discount-rate news, intermediary factors, volatility risk, and investor sentiment. The second component is persistent and captures business-cycle risks related to consumption and GDP growth.
Consumption in Asset Returns, with Svetlana Bryzgalova and Christian Julliard, forthcoming at Journal of Finance
Abstract: Using information in returns, we identify the stochastic process of consumption. We find that aggregate consumption reacts over multiple quarters to innovations spanned by financial markets. This persistent component accounts for over a quarter of consumption variation. These shocks command a large and significant risk premium, driving a large share of the time-series variation in stocks and a small yet significant fraction of that in bonds. Nevertheless, we find no support for stochastic volatility of consumption driving time-varying risk premia. Finally, an otherwise standard recursive utility model based on our estimated process explains the equity premium and risk-free rate puzzles with low risk aversion.
Model Uncertainty in the Cross Section of Stock Returns, with Ran Shi, Journal of Econometrics, available online 22 July 2025, 106066.
Abstract: We develop a transparent Bayesian framework to measure uncertainty in asset pricing models. By assigning a modified class of g-priors to the risk prices of asset pricing factors, our method quantifies the trade-off between mean-variance efficiency and parsimony for asset pricing models to achieve high posterior probabilities. Model uncertainty is defined as the entropy of these model probabilities. We prove the model selection consistency property of our procedure, which is missing from the classic g-priors. Acknowledging the possibility of omitting true asset pricing factors in real applications, we also characterize the maximum degree of contamination that the omitted factors can introduce to our model uncertainty measure. Empirically, we find that model uncertainty escalates during major market events and carries a significantly negative risk premium of approximately half the magnitude of the market. Positive shocks to model uncertainty predict persistent outflows from US equity funds and inflows to Treasury funds.
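As a rough illustration of the definition above (model uncertainty as the entropy of the model probabilities), the following is a minimal sketch, not the paper's implementation. It assumes the posterior model probabilities are already available; the function names are hypothetical, and the rescaling by log(number of models) is an illustrative convention that the abstract does not mention.

```python
import math


def model_uncertainty(model_probs):
    """Shannon entropy (in nats) of a set of posterior model probabilities.

    `model_probs` is a hypothetical input: one non-negative weight per
    candidate model. Weights are renormalized to sum to one.
    """
    total = sum(model_probs)
    if total <= 0:
        raise ValueError("model probabilities must have a positive sum")
    probs = [p / total for p in model_probs]
    # Terms with p == 0 contribute nothing (the limit of p * log p is 0).
    return -sum(p * math.log(p) for p in probs if p > 0)


def normalized_uncertainty(model_probs):
    """Entropy divided by its maximum, log(number of models).

    This rescaling to [0, 1] is an assumption made for readability only.
    """
    n = len(model_probs)
    if n < 2:
        return 0.0
    return model_uncertainty(model_probs) / math.log(n)
```

Under this sketch, equal probabilities across all candidate models give the maximum entropy, while a single model with probability one gives zero.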
Bayesian Solutions for the Factor Zoo: We Just Ran Two Quadrillion Models, with Svetlana Bryzgalova and Christian Julliard, Journal of Finance (2023), vol. 78(1), 487-557.
Full replication codes (including posterior draws and usage examples, 4.56 GB)
BayesianFactorZoo R package on CRAN
Abstract: We propose a novel framework for analyzing linear asset pricing models: simple, robust, and applicable to high-dimensional problems. For a (potentially misspecified) standalone model, it provides reliable price of risk estimates for both tradable and non-tradable factors, and detects those weakly identified. For competing factors and (possibly non-nested) models, the method automatically selects the best specification – if a dominant one exists – or provides a Bayesian model average of the SDF (BMA-SDF) if there is no clear winner. We analyze 2.25 quadrillion models generated by a large set of factors, and find that the BMA-SDF outperforms existing models in- and out-of-sample.
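The idea of Bayesian model averaging mentioned above can be sketched generically as follows. This is a textbook-style illustration under stated assumptions, not the paper's estimator and not the API of the BayesianFactorZoo package: the function name, the input layout, and the factor labels are all hypothetical, and posterior model probabilities and per-model risk prices are taken as given.

```python
def bma_summary(models):
    """Average risk prices across models, weighted by posterior probability.

    `models` is a hypothetical input: a list of (posterior_prob, risk_prices)
    pairs, where risk_prices maps a factor name to its price of risk in that
    model. A factor absent from a model contributes a risk price of zero.

    Returns (bma_price, inclusion_prob), two dicts keyed by factor name.
    """
    total = sum(prob for prob, _ in models)
    if total <= 0:
        raise ValueError("posterior model probabilities must have a positive sum")

    factors = sorted({name for _, prices in models for name in prices})
    bma_price = {}
    inclusion_prob = {}
    for name in factors:
        # Probability-weighted average of the factor's risk price.
        bma_price[name] = sum(
            prob * prices.get(name, 0.0) for prob, prices in models
        ) / total
        # Share of posterior mass on models that include the factor.
        inclusion_prob[name] = sum(
            prob for prob, prices in models if name in prices
        ) / total
    return bma_price, inclusion_prob


# Illustrative numbers only; they are not results from the paper.
example_models = [
    (0.75, {"MKT": 0.4}),
    (0.25, {"MKT": 0.2, "HML": 0.4}),
]
prices, inclusion = bma_summary(example_models)
```

With K candidate factors there are 2**K possible factor subsets, so enumerating them directly, as this toy sketch would require, is feasible only for small K.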