Data & Code
Equity Anomaly data
Portfolio sorts
Stocks are sorted into N portfolios using NYSE breakpoints. Returns are value-weighted within each portfolio.
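The sort described above can be sketched as follows. This is a minimal illustration, not the code used to build the posted files. The function name, the dictionary keys, the quantile convention (`method="inclusive"`), and the handling of stocks that fall exactly on a breakpoint are all assumptions. In practice the weights would typically be lagged market equity; here a single `me` field stands in for whatever weight is used.

```python
from bisect import bisect_right
from statistics import quantiles


def sort_portfolios(stocks, n_portfolios):
    """Assign stocks to portfolios with NYSE breakpoints; return value-weighted returns.

    stocks: list of dicts with hypothetical keys
        'char' (sorting characteristic), 'me' (market equity used as weight),
        'ret' (return over the holding period), 'nyse' (True if NYSE-listed).
    """
    # Breakpoints are computed from NYSE-listed stocks only.
    nyse_chars = [s["char"] for s in stocks if s["nyse"]]
    breakpoints = quantiles(nyse_chars, n=n_portfolios, method="inclusive")

    total_me = [0.0] * n_portfolios
    total_me_ret = [0.0] * n_portfolios
    assignment = []
    for s in stocks:
        # All stocks (NYSE or not) are assigned using the NYSE breakpoints.
        # Assumption: a stock exactly on a breakpoint goes to the higher portfolio.
        p = bisect_right(breakpoints, s["char"])
        assignment.append(p)
        total_me[p] += s["me"]
        total_me_ret[p] += s["me"] * s["ret"]

    # Value-weighted return: sum(me * ret) / sum(me) within each portfolio.
    vw_returns = [
        r / m if m > 0 else float("nan") for r, m in zip(total_me_ret, total_me)
    ]
    return assignment, vw_returns


# Toy cross section for one period (made-up numbers).
stocks = [
    {"char": 0.1, "me": 100.0, "ret": 0.01, "nyse": True},
    {"char": 0.2, "me": 50.0, "ret": 0.02, "nyse": False},
    {"char": 0.5, "me": 300.0, "ret": -0.01, "nyse": True},
    {"char": 0.6, "me": 100.0, "ret": 0.03, "nyse": False},
    {"char": 0.9, "me": 200.0, "ret": 0.02, "nyse": True},
    {"char": 1.5, "me": 100.0, "ret": 0.05, "nyse": False},
]
assignment, vw_returns = sort_portfolios(stocks, n_portfolios=2)
```

Using NYSE-only breakpoints keeps the many small non-NYSE stocks from dominating the cut points, while every stock is still assigned to a portfolio.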
Current data (July 1963 -- December 2019): portfolio sorts (daily), portfolio sorts (monthly), portfolio assignments.
References:
Haddad, Kozak, Santosh (2020) "Factor Timing": data (daily), data (monthly).
Giglio, Kelly, Kozak (2020) "Equity Term Structures without Dividend Strips Data": data (daily), data (monthly).
Kozak, Nagel, Santosh (2018) "Interpreting Factor Models" use an older version of these data.
Characteristic-managed portfolios
Portfolios are constructed by weighting each stock by its value of a characteristic signal. Firms with market equity below 0.01% of the aggregate US market cap are removed. Characteristic signals are the cross-sectional ranks of each stock's characteristic, centered and normalized by the sum of absolute values of all ranks in the cross section.
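A minimal sketch of this construction for a single cross section is given below. It is an illustration under stated assumptions, not the code behind the posted data: the function names are hypothetical, ties are not handled (ordinal ranks are assumed), the normalization is read as applying to the centered ranks, and the signal is assumed to be known before the return it is multiplied with is realized.

```python
def characteristic_signals(chars):
    """Rank, center, and normalize one cross section of a characteristic.

    Assumes at least two stocks and no ties in `chars`.
    """
    n = len(chars)
    order = sorted(range(n), key=lambda i: chars[i])
    ranks = [0.0] * n
    for position, i in enumerate(order, start=1):
        ranks[i] = float(position)  # smallest characteristic gets rank 1

    mean_rank = sum(ranks) / n
    centered = [r - mean_rank for r in ranks]  # signals sum to zero
    scale = sum(abs(c) for c in centered)      # sum of absolute centered ranks
    return [c / scale for c in centered]       # absolute signals sum to one


def managed_return(signals, returns):
    """Return of the characteristic-managed portfolio: signal-weighted sum of returns."""
    return sum(w * r for w, r in zip(signals, returns))


# Toy example with four stocks (made-up numbers).
chars = [0.3, -1.2, 2.5, 0.9]
next_returns = [0.02, -0.01, 0.04, 0.00]
signals = characteristic_signals(chars)
portfolio_return = managed_return(signals, next_returns)
```

Because the signals sum to zero and their absolute values sum to one, the resulting portfolio is a zero-investment long-short position with fixed gross exposure, which is insensitive to outliers in the raw characteristic.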
Current data (July 1963 -- December 2019): daily, monthly.
References:
Kozak, Nagel, Santosh (2020) "Shrinking the Cross-Section": data (daily), data (monthly), source code (a new version with L1L2 penalty), slides (TeX).
Kozak and Santosh (2020) "Why do Discount Rates Vary?" (used a subset of the data above).
Characteristic signals
This panel dataset contains the values of the characteristic signals for each stock at each point in time. Firms with market equity below 0.01% of the aggregate US market cap are removed. Characteristic signals are the cross-sectional ranks of each stock's characteristic, centered and normalized by the sum of absolute values of all ranks in the cross section.
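The size filter mentioned above can be illustrated on a toy panel. This is a sketch only: the tuple layout, the function name, and the use of the panel's own total market equity as a stand-in for the aggregate US market cap are assumptions, not details of the posted dataset.

```python
from collections import defaultdict


def drop_small_firms(panel, min_share=0.0001):
    """Remove firm-dates whose market equity is below `min_share` of the date's total.

    panel: list of (date, firm, market_equity) tuples (hypothetical layout).
    min_share: 0.0001 corresponds to 0.01% of aggregate market cap.
    """
    # Aggregate market equity per date, computed before any filtering.
    totals = defaultdict(float)
    for date, _firm, me in panel:
        totals[date] += me

    return [(d, f, me) for d, f, me in panel if me >= min_share * totals[d]]


# Toy panel (made-up numbers): firm C is tiny in 2019-12 but not in 2019-11.
panel = [
    ("2019-11", "A", 5000.0),
    ("2019-11", "C", 5.0),
    ("2019-12", "A", 9000.0),
    ("2019-12", "B", 999.5),
    ("2019-12", "C", 0.5),
]
kept = drop_small_firms(panel)
```

The filter is applied date by date, so the same firm can be present in some periods and absent in others.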
Current data (July 1963 -- December 2019): characteristic signals.
References:
Kozak (2020) "Kernel Trick for the Cross-Section": data.
Synthetic equity strip yield data (preliminary)
This dataset is constructed using the model in Giglio, Kelly, Kozak (2021) "Equity Term Structures without Dividend Strips Data". The data contain end-of-month equity yields ($e_{t,n}$), as defined by equation (27) in the paper. Yields are provided for the aggregate market index for maturities 1--100 years, and for the cross-section of 100 portfolios (50 long and 50 short ends of anomalies below) for maturities 1--15 years. Note that the S&P 500 strips data (tradable contracts) correspond most closely to the sizeS cross-sectional portfolio in these data (large firms).
Current data (August 1975 -- September 2020): aggregate and cross-sectional synthetic equity strip yields.
References:
Giglio, Kelly, Kozak (2021) "Equity Term Structures without Dividend Strips Data": data.