Martin Saavedra

Associate Professor of Economics

Rutgers University, New Brunswick

Lasso Industry Demographic and Occupation (LIDO) Score

These data are intended to improve on the traditional occupational income score from IPUMS. The occupational income score predicts 1950 earnings using only occupation, while the LIDO score predicts earnings using occupation, industry, and demographics. The details are in the paper "A Machine Learning Approach to Improving Occupational Income Scores." There are both theoretical and empirical reasons to believe that the LIDO score produces estimates closer to those from an earnings regression. To use the LIDO score, you will need the following IPUMS variables: region, statefip, sex, age, race, occ1950, and ind1950. Merging in the score should then take only one line of Stata code. The LIDO score is measured in hundreds of 1950 dollars. The data will be updated when the 1950 full-count data become available.
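For readers working outside Stata, the merge can be sketched in plain Python as a many-to-one lookup. This is only an illustration under assumptions: it assumes the score file has one row per combination of the seven variables listed above, and the function names, the score field name `lido`, and the example records are all hypothetical rather than taken from the download.

```python
# The seven key variables come from the text above. Everything else
# (function names, the "lido" field name, the sample values) is hypothetical.
KEYS = ("region", "statefip", "sex", "age", "race", "occ1950", "ind1950")


def merge_lido(people, lido_rows, score_field="lido"):
    """Attach a score to each person record by matching on KEYS (many-to-one)."""
    lookup = {}
    for row in lido_rows:
        key = tuple(row[k] for k in KEYS)
        if key in lookup:
            # A many-to-one merge requires unique keys in the score table.
            raise ValueError(f"duplicate key in score table: {key}")
        lookup[key] = row[score_field]

    merged = []
    for person in people:
        key = tuple(person[k] for k in KEYS)
        # Unmatched people keep their record, with a missing (None) score.
        merged.append({**person, score_field: lookup.get(key)})
    return merged


def to_1950_dollars(score):
    """Convert a score in hundreds of 1950 dollars to 1950 dollars."""
    return score * 100


# Made-up placeholder records, not real IPUMS codes or LIDO values.
cell = {"region": 1, "statefip": 2, "sex": 1, "age": 30,
        "race": 1, "occ1950": 3, "ind1950": 4}
lido_rows = [{**cell, "lido": 25.0}]
people = [
    {**cell, "name": "matched person"},
    {**cell, "age": 31, "name": "unmatched person"},
]
merged = merge_lido(people, lido_rows)
```

Here the first record picks up the placeholder score and the second, which has no matching cell in the score table, is kept with a missing score. Whether unmatched records should be kept or dropped is a choice for the analyst.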

Download the LIDO score.