Original Datasets

Language Structures Dataset

DOI: 10.7910/DVN/XIFPS5

The language_data.dta file contains gender-related language structure data for 492 languages, together with their geographical locations. For more information about the data sources, see the README.txt file as well as the paper and its appendix. The language_data_sources.tab file describes the original sources used to complement the language_data.dta file.

Please cite as: Gay, Victor, Daniel L. Hicks, Estefania Santacreu-Vasut, and Amir Shoham. 2017. "Replication Data for: Decomposing Culture: An Analysis of Gender, Language, and Labor Supply in the Household." Harvard Dataverse, V2.
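For readers who work outside Stata, the following is a minimal sketch of how the language_data.dta file might be opened in Python with pandas. The local file path and the helper name load_language_data are assumptions made for illustration; the variable names in the file are not listed here, so the sketch only inspects whatever columns it finds.

```python
from pathlib import Path

import pandas as pd


def load_language_data(path="language_data.dta"):
    """Read a Stata .dta file into a pandas DataFrame.

    The default path is an assumption: it presumes the file has already
    been downloaded from the Dataverse record into the working directory.
    """
    path = Path(path)
    if not path.exists():
        raise FileNotFoundError(f"{path} not found; download it first")
    # convert_categoricals=False keeps value-labelled Stata variables
    # as their raw codes instead of converting them to categoricals.
    return pd.read_stata(path, convert_categoricals=False)


def describe(df):
    """Return the number of rows and the list of column names."""
    return {"n_rows": len(df), "columns": list(df.columns)}
```

Calling describe(load_language_data()) gives a quick overview of the file. If the file holds one row per language, n_rows should match the 492 languages mentioned above, but that layout is an assumption to check against README.txt.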

Data Links

Below are links to public datasets used in my papers.

Economic Data

Educational attainment data from 1950 to 2010 for 146 countries, disaggregated by sex and 5-year age intervals. See Barro and Lee. 2013. "A New Data Set of Educational Attainment in the World, 1950–2010." Journal of Development Economics, 104, 184–198. DOI: 10.1016/j.jdeveco.2012.10.001

Geographical variables for 225 countries, including bilateral distance measures and contiguity indicators. See Mayer and Zignago. 2011. "Notes on CEPII's distances measures: the GeoDist Database." CEPII Working Paper, 2011–25.

Population, fertility, and migration data for a wide range of countries and years from the Population Division of the United Nations Department of Economic and Social Affairs (DESA).

Genetic distance data for 206 countries. See Spolaore and Wacziarg. 2018. "Ancestry and Development: New Evidence." Journal of Applied Econometrics, 33(5), 748–762. DOI: 10.1002/jae.2633

Labor statistics for a wide range of countries and years from the International Labour Organization.

U.S. Census and American Community Survey microdata from 1850 to the present.

Income, output, input, and productivity data from 1950 to 2014 for 182 countries. See Feenstra, Inklaar, and Timmer. 2015. "The Next Generation of the Penn World Table." American Economic Review, 105(10), 3150–3182. DOI: 10.1257/aer.20130954

Linguistic Data

Typological data and geographical distribution for about 2,900 languages. See Bickel. 2002. "The AUTOTYP Research Program."

Structural properties for about 2,700 languages. See Dryer and Haspelmath (Eds.). 2013. The World Atlas of Language Structures Online. Leipzig: Max Planck Institute for Evolutionary Anthropology.

Political Data

Transition information for the 280 autocratic regimes in existence from 1946 to 2010. See Geddes, Wright, and Frantz. 2014. "Autocratic Breakdown and Regime Transitions: A New Data Set." Perspectives on Politics, 12(2), 313–331. DOI: 10.1017/S1537592714000851

Democracy and dictatorship classification of political regimes across 202 countries from 1946 to 2008. See Cheibub, Gandhi, and Vreeland. 2010. "Democracy and Dictatorship Revisited." Public Choice, 143, 67–101. DOI: 10.1007/s11127-009-9491-2

Information on 272 parliamentary chambers in all of the 193 countries where a national legislature exists.

Worldwide information on quota provisions for women in parliament. See Dahlerup et al. 2014. Atlas of Electoral Gender Quotas. Inter-Parliamentary Union, Stockholm University. ISBN: 978-91-87729-09-6