Working Papers:

Abstract: We examine employer preferences for hiring men versus women using newly collected data on approximately 160,000 job ads posted on an online job portal in India, linked with 6.45 million applications. We apply machine learning algorithms to the text of job ads to predict an employer's gender preference. We find that advertised wages are lowest in jobs where employers prefer women, even when this preference is retrieved implicitly through the text analysis, and that these jobs also attract a larger share of female applicants. We then systematically uncover what lies beneath these relationships by retrieving words that are predictive of an explicit gender preference, or gendered words, and assigning them to the categories of hard and soft skills, personality traits, and flexibility. We find that skills-related female-gendered words have low returns but attract a higher share of female applicants, while male-gendered words indicating decreased flexibility (e.g., frequent travel or unusual working hours) have high returns but result in a smaller share of female applicants. This contributes to a gender earnings gap. Our findings illustrate how gender preferences are partly driven by stereotypes and statistical discrimination.

Abstract: While gender quotas in politics have been adopted worldwide, evidence on their impact on women’s substantive representation is mixed. To examine this issue, we estimate the relative importance of greater demand expressed by female voters under female leadership vis-à-vis female leaders’ differential preference (supply) in shaping the gender quota effect. We use data on the household-level allocation of toilets for the entire rural population (25 million households) of Uttar Pradesh, India. Our empirical strategy exploits the larger gender gap in toilet preference among Muslims than among Hindus, and the greater expression of demand by female-headed households relative to male-headed ones, in a difference-in-discontinuity design to identify demand and supply mechanisms. We find that greater expression of demand is important in shaping the gender quota effect, while there is weak evidence of the supply mechanism. These results highlight the importance of empowering female voters in making gender quotas more effective.

Abstract: Large-scale microdata on ethnicity are critical for studies on ethnic politics and violence but remain largely unavailable for developing countries. We use personal names to infer religion in South Asia, where religion is a salient social division and yet disaggregated data on it are scarce. Existing work predicts religion using a dictionary-based method and therefore cannot classify unseen names. We provide character-based machine learning models that classify unseen names as well, with high accuracy. Our models are also much faster and hence scalable to large datasets. We explain the classification decisions of one of our models using the layer-wise relevance propagation technique. The character patterns learned by the classifier are rooted in the linguistic origins of names. We apply this approach to infer the religion of electoral candidates using historical data on Indian elections and observe a trend of declining Muslim representation.

Abstract: We examine how the group size of minorities affects their representation in national government and, consequently, the allocation of public resources under majoritarian (MR) and proportional (PR) electoral systems. We propose a novel theoretical framework that models the spatial distribution of multiple minority groups in a probabilistic voting setup. It predicts that a minority’s population share has no effect on its representation and per capita resource allocation under PR, but has an inverted-U shaped relation under MR. We compile an ethnicity-level panel data set comprising over 400 groups across 87 democracies for the entire post-World War II period that remarkably exhibits the same relationship for political representation and resource allocation. We replicate the results using two separate identification strategies: (i) instrumenting a colony’s voting system by that of the primary colonial ruler, and (ii) comparing the same ethnicity across countries within a continent. The results imply that electoral systems can starkly affect power inequality across minorities and, consequently, their well-being.

Works in Progress:

  • "Shining Light on Vietnam: Long Term Effects of War on Regional Development"

Abstract: The effects of armed conflict on long-run development are not well understood. On the one hand, destruction of human and physical capital can create local poverty traps. On the other hand, conflict can enhance collective action and foster cooperation among people in the affected regions. In this paper, I study the effects of conflict on long-term regional development at a more geographically disaggregated level than earlier studies, using an unusually rich data set on aerial bombing missions of the anti-communist allies during the Vietnam War. To identify causal effects, I exploit discontinuities created by an algorithm used to target air strikes by the United States at over 15,000 South Vietnamese hamlets during 1970–1972. Using night lights as a proxy for present-day development, I find that the average level of exposure to bombing increases the brightness at the hamlet coordinates in 2016 by 65 percent. The negative effects of war dissipate rapidly, and the positive effects persist through the 1990s to the 2000s. Higher density of local roads around the exposed hamlets, as opposed to non-local roads, suggests that local collective action could be an important channel driving these effects.

  • "Pandemic and Polarization: Insights from Twitter" (with Rochana Chaturvedi)