Abstract: Large-scale microdata on group identity are critical for studies on identity politics and violence but remain largely unavailable for developing countries. We use personal names to infer religion in South Asia, where religion is a salient social division and yet disaggregated data on it are scarce. Existing work predicts religion using a dictionary-based method and, therefore, cannot classify unseen names. We provide character-based machine-learning models that can also classify unseen names with high accuracy. Our models are also much faster and, hence, scalable to large datasets. We explain the classification decisions of one of our models using the layer-wise relevance propagation technique. The character patterns learned by the classifier are rooted in the linguistic origins of names. We apply these models to infer the religion of electoral candidates using historical data on Indian elections and observe a trend of declining Muslim representation. Our approach can be used to detect identity groups across the world for whom the underlying names might have different linguistic roots.

Working Papers

Best Paper in Development Economics at the Econometric Society Winter School 2020 (Delhi School of Economics, India) 

Abstract: While gender quotas in politics have been adopted worldwide, evidence on their impact on women’s substantive representation is mixed. To examine this issue, we estimate the relative importance of greater demand expressed by female voters under female leadership vis-à-vis female leaders’ differential preference (supply) in shaping the gender quota effect. We use data on the household-level allocation of toilets for the entire rural population (25 million households) of Uttar Pradesh, India. To identify the demand and supply mechanisms, our empirical strategy uses a difference-in-discontinuity design that exploits the larger gender gap in toilet preference among Muslims than among Hindus and the greater expression of demand by female-headed households relative to male-headed ones. We find that greater expression of demand is important in shaping the gender quota effect, while there is weak evidence of the supply mechanism. These results highlight the importance of empowering female voters in making gender quotas more effective.

Abstract: We examine employers' gender preferences using 157,888 job ads posted on an online job portal in India, which received 6.45 million applications. We find that explicit gender requests by employers explain 7% of the gender wage gap in applications after accounting for job location and occupation. Implicit gender associations in job ad text, which indicate how predictive the text is of employers' gender preferences, together with explicit gender requests explain 17% of this gap. We retrieve words predictive of gender requests and find that skills and flexibility-related gendered words play an important role in observed gender disparities.

Abstract: We examine how the group size of minorities affects their representation in national government and, consequently, the allocation of public resources under majoritarian (MR) and proportional (PR) electoral systems. We propose a novel theoretical framework that models the spatial distribution of multiple minority groups in a probabilistic voting setup. It predicts that a minority’s population share has no effect on its representation and per capita resource allocation under PR, but has an inverted-U shaped relation under MR. We compile an ethnicity-level panel data set comprising over 400 groups across 87 democracies for the entire post-World War II period that remarkably exhibits the same relationship for political representation and resource allocation. We replicate the results using two separate identification strategies: (i) instrumenting a colony’s voting system with that of its primary colonial ruler, and (ii) comparing the same ethnicity across countries within a continent. The results imply that electoral systems can starkly affect power inequality across minorities and, consequently, their well-being.

Works in Progress

Abstract: The effects of armed conflict on long-run development are not well understood. On the one hand, destruction of human and physical capital can create local poverty traps. On the other hand, conflict can enhance collective action and foster cooperation among people in the affected regions. In this paper, I study the effects of conflict on long-term regional development at a more geographically disaggregated level than earlier studies, using an unusually rich data set on aerial bombing missions of the anti-communist allies during the Vietnam War. To identify causal effects, I exploit discontinuities created by an algorithm used to target air strikes by the United States at over 15,000 South Vietnamese hamlets during 1970–1972. Using night lights as a proxy for present-day development, I find that the average level of exposure to bombing increases the brightness at the hamlet coordinates in 2016 by 65 percent. The negative effects of war dissipate rapidly, and the positive effects persist through the 1990s to the 2000s. A higher density of local roads, as opposed to non-local roads, around the exposed hamlets suggests that local collective action could be an important channel driving these effects.