Publications
"Innovative Ideas and Gender (In)equality", American Economic Review 115.7 (2025): 2207-2236.
Racial Inequality and Publication in Economics, (with Roland Pongou and Leonard Wantchekon) AEA Papers and Proceedings, Vol. 114, May 2024.
Gendered Citations at Top Economic Journals, AEA Papers and Proceedings, Vol. 111, May 2021.
Racial Justice from Within? Diversity and Inclusion in Economics, (with Leonard Wantchekon), Econometric Society Monograph, October 2025.
Working Papers
******* Economics of Innovation and Science
Cassatts in the Attic, 2023 (with Matt Marx), Accepted, American Economic Journal: Applied Economics
We analyze more than 70 million scientific articles to characterize the gender dynamics of commercializing science. We report a gap of 21%, which is explained neither by the quality of the science nor its ex-ante commercial potential. Moreover, the gender gap is widest among papers with female last authors (i.e., lab head or PI) when publishing high-quality science. However, the gap vanishes when authors self-commercialize discoveries via new ventures, and it is reduced when commercializing in cooperation with firms that are smaller or have more female inventors.
This paper examines racial disparities in the diffusion of ideas using a comprehensive dataset comprising over 330,000 economics publications (1950–2021) alongside hand-collected CV data. We document a persistent citation gap: Papers authored by Black, Hispanic, or Asian scholars receive 5.1% to 9.6% fewer citations than those authored by White scholars. This gap, which remains or widens upon considering author seniority and standard quality metrics, is particularly pronounced for Black authors. It also extends to indirect citations and is associated with more clustered citation networks among non-White scholars. Topic choice does not explain this gap; instead, we uncover strong evidence of in-group citation patterns. Thus, we propose a conceptual framework in which clustered networks shaped by information frictions, perceptions, and preferences drive these disparities. Empirical tests looking at variation in online availability, the timing of cross-group collaborations, and author-ordering conventions support this framework over quality-based explanations. Finally, using natural language techniques, we provide suggestive evidence of intellectual complementarity across racial groups, highlighting the potential costs associated with barriers to idea diffusion.
Recognition In Academia: Evidence From 80 Million Publications
This study investigates gender and racial disparities in citation practices across 250 academic disciplines, using a dataset of 80 million Web of Science publications (1985–2024). Leveraging transformer-based models to define citation risk sets, we detect systematic undercitation of work by women and non-White scholars in over half of all fields. These disparities are partly driven by homophily, though they tend to decline with larger author teams and longer reference lists. Importantly, some disciplines exhibit no such gaps. Our findings are robust to extensive controls and suggest that practices in more equitable fields may offer guidance for fostering unbiased academic practices and enhancing scientific productivity.
Pricing Innovation: Evidence from Canadian Pharmaceuticals (with Vasia Panousi)
This paper uses a new panel dataset constructed from information provided by the Canadian Intellectual Property Office to study the relationship between patents, innovation, and growth in the Canadian pharmaceutical industry. First, using advanced machine learning methods, we perform textual analysis on patent documents to create an indicator of patent quality. Our indicator assigns higher quality to patents or innovations that are novel. Second, by matching the firms in our patent dataset with their balance-sheet information, we validate our patent-quality measure by relating it to various measures of firm value and performance. The results indicate that the anticipation of the granting of a breakthrough patent increases firm profitability, on average, for up to five years before the grant. Third, we use our quality index for policy purposes in the pharmaceutical sector. The index shows a positive and significant relationship with the prices of patented medicines at the federal level. Surprisingly, at the provincial level, this positive relationship disappears following unilateral price negotiations between each province and each drug manufacturer. Finally, a return to collective bargaining via the pan-Canadian Pharmaceutical Alliance appears to correct price-setting inefficiencies at the provincial level.
******* Broader Applications of Machine Learning to Economics
High Risk Workers and High Risk Firms (with Serdar Ozkan, Sergio Salgado, and Marco Weißler)
Recent literature has documented large heterogeneity in the earnings dynamics that individuals experience, in particular in average income profiles and higher-order moments of income shocks, as well as in unemployment risk and job-finding rates. Using administrative social security data from Germany, we decompose heterogeneity in earnings dynamics into observable and unobservable worker and firm components. First, we document salient features of earnings risk conditional on observable worker and firm characteristics. Next, to identify unobservable heterogeneity, we employ machine learning algorithms to cluster workers and firms by features of their earnings dynamics. Finally, we estimate an individual income process that allows for ex-ante worker and firm heterogeneity. We find that workers in smaller and shrinking firms experience lower earnings growth with more volatile and left-skewed earnings changes, as well as higher unemployment risk. When we control for unobservable worker and firm types jointly, we conclude that person effects explain the majority of differences in earnings dynamics, while firm effects explain relatively little. These findings have important implications for public policy such as unemployment insurance.
Predicting the Effectiveness of Teachers: a Machine Learning Approach (with William Arbour and Phil Oreopoulos)
******* Macro-finance
Covariance of Domestic Investment Shocks and Global Financial Markets (with Vasia Panousi)
We investigate the role of global financial markets when domestic investment is subject to both aggregate and idiosyncratic shocks, where these shocks covary countercyclically. On the theoretical front, a new two-country stochastic general-equilibrium model shows that financial integration mitigates the negative effects of uninsurable countercyclical idiosyncratic risk on investment and on other macroeconomic and financial aggregates. This mitigation is due to the international portfolio diversification opportunities provided by financial integration and is present only when there is a covariance in the domestic risk structure. On the empirical front, using a panel of 79 countries over 1985-2018, we find that the negative effect of idiosyncratic risk on economic growth is mitigated by 40 percent when a country's openness to international financial markets increases. We also find similar results in a macro-level investment panel of US firms. Furthermore, the empirical results are comparable in magnitude to those of a reasonably calibrated version of the theoretical model.
Assessing Debt Sustainability: An Enhanced Signal Extraction Approach (with Nadeem Sanaa) [IMF staff internal paper August 2018, Draft available upon request]
For early and effective policy responses that minimize economic costs, it is vital to have a reliable framework for predicting the likelihood of a sovereign debt crisis. To this end, this paper investigates several ways of enhancing the current debt sustainability framework. We estimate a non-parametric model based on signal extraction and find three elements that are key to the predictive power of a given estimation: the choice of the objective function, the choice of the variables and their aggregation into a composite index, and the heterogeneity among countries. In addition, we explore a multivariate signalling approach, which appears to be a parsimonious and promising avenue for predicting debt distress events. Finally, we apply our methodology to the new crisis database and find substantial improvements both in-sample and out-of-sample compared to the existing framework.