Research
Work In Progress
6. The Origin of the State: Land Productivity or Appropriability? Comment
Journal of Political Economy (Revise and Resubmit)
(with Thibaut Duprey, Anthony Heyes, and Martino Pelli)
Mayshar et al. (2022) claim that cultivation of cereals (appropriable by elites) caused the development of the state, rather than increased land productivity due to agriculture. In this comment, we show that the appropriability theory (1) may hold when moving from a chiefdom to a small state and (2) is driven by the top 3% of observations where cereals offer exceptionally better caloric productivity.
5. The Economic Effects of Long-Term Climate Change: Evidence from the Little Ice Age: Comment
Economic Inquiry (Revise and Resubmit)
(with Hugo Cordeau, Tongzhe Li, and Taylor Wright)
Waldinger (2022) finds significant economic effects following climate change during 1600-1850. The main result states that temperature and city size are positively correlated; a 1 degree Celsius warmer 50-year period is followed by a 70% increase in city size. We show the main result suffers from problems stemming from censored data: cities with fewer than 1,000 inhabitants are coded as 0 in the data and assumed to have 500. This assumption affects 23.5% of observations and 49.6% of cities. When accounting for heterogeneity by city size (and, by proxy, censoring status), we find distinct patterns in the relationship between temperature and city size. Specifically, in small cities (where the population has ever been less than 1,000), the relationship between temperature and city size is positive, large, and statistically significant. Conversely, in large cities (where the population has never fallen below 1,000 and which thus avoid censoring), the relationship is negative, large, and statistically significant. To tackle the censored (and discretized) distribution of city sizes in the data, we draw lessons from Eeckhout (2004) to close a large ‘gap’ in the distribution between the censored and observed city sizes.
Under Review
(as third author with Abel Brodeur and Derek Mikola)
This study advances our understanding of research reliability by reproducing and replicating claims from 110 papers in leading economics and political science journals. The analysis involves computational reproducibility checks and robustness assessments. It reveals several patterns. First, we uncover a high rate of fully computationally reproducible results (over 85%). Second, excluding minor issues like missing packages or broken pathways, we uncover coding errors for about 25% of studies, with some studies containing multiple errors. Third, we test the robustness of the results to 5,511 re-analyses. We find a robustness reproducibility of about 70%. Robustness reproducibility rates are relatively higher for re-analyses that introduce new data and lower for re-analyses that change the sample or the definition of the dependent variable. Fourth, 52% of re-analysis effect size estimates are smaller than the original published estimates and the average statistical significance of a re-analysis is 77% of the original. Lastly, we rely on six teams of researchers working independently to answer eight additional research questions on the determinants of robustness reproducibility. Most teams find a negative relationship between replicators' experience and reproducibility, while finding no relationship between reproducibility and the provision of intermediate or even raw data combined with the necessary cleaning code.
(with Abel Brodeur and Anthony Heyes)
Online experiments have exploded in popularity in recent years, and Amazon Mechanical Turk is by far the most widely used platform for such experiments in business and economics research. But how trustworthy are the results from well-published studies that use it? Analyzing the universe of hypotheses tested on the platform and published in a set of leading journals between 2010 and 2020, we find evidence of widespread p-hacking, publication bias and over-reliance on results from under-powered studies. As such, even ignoring doubts about the characteristics and behaviors of study recruits, which have been widely discussed, the way in which the research community itself uses the platform substantially erodes the credibility of the results of these studies. A high proportion of the results from these studies would be unlikely to replicate. The extent of the problems varies across the economics, business and cognate fields (with marketing especially afflicted) and does not appear to be getting better over time. We explore correlates of increased credibility.
2. Stay Frosty: Climate Change and Gun Violence in North America
(with Taylor Wright)
Hotter temperatures due to climate change are expected to increase interpersonal violence, alongside many other social and economic costs. In this paper, we find that the largest effects of temperature on gun violence in Chicago arise not from hot days getting hotter but from cold days growing warmer. We find that gun violence increases by 1% for every 1°C increase in daily temperature. Notably, we use data from both automated gun violence reporting and traditional police reports and find similar results, suggesting that it is the incidence of crime (rather than the previously studied composite of crime incidence and the probability that it is reported) that is affected by temperature. We find no effects of previous-day temperatures, discouraging a hypothesis of premeditation and instead suggesting milder-temperature opportunity as the potential mechanism. We also find evidence that the COVID-19 pandemic increased the sensitivity of gun violence to temperature. Taken together, our results suggest that research focused on hotter days due to climate change likely underestimates the future potential increase in gun violence.
- Statistical Significance and Science Mobilization: Evidence from 10,404 Hypotheses in Leading Health Journals
(with Abel Brodeur, Anthony Heyes, and Taylor Wright)
(Draft not yet available - but see me present it at UC Berkeley in the video!)
Publications
Journal of Political Economy: Microeconomics, August 2024
(with Abel Brodeur, Jonathan Hartley, and Anthony Heyes)
Preregistration is regarded as an important contributor to research credibility. We investigate this by analyzing the pattern of test statistics from the universe of randomized controlled trial studies published in 15 leading economics journals. We draw two conclusions: (a) Preregistration frequently does not involve a preanalysis plan (PAP), or sufficient detail to constrain meaningfully the actions and decisions of researchers after data are collected. Consistent with this, we find no evidence that preregistration in itself reduces p-hacking and publication bias. (b) When preregistration is accompanied by a PAP we find evidence consistent with both reduced p-hacking and reduced publication bias.
Economics Letters, May 2024
Causal identification of a student aid program’s impact can be difficult as the best control group is often a small number of out-of-province students who likely differ from locals in unobservable ways. This paper evaluates the impacts of the 30% Off Ontario Tuition Grant using administrative data from the Ontario–Quebec border, where a large number of local students are subject to a different province’s unchanged aid program. The Grant improved access to education; cohorts enrolled after the Grant was announced come from poorer areas, but also achieved lower graduation rates than comparable local yet out-of-province students. I present estimates using three different control groups: a local-student comparison offers the largest results, with more traditional comparisons finding similar but smaller effects.
The Economic Journal, April 2024
(with Abel Brodeur and Carina Neisser)
This paper examines the relationship between p-hacking, publication bias and data-sharing policies. We collect 38,876 test statistics from 1,106 articles published in leading economics journals between 2002 and 2020. We find that, while data-sharing policies increase the provision of data, they do not decrease the extent of p-hacking and publication bias. Similarly, articles that use hard-to-access administrative data or third-party surveys, as compared to those that use easier-to-access (e.g., author-collected) data, do not differ in the extent of their p-hacking and publication bias. Voluntary provision of data by authors on their home pages offers no evidence of reduced p-hacking.
Journal of the Association of Environmental and Resource Economists, September 2023
(with Anthony Heyes and Nicholas Rivers)
We observe 1.8 million university course grades for 88,959 adults who learn and complete examinations in a much less polluted environment than previously studied. We use a within-student identification strategy and find robust evidence of a negative and causal effect of exam-day outdoor air pollution on course performance. The effect of pollution persists beyond the same-day effect. Female students are more sensitive than males, and effects are greatest when students are engaged in unfamiliar tasks. We explore two margins of adaptation, one infrastructural, one behavioral. Working in a new building, and particularly if it is high quality (LEED Gold), provides significant mitigation. Relocating to a floor above ground level also offers partial protection.
Journal of Environmental Economics and Management, July 2022
(with Anthony Heyes)
While contemporaneous exposure to polluted air has been shown to reduce labor supply and worker productivity, little is known about the underlying mechanisms. We present first causal evidence that psychological exposure to pollution – the “thought of pollution” – can influence employment performance. Over 2000 recruits on a leading micro-task platform are exposed to otherwise identical images of polluted (treated) or unpolluted (control) scenes. Randomization across the geographically-dispersed workforce ensures that treatment is orthogonal to physical pollution exposure. Treated workers are less likely to accept a subsequent offer of work (labor supply) despite being offered a piece-rate much higher than is typical for the setting. Conditional on accepting the offer, treated workers complete between 5.1% and 10.1% less work (labor productivity) depending on the nature of their assigned task. We find no effect on work quality. Suggestive evidence points to the role of induced negative sentiment. Decrements to productivity through psychological mechanisms are plausibly additional to any from physical exposure to polluted air.
Journal of Environmental Economics and Management, March 2021
(with Abel Brodeur and Taylor Wright)
This paper investigates the impacts of COVID-19 safer-at-home policies on collisions and pollution. We find that statewide safer-at-home policies lead to a 20% reduction in vehicular collisions and that the effect is entirely driven by less severe collisions. For pollution, we find particulate matter concentration levels approximately 1.5 μg/m³ lower during the period of a safer-at-home order, representing a 25% reduction. We document a similar reduction in air pollution following the implementation of similar policies in Europe. We calculate that as of the end of June 2020, the benefits from avoided car collisions in the U.S. were approximately $16 billion while the benefits from reduced air pollution could be as high as $13 billion.
American Economic Review, November 2020
(with Abel Brodeur and Anthony Heyes)
The credibility revolution in economics has promoted causal identification using randomized control trials (RCT), difference-in-differences (DID), instrumental variables (IV) and regression discontinuity design (RDD). Applying multiple approaches to over 21,000 hypothesis tests published in 25 leading economics journals, we find that the extent of p-hacking and publication bias varies greatly by method. IV (and to a lesser extent DID) are particularly problematic. We find no evidence that (i) papers published in the Top 5 journals are different to others; (ii) the journal "revise and resubmit" process mitigates the problem; (iii) things are improving through time.
Journal of Environmental Economics and Management, May 2020
(with Anthony Heyes)
We present first evidence that outdoor cold temperatures negatively impact indoor cognitive performance. We use a within-subject design and a large-scale dataset of adults in an incentivized setting. The performance decrement is large despite the subjects working in a fully climate-controlled environment. Using secondary data, we find evidence of partial adaptation at the organizational, individual and biological levels. The results are interpreted in the context of climate models that observe and predict an increase in the frequency of very cold days in some locations (e.g. Chicago) and a decrease in others (e.g. Beijing).
AEA Papers & Proceedings, May 2020
(with Abel Brodeur and Anthony Heyes)
We propose a specification check for p-hacking. More specifically, we advocate the reporting of t-curves and mu-curves (the t-statistics and estimated effect sizes derived from regressions using every possible combination of control variables from the researcher's set) and introduce a standardized and accessible implementation. Our specification check allows researchers, referees, and editors to visually inspect variation in effect sizes, statistical significance, and sensitivity to the inclusion of control variables. We provide a Stata command that implements the specification check. Given the growing interest in estimating causal effects, the potential applicability of this specification check to empirical studies is large.
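As a rough illustration of the idea of running one regression per combination of controls, here is a minimal Python sketch. It is not the Stata command from the paper: the function names (`ols_coef_and_t`, `specification_curve`), the classical homoskedastic standard errors, and the synthetic data are all assumptions made for this example.

```python
from itertools import combinations

import numpy as np


def ols_coef_and_t(y, X, col=1):
    """Return (coefficient, t-statistic) for column `col` of X.

    Assumes classical OLS with homoskedastic errors (an illustrative choice).
    """
    n, k = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (n - k)          # residual variance
    cov = sigma2 * np.linalg.inv(X.T @ X)     # coefficient covariance matrix
    return beta[col], beta[col] / np.sqrt(cov[col, col])


def specification_curve(y, treatment, controls):
    """Regress y on the treatment plus every subset of control columns.

    Returns a list of (subset, coefficient, t-statistic) tuples, one per
    subset, including the empty subset (no controls).
    """
    n, p = controls.shape
    results = []
    for size in range(p + 1):
        for subset in combinations(range(p), size):
            X = np.column_stack(
                [np.ones(n), treatment, controls[:, list(subset)]]
            )
            coef, t_stat = ols_coef_and_t(y, X, col=1)
            results.append((subset, coef, t_stat))
    return results


# Hypothetical synthetic data: the first control confounds the treatment.
rng = np.random.default_rng(0)
n = 500
controls = rng.normal(size=(n, 3))
treatment = rng.normal(size=n) + 0.5 * controls[:, 0]
y = 2.0 * treatment + controls[:, 0] + rng.normal(size=n)

curve = specification_curve(y, treatment, controls)
```

With three candidate controls this yields 2^3 = 8 specifications. Sorting the resulting coefficients or t-statistics and plotting them against their rank would give curves in the spirit of the ones described above.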
Social Studies of Science, October 2016
(with James Williams)
One of the lasting legacies of the financial crisis of 2008, and the legislative energies that followed from it, is the growing reliance on econometrics as part of the rulemaking process. Financial regulators are increasingly expected to rationalize proposed rules using available econometric techniques, and the courts have vacated several key rules emanating from Dodd-Frank on the grounds of alleged deficiencies in this evidentiary effort. The turn toward such econometric tools is seen as a significant constraint on and challenge to regulators as they endeavor to engage with such essential policy questions as the impact of financial speculation on food security. Yet, outside of the specialized practitioner community, very little is known about these techniques. This article examines one such econometric test, Granger causality, and its role in a pivotal Dodd-Frank rulemaking. Through an examination of the test for Granger causality and its attempts to distill the causal connections between financial speculation and commodities prices, the article argues that econometrics is a blunt but useful tool, limited in its ability to provide decisive insights into commodities markets and yet yielding useful returns for those who are able to wield it.