Scientists and policymakers evaluating a causal claim in a population are rarely interested exclusively in the specific populations they study. Rather, the attention devoted to a given subject population is justified by the hope that test results can be generalized to distinct populations. One looks at the effects of an educational policy in one school district with the hope of learning about its effectiveness in others. Drug testers evaluate the members of a trial population on the assumption that they are similar to future consumers of the drug. The existing literature on causal inference focuses on establishing the existence of causal relations in single populations and largely neglects the question of how one can extrapolate a causal claim from a study population to a target population. This dissertation aims to fill this gap in the literature.
Consider an educational intervention intended to raise SAT scores by reducing class size. The intervention is implemented in one city’s school system and it raises the average score. Will the effect of the intervention on scores generalize to other cities? This depends on how the cities differ. I begin by distinguishing between two ways that an observed population may differ from unobserved populations.
First, cities may differ in factors that influence class size. Provided that these factors only influence class size, such variation is unproblematic. This is because the causal relation between two variables C and E is not sensitive to changes that only influence C. This is a standard assumption of causal modeling techniques and I argue it can be independently motivated.
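The invariance claim above can be illustrated with a small simulation. This is a minimal sketch under assumed conditions, not the dissertation's own model: the linear equations, the factor Z, the coefficient of -2.0, and the names `simulate_city` and `ols_slope` are all hypothetical. Two simulated cities differ only in how strongly Z influences class size, and the estimated relation between class size and scores comes out roughly the same in both.

```python
import random


def simulate_city(policy_strength, n=20000, seed=0):
    """Simulate class sizes and SAT scores for one hypothetical city.

    Z is a factor (say, a local funding rule) that influences class size only.
    policy_strength controls how strongly Z moves class size in this city.
    """
    rng = random.Random(seed)
    effect_of_class_size = -2.0  # assumed: points lost per additional student
    sizes, scores = [], []
    for _ in range(n):
        z = rng.gauss(0, 1)
        size = 25 + policy_strength * z + rng.gauss(0, 1)
        score = 600 + effect_of_class_size * size + rng.gauss(0, 5)
        sizes.append(size)
        scores.append(score)
    return sizes, scores


def ols_slope(xs, ys):
    """Least-squares slope of ys on xs."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var


# Two cities that differ only in how strongly Z drives class size.
slope_city_a = ols_slope(*simulate_city(policy_strength=1.0, seed=1))
slope_city_b = ols_slope(*simulate_city(policy_strength=4.0, seed=2))
print(round(slope_city_a, 2), round(slope_city_b, 2))  # both near the assumed -2.0
```

The distribution of class sizes differs sharply between the two cities, yet the slope relating class size to scores does not, because Z enters the model only through class size.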
Second, cities may differ in background factors that influence SAT scores. For example, cities may differ in levels of parental education. This problem is especially difficult since there can be many such background factors, any of which can influence the effect of class size on SAT scores. I consider the three best-developed accounts of extrapolation (those of Nancy Cartwright and Jeremy Hardie, Daniel Steel, and Judea Pearl and Elias Bareinboim) and show that none of them resolves this problem.
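To see why such background variation is troublesome, consider a toy case in which parental education modifies the effect of class size. The sketch below is purely illustrative: the function `average_effect`, the coefficients, and the two city profiles are hypothetical assumptions, not results from the dissertation.

```python
def average_effect(parental_education_levels, base_effect=-1.0, interaction=-0.5):
    """Average effect of one extra student per class on SAT scores.

    Assumed model: for a student whose parental education is p, the effect
    of class size is base_effect + interaction * p.
    """
    effects = [base_effect + interaction * p for p in parental_education_levels]
    return sum(effects) / len(effects)


# Hypothetical populations: 1 = parent holds a degree, 0 = parent does not.
study_city = [0, 0, 1, 1]
target_city = [1, 1, 1, 1]

print(average_effect(study_city))   # -1.25
print(average_effect(target_city))  # -1.5
```

The same intervention has a different average effect in the two cities solely because the background factor is distributed differently, so the study city's estimate cannot simply be carried over.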
I present two strategies for addressing variation in background factors. First, I argue that prior accounts have been limited by the assumption that extrapolation is a problem of deductive inference. That is, they consider whether premises about how populations differ entail that a (type-level) causal relation obtains in the unobserved population. I argue that any approach to extrapolation must also allow for inductive inferences, which indicate when discovering a causal relation in one population counts as evidence for its existence in other populations. I show how these inferences can be represented within existing causal modeling frameworks.
My second strategy for addressing variation in background factors is to measure variables that are causally intermediate between class size and SAT scores. These variables are called mediators. Perhaps reduced class size increases SAT scores in part by causing students to spend more time on homework. If so, then the intervention on class size might not work as well in cities where students devote their after-school time to sporting events rather than homework. Measuring a mediator allows one to distinguish between the variation due to factors influencing the mediator and the variation due to factors influencing the effect variable (SAT scores). Measuring mediators therefore enables one to make more reliable cross-population predictions.
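One way to picture the role of a mediator is a linear path model in which the total effect is the direct effect plus the product of the two coefficients along the mediated path. The sketch below assumes such a model; the function `total_effect`, all coefficient values, and the supposition that the homework-to-score and direct coefficients carry over to the target city are hypothetical and are not claimed to be the dissertation's method.

```python
def total_effect(size_to_homework, homework_to_score, direct_effect):
    """Total effect of class size on SAT score in an assumed linear model:
    the direct path plus the path running through hours on homework."""
    return direct_effect + size_to_homework * homework_to_score


# Hypothetical coefficients estimated in the study city.
study = {
    "size_to_homework": -0.5,   # hours of homework lost per extra student
    "homework_to_score": 4.0,   # points gained per hour of homework
    "direct_effect": -1.0,      # points lost per extra student via other paths
}
print(total_effect(**study))  # -3.0

# In the target city only the first link is measured. It is weaker there
# because after-school time goes mostly to sports (hypothetical value).
target_size_to_homework = -0.25
predicted_target_effect = total_effect(
    target_size_to_homework,
    study["homework_to_score"],   # assumed to carry over from the study city
    study["direct_effect"],       # assumed to carry over from the study city
)
print(predicted_target_effect)  # -2.0
```

Measuring the mediator localizes the cross-city difference to the link from class size to homework, so the prediction for the target city changes only in that term.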
In the process of providing an account of how extrapolation inferences can be justified, I also address several topics of concern to philosophers and scientists. In one chapter, I show how the thesis that the effect of C on E is invariant to changes in C has important consequences for a debate about whether the psychological variable of intelligence counts as a cause. In another, I challenge a presupposition of recent mechanistic theories of explanation. According to these theories, mechanistic phenomena in biology and neuroscience cannot be fully explained using only causal relations. I invoke recently developed techniques for measuring the contributions of mediators to an effect in order to argue against various reasons for thinking that such phenomena call for non-causal explanation. Moreover, unlike existing theories of mechanisms, these techniques license quantitative predictions about whether a mechanism will continue to function across contexts.
The question of when a causal relationship may be extrapolated across populations is of immense practical importance. This question cannot be answered without considering foundational philosophical questions about causal explanation and causal inference. In this dissertation, I provide a precise characterization of the challenges surrounding extrapolation and present novel strategies for addressing them.
Figure: An example of a causal relation that may differ among populations.
Figure: Populations may vary in background factors.
Figure: My second strategy. By measuring mediators, one can distinguish between the variation due to factors that influence the mediator and the variation due to factors that influence the effect variable.
N.B.: The direct arrow from "Class Size" to "SAT scores" indicates that "Hours on Homework" is not the only mediator between these variables.