Por Que No Los Dos

In a previous blog post I used the made-up data provided in Scott Cunningham's excellent book, Causal Inference: The Mixtape. The post goes through what happens when you have data from RCTs and what happens when you have observational data. Both are standard questions with standard answers. But it gets one thinking: why not both?

Since Rubin (1974) we have been arguing about which is better, RCT data or observational data. Outside of economics, it is argued that RCT data is the gold standard. Economists snarkily point out that the gold standard is not even the gold standard.

It has always been suggested that these are substitutes. But, as my son taught me, "Por que no los dos?"

Nobel Laureates Go Toe To Toe

The Nobel committee awarded Duflo, Banerjee and Kremer the prize for their work on using RCTs in analyzing anti-poverty policies in the developing world. Separately, the committee awarded Deaton the prize for his work, although not for his efforts explicitly critiquing the use of RCTs in analyzing anti-poverty policies in the developing world.

The Equation That Destroyed The World

Wired magazine article: https://www.wired.com/2009/02/wp-quant/

The copula is blamed for creating the financial crisis, and it may well have. But it is just a formula, one that relates joint probabilities to marginal probabilities. If we know the marginal probabilities and we know the copula, then we know the joint distribution. This formula holds the key for thinking about linking RCT data with observational data.
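In symbols, the relationship described above is usually stated as Sklar's theorem. The notation here is my own choice for illustration: Y_0 and Y_1 are the two potential outcomes, F_0 and F_1 are their marginal distribution functions, H is the joint distribution function, and C is the copula.

```latex
H(y_0, y_1) = \Pr(Y_0 \le y_0,\; Y_1 \le y_1) = C\bigl(F_0(y_0),\, F_1(y_1)\bigr)
```

Read this way, F_0 and F_1 are the marginal pieces and C is the piece that ties them together into a joint distribution.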

RCT data can provide us with the marginal probabilities. From each arm we can learn the probability of surviving a certain length of time given a particular cancer treatment. As discussed in the previous blog post, we can learn something about the joint probabilities from observational data. To do this we need to have both observational data and a willingness to make an assumption about how individuals are selected into the treatment. Scott assumes that there is an omniscient doctor who allocates each patient to the treatment that gives the greatest longevity.

Improving on Frechet-Hoeffding Bounds

Can combining information from RCTs and observational data actually help? Yes. A nice paper in the Annals of Applied Probability by Lux and Papapantoleon shows that the combination tightens the Frechet-Hoeffding bounds.

Consider the probability that any patient passes prior to 4 years after treatment. This is a joint probability. It is the probability of not surviving past 4 years on either chemo or surgery. The F-H bounds on this joint probability can be determined using the table on potential outcomes presented in Scott's book. The probability of not surviving past 4 years on chemo is 0.4, while it is 0.3 for surgery. The F-H bounds are then [max(-1 + 0.4 + 0.3, 0), min(0.3, 0.4)] = [0, 0.3].
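The bound calculation above can be sketched in a few lines of Python. This is only an illustration: the function name frechet_hoeffding_bounds is hypothetical, and the two inputs are the marginal probabilities quoted in the text (0.4 for chemo, 0.3 for surgery).

```python
def frechet_hoeffding_bounds(p_a: float, p_b: float) -> tuple[float, float]:
    """Return (lower, upper) bounds on Pr(A and B) given only Pr(A) and Pr(B)."""
    if not (0.0 <= p_a <= 1.0 and 0.0 <= p_b <= 1.0):
        raise ValueError("marginal probabilities must lie in [0, 1]")
    lower = max(p_a + p_b - 1.0, 0.0)  # the two events cannot overlap by less than this
    upper = min(p_a, p_b)              # the overlap cannot exceed the rarer event
    return lower, upper


# Marginal probabilities of not surviving past 4 years, as quoted in the text.
p_chemo = 0.4
p_surgery = 0.3

lower, upper = frechet_hoeffding_bounds(p_chemo, p_surgery)
print(lower, upper)  # prints 0.0 0.3
```

With marginals this small the lower bound is uninformative, which is why any extra information about the joint distribution is valuable.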

Can we do better than this with the observational data? Yes. There is one patient who is assigned surgery but lives 4 years, so Pr(3, 4) = 0.1. In addition, there is another patient who lives 5 years after being assigned surgery, so Pr(4, 5) = 0.2. Because a joint distribution function is non-decreasing in each argument, it must be that Pr(3, 4) <= Pr(4, 4) <= Pr(4, 5). So our bounds are [0.1, 0.2], which is a strict subset of [0, 0.3]. Now this doesn't seem that interesting until you realize that we know that Pr(Y_0 < 4) = 0.4, which together with the upper bound of 0.2 tells us that at least 20% of patients live shorter than 4 years on chemo and LONGER than 4 years on surgery. Similarly, Pr(Y_1 < 4) = 0.3, so at least 10% of patients live shorter than 4 years on surgery and longer than 4 years on chemo.
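The last step is simple arithmetic: the share of patients who die early under one treatment only equals that treatment's marginal probability minus the joint probability, so an upper bound on the joint gives a lower bound on the "one treatment only" share. A minimal sketch, using the numbers from the text and a hypothetical helper name:

```python
def one_sided_lower_bounds(p_a: float, p_b: float, joint_upper: float) -> tuple[float, float]:
    """Lower bounds on Pr(A and not B) and Pr(B and not A).

    Uses Pr(A and not B) = Pr(A) - Pr(A and B) >= Pr(A) - joint_upper.
    """
    only_a = max(p_a - joint_upper, 0.0)
    only_b = max(p_b - joint_upper, 0.0)
    return only_a, only_b


# Numbers quoted in the text.
p_chemo = 0.4                 # Pr(Y_0 < 4)
p_surgery = 0.3               # Pr(Y_1 < 4)
fh_interval = (0.0, 0.3)      # Frechet-Hoeffding bounds on the joint probability
tightened = (0.1, 0.2)        # bounds after adding the observational data

# The tightened interval sits strictly inside the F-H interval.
assert fh_interval[0] < tightened[0] and tightened[1] < fh_interval[1]

chemo_only, surgery_only = one_sided_lower_bounds(p_chemo, p_surgery, tightened[1])
print(round(chemo_only, 2), round(surgery_only, 2))  # prints 0.2 0.1
```

Using the F-H upper bound of 0.3 instead would give 0.1 and 0.0, so the tightening is what makes the statement about treatment-effect heterogeneity informative.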

We can argue about whether RCTs are better than observational data, or we can combine the information learned from each to learn more about the joint distribution of treatment effects. Por que no los dos?