Abstract: Policy makers typically face the problem of wanting to estimate the long-term effects of novel treatments while having only historical data on older treatment options. We assume access to a long-term dataset in which only past treatments were administered and a short-term dataset in which novel treatments have been administered. We propose a surrogate-based approach in which we assume that the long-term effect is channeled through a multitude of available short-term proxies. Our work combines three major recent techniques in the causal machine learning literature: surrogate indices, dynamic treatment effect estimation, and double machine learning, in a unified pipeline. We show that our method is consistent and provides root-n asymptotically normal estimates under a Markovian assumption on the data and the observational policy. We use a dataset from a major corporation, covering customer investments over a three-year period, to create a semi-synthetic data distribution that preserves the major qualitative properties of the real data. We evaluate the performance of our method and discuss practical challenges of deploying our formal methodology, along with how to address them.
Bios:
Greg Lewis: I am an economist who works on industrial organization, market design, applied econometrics and machine learning. My work is unified by the twin goals of making better sense of microeconomic data, and using those insights to optimize firm decision making and improve market performance. My research has spanned a range of industries – online retailing, online advertising, procurement, electricity, education – and has been published in top economics and management journals.
Vasilis Syrgkanis: I am an Assistant Professor in the Management Science and Engineering Department, in the School of Engineering at Stanford University. I am an active member of the Stanford Operations Research Group, the Statistical Machine Learning Group, the CS Theory Group and affiliated with the Stanford AI Lab (SAIL). My research interests are in the areas of machine learning, causal inference, econometrics, game theory/mechanism design and algorithm design.
Until August 2022, I was a Principal Researcher at Microsoft Research, New England, where I was also co-leading the project on Automated Learning and Intelligence for Causation and Economics (ALICE) and was a member of the EconCS and StatsML groups. I received my Ph.D. in Computer Science from Cornell University, where I had the privilege to be advised by Eva Tardos and then spent two years as a postdoc researcher at Microsoft Research, NYC, as part of the Algorithmic Economics and the Machine Learning groups. I obtained my diploma in EECS at the National Technical University of Athens, Greece.
Summary:
Focus of EconML: https://github.com/microsoft/EconML
Estimation of conditional treatment effects
Given customer attributes, predict the effect of a treatment on outcomes
A common infrastructure and interface for estimation libraries
Traditional process: the decision maker goes to an economist, who sets up the inference problem and then codes up the estimation algorithm
New process:
Decision maker goes to a data scientist
Data scientist
Sets up the problem using DoWhy (https://microsoft.github.io/dowhy) and EconML
Estimates the treatment effect using EconML (a sketch follows below)
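A minimal sketch of that two-step workflow on toy data (the column names, data-generating process, and model choices below are illustrative, not from the talk):

```python
# Minimal sketch of the DoWhy + EconML workflow on toy data
# (column names, models, and data-generating process are illustrative).
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor
from dowhy import CausalModel
from econml.dml import LinearDML

rng = np.random.default_rng(0)
n = 2000
W = rng.normal(size=(n, 2))                        # confounders
X = rng.normal(size=(n, 1))                        # effect modifiers
T = W[:, 0] + rng.normal(size=n)                   # treatment depends on confounders
Y = (1 + X[:, 0]) * T + W[:, 0] + rng.normal(size=n)
df = pd.DataFrame({"Y": Y, "T": T, "X0": X[:, 0], "W0": W[:, 0], "W1": W[:, 1]})

# Step 1: set up the causal problem and identify the estimand with DoWhy.
model = CausalModel(data=df, treatment="T", outcome="Y",
                    common_causes=["W0", "W1"], effect_modifiers=["X0"])
estimand = model.identify_effect(proceed_when_unidentifiable=True)

# Step 2: estimate heterogeneous treatment effects with EconML.
est = LinearDML(model_y=GradientBoostingRegressor(), model_t=GradientBoostingRegressor())
est.fit(df["Y"], df["T"], X=df[["X0"]], W=df[["W0", "W1"]])
print(est.ate(df[["X0"]]))         # average treatment effect
print(est.effect(df[["X0"]])[:5])  # per-unit effects for the first few customers
```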
Challenge:
Automating the estimation procedure and the construction of confidence intervals
Post-estimation analyses
Canonical causal treatment model
outcome_i = f(effect-modifying attributes of unit i) * treatment_i + g(other features of unit i) + noise_i
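In more standard notation (the symbol names here are my own labels, not from the talk), this is the partially linear CATE model:

```latex
% Canonical causal treatment model in standard notation (symbol names are my own):
% Y_i = outcome, T_i = treatment, X_i = effect-modifying attributes,
% W_i = other features of unit i, \varepsilon_i = noise.
\[
  Y_i \;=\; \theta(X_i)\, T_i \;+\; g(W_i) \;+\; \varepsilon_i,
  \qquad \mathbb{E}\!\left[\varepsilon_i \mid X_i, W_i, T_i\right] = 0,
\]
% where \theta(x) is the conditional average treatment effect (CATE) for units with X_i = x.
```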
One application of EconML
Training a model for individual treatment effects indicates which attributes of units matter most for different sub-groups
E.g. train a tree whose branches correspond to different sub-groups (see the sketch at the end of this list)
Can’t tell us which data are missing (e.g. whether some unobserved attributes are affecting the treatment)
If we can assume all confounders are observed: good, control for them all
Make sure that if the confounders’ impact is non-linear, it’s modeled using a suitably flexible function
EconML can help with this since it includes many non-linear estimators
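A short sketch of the sub-group tree idea, continuing from the workflow sketch above (it assumes the fitted `est` and dataframe `df` from that sketch; the depth and leaf-size settings are illustrative):

```python
# Summarize heterogeneous effects with a shallow decision tree over sub-groups.
# Assumes `est` and `df` from the DoWhy + EconML sketch above; settings are illustrative.
from econml.cate_interpreter import SingleTreeCateInterpreter

intrp = SingleTreeCateInterpreter(max_depth=2, min_samples_leaf=50)
intrp.interpret(est, df[["X0"]])   # fit a shallow tree to the estimated per-unit effects
intrp.plot(feature_names=["X0"])   # each branch/leaf is a sub-group with its own effect estimate
```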
EconML includes various algorithms for causal treatment effect estimation, such as:
Double ML
Project treatments and outcomes onto confounders
Then work with the residuals of those models
i.e. train models to predict the treatment and the outcome from the confounders, then focus on the deviations of treatment/outcome from those predictions (see the sketch after this list)
Instrumental Variables
Randomly assigned
Affect treatment
Do not affect the outcome except through the treatment
Used to de-bias effect estimates when treatments are assigned in a biased (confounded) way
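A bare-bones sketch of the residualization idea behind Double ML on toy data (it skips the cross-fitting and inference machinery that a careful implementation needs):

```python
# Bare-bones sketch of the Double ML residualization idea on toy data
# (skips cross-fitting and inference; EconML's DML estimators handle those internally).
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)
n = 5000
W = rng.normal(size=(n, 3))                                        # confounders
T = W @ np.array([1.0, 0.5, 0.0]) + rng.normal(size=n)             # treatment assigned based on confounders
Y = 2.0 * T + W @ np.array([1.0, 0.0, 1.0]) + rng.normal(size=n)   # true effect is 2

# 1) Project treatment and outcome onto the confounders.
t_hat = GradientBoostingRegressor().fit(W, T).predict(W)
y_hat = GradientBoostingRegressor().fit(W, Y).predict(W)

# 2) Work with the residuals: regress outcome residuals on treatment residuals.
theta = LinearRegression(fit_intercept=False).fit((T - t_hat).reshape(-1, 1), Y - y_hat).coef_[0]
print(theta)  # should land near the true effect of 2
```

EconML's DML classes (e.g. LinearDML, CausalForestDML) wrap this recipe with cross-fitting and confidence intervals, and EconML also ships IV-based estimators for the instrumental-variables setting above.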
Recommended flow:
Use linear regression or other interpretable models as an initial step to help with debugging
Then train more flexible models to make sure you capture the system’s dynamics
Finally, post-process the model’s predictions using interpretable models to communicate with stakeholders (sketched below)
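A sketch of that flow on toy data (the specific estimators and data are illustrative; the final interpretable summary can reuse the tree interpreter sketched earlier in these notes):

```python
# Sketch of the recommended flow on toy data (estimator choices are illustrative).
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from econml.dml import LinearDML, CausalForestDML

rng = np.random.default_rng(2)
n = 3000
X = rng.normal(size=(n, 1))                    # effect modifiers
W = rng.normal(size=(n, 2))                    # confounders
T = W[:, 0] + rng.normal(size=n)
Y = (1 + 0.5 * X[:, 0]) * T + W[:, 0] + rng.normal(size=n)

# 1) Start interpretable: a linear CATE model is easy to sanity-check and debug.
lin = LinearDML(model_y=GradientBoostingRegressor(), model_t=GradientBoostingRegressor())
lin.fit(Y, T, X=X, W=W)
print(lin.ate(X), lin.ate_interval(X))         # point estimate and confidence interval

# 2) Then go flexible, to capture non-linear effect heterogeneity.
cf = CausalForestDML(model_y=GradientBoostingRegressor(), model_t=GradientBoostingRegressor())
cf.fit(Y, T, X=X, W=W)
print(cf.effect(X[:5]))                        # flexible per-unit effect estimates

# 3) Finally, post-process the flexible model's predictions with an interpretable model,
#    e.g. the SingleTreeCateInterpreter sketched earlier in these notes.
```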
Use-case:
Estimation of long-term effects via surrogates
Find some short-term surrogates that indicate long-term behavior
E.g. use historical data to see how much (and which) short-term history is required to make long-term forecasts; that history is the surrogate
Challenge: the history includes many different interventions, which don’t correspond to the dynamics and treatments we care about
To handle this, they explicitly model how the treatment -> surrogate -> outcome process evolves over time
So they want to isolate the effect of a single treatment, netting out the effects of subsequent treatments, which may themselves depend on the current treatment
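A much-simplified sketch of the surrogate-index idea from this use-case (it ignores the dynamic-treatment complication discussed above and the double-ML corrections from the paper; all data and names here are synthetic and illustrative):

```python
# Simplified surrogate-index sketch: learn to predict the long-term outcome from
# short-term surrogates on historical data, then use that index to evaluate a new,
# randomized treatment observed only over the short term. Data is synthetic/illustrative.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(3)

# Historical dataset: short-term surrogates S_hist and the observed long-term outcome Y_long.
n_hist = 5000
S_hist = rng.normal(size=(n_hist, 3))
Y_long = S_hist @ np.array([1.0, 0.5, 0.25]) + rng.normal(size=n_hist)

# 1) Learn the surrogate index: predict the long-term outcome from the short-term surrogates.
index_model = GradientBoostingRegressor().fit(S_hist, Y_long)

# Short-term experimental dataset: novel treatment T_new is randomized; only surrogates are observed.
n_exp = 2000
T_new = rng.integers(0, 2, size=n_exp)
S_exp = rng.normal(size=(n_exp, 3)) + 0.3 * T_new[:, None]   # treatment shifts the surrogates

# 2) Impute the long-term outcome via the index and estimate the treatment effect on it.
y_imputed = index_model.predict(S_exp)
long_term_effect = y_imputed[T_new == 1].mean() - y_imputed[T_new == 0].mean()
print(long_term_effect)
```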