Causal Inference in the Age of Big Data

Abstract:
The rise of massive datasets that provide fine-grained information about human beings and their behavior offers unprecedented opportunities for evaluating the effectiveness of social, behavioral, and medical treatments. With the availability of fine-grained data, researchers and policymakers are increasingly unsatisfied with estimates of average treatment effects based on experimental samples that are unrepresentative of populations of interest. Instead, they seek to target treatments to particular populations and subgroups. Because of these inferential challenges, Machine Learning (ML) is now being used for evaluating and predicting the effectiveness of interventions in a wide range of domains from technology firms to clinical medicine and election campaigns. However, there are a number of issues that arise with the use of ML for causal inference. For example, although ML and related statistical models are good for prediction, they are not designed to estimate causal effects. Instead, they focus on predicting observed outcomes. In this talk, a number of meta-algorithms are presented that can take advantage of any supervised learning method to estimate the Conditional Average Treatment Effect function. Also discussed are new theoretical results on confidence intervals and overlap in high-dimensional covariates and a new algorithm for optimal linear aggregation functions for tree-based estimators.

Bio:
Jasjeet Sekhon is the Eugene Meyer Professor of Statistics and Data Science and Professor of Political Science at Yale University. He is also the Head of Advanced Data Science at Bridgewater Associates. He has conducted research on causal inference, machine learning, and experimental design, and has worked on applications across the social sciences, including political science, economics, and epidemiology. His current research focuses on developing interpretable and credible machine learning methods for estimating causal relationships. Before Yale, he was the Robson Professor of Statistics and Political Science at UC Berkeley. He also has extensive industry experience working with both technology and finance firms.

Summary: