Machine Learning Methods for Causal Inference in Clustered Observational Studies

Hyunseung Kang, University of Wisconsin-Madison


Abstract:

There is growing interest in using machine learning (ML) methods for causal inference because they can model key quantities, such as the propensity score or the outcome model, flexibly and (nearly) automatically. Unfortunately, most ML methods for causal inference have been studied in single-level, "i.i.d." settings where all individuals are independent of each other. There is little work on using these methods with clustered, nested, or hierarchical data, a common setting in the social sciences. This talk investigates using existing ML methods to estimate treatment effects in multilevel observational data. We present (1) simple re-tuning/re-fitting strategies that let existing ML methods, initially designed for i.i.d. data, handle clustering and (2) new ways to use ML methods to deal with cluster-level unmeasured confounders. We demonstrate our methods in simulation studies and large-scale education assessment studies. This is joint work with Youmi Suk (UW-Madison).


Bio:

Hyunseung Kang (pronounced Hun-Sung) is an assistant professor in the Department of Statistics at the University of Wisconsin-Madison. He completed his postdoc in Economics at the Stanford Graduate School of Business and received his Ph.D. in Statistics from the Wharton School of Business of the University of Pennsylvania. His research focuses on developing theory and methods to analyze causal relationships in large observational studies and networks by leveraging instrumental variables, econometrics, and nonparametric methods.