The Neglected Assumptions in Causal Inference

Workshop @ ICML 2021 - fully virtual - July 23

Virtual Workshop Site (Requires ICML Workshop Registration)

Overview

As causality enjoys increasing attention in various areas of machine learning, this workshop turns the spotlight on the assumptions behind the successful application of causal inference techniques. It is well known that answering causal queries from observational data requires strong and often untestable assumptions.

On the theoretical side, a whole host of settings has been established in which causal effects are identifiable and consistently estimable under a set of assumptions now considered "standard". While these assumptions can be reasonable in specific scenarios, they were often motivated, at least in part, by the need to render estimation theoretically feasible. Unfortunately, in applications they are frequently taken for granted: terms like the stable unit treatment value assumption, ignorability, no interference, faithfulness, positivity, overlap, no unobserved confounding, additive noise, or linear structural equations are sprinkled throughout papers without further scrutiny.
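For concreteness, two of the assumptions listed above can be stated in standard potential-outcomes notation (these are the textbook formulations, not specific to any one paper at the workshop):

```latex
\[
\bigl(Y(0),\, Y(1)\bigr) \;\perp\!\!\!\perp\; T \mid X
\qquad \text{(ignorability / no unobserved confounding)}
\]
\[
0 < P(T = 1 \mid X = x) < 1 \;\;\text{for all } x \text{ in the support of } X
\qquad \text{(positivity / overlap)}
\]
```

Ignorability says treatment assignment carries no information about the potential outcomes once covariates are accounted for; positivity says every covariate profile has a nonzero chance of receiving either treatment. Neither can be verified from observational data alone.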

This situation may lead practitioners either to believe that well-founded causal inference is altogether unattainable, or to trust that off-the-shelf methods deliver reliable causal estimates in virtually any situation. Similarly, as ideas from causality are increasingly picked up by researchers in deep, reinforcement, and meta-learning, there is a risk that the role of assumptions in causal inference gets lost in translation.

This workshop will bring together researchers and practitioners from across the disciplines to discuss and debate the often neglected assumptions behind causal inference. The one-day workshop will feature invited talks, contributed talks, a poster session, and a panel discussion. Contributions of a theoretical, methodological, and applied nature are encouraged! All are welcome to attend!

Call for papers

Invited Talks

Noa Dagan & Noam Barda
Harvard Medical School

Title: Causal Inference to Decipher COVID-19 Vaccine Effectiveness in a Real-World Setting
Abstract: Following its emergency use authorization and the initiation of large-scale vaccination campaigns, there was a need to evaluate the real-world effectiveness of the Pfizer BNT162b2 mRNA COVID-19 vaccine. To do so, we made use of observational data from Israel's largest health care organization. Ultimately, we estimated that the vaccine is 94% effective in preventing symptomatic disease and 92% effective in preventing severe disease. To arrive at this estimate, we had to tackle the identifiability assumptions of causal inference and their violations as they unfolded over a period of several weeks in Israel. This talk will discuss the study and the design used to perform the inference.

Frederick Eberhardt
Caltech

Title: Practical Hurdles in Causal Structure Discovery
Abstract: There is a significant disconnect between the sophistication of current causal discovery methods and their application to actual data. While there is now a very thorough theoretical understanding of what can and cannot be discovered from observational data given a broad variety of different background assumptions, the application of causal structure discovery methods often suffers from much more elementary and mundane hurdles. Using the example of causal structure discovery in neuroscientific data, I aim to highlight the need for much more "basic causal science" if we want to achieve any sort of real breakthrough in the broad application of causal discovery methods to statistical data.

Lihua Lei & Avi Feller
Stanford and UC Berkeley

Title: Distribution-Free Assessment of Population Overlap in Observational Studies
Abstract: Overlap in baseline covariates between treated and control groups, also known as positivity or common support, is a common assumption in observational causal inference. Assessing this assumption is often ad hoc, however, and can give misleading results. For example, the common practice of examining the empirical distribution of estimated propensity scores is heavily dependent on model specification and has poor uncertainty quantification. In this paper, we propose a formal statistical framework for assessing the extrema of the population propensity score; e.g., the propensity score lies in [0.1, 0.9] almost surely. We develop a family of upper confidence bounds, which we term O-values, for this quantity. We show these bounds are valid in finite samples so long as the observations are independent and identically distributed, without requiring any further modeling assumptions on the data generating process. We also use extensive simulations to show that these bounds are reasonably tight in practice. Finally, we demonstrate this approach using several benchmark observational studies, showing how to build our proposed method into the observational causal inference workflow.
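As context for this talk, the "common practice" the abstract critiques can be illustrated with a minimal sketch. This is not the O-value method the paper proposes; it is the ad hoc diagnostic of inspecting the range of estimated propensity scores, on synthetic data, with all names and data purely illustrative.

```python
# Minimal sketch (illustrative only) of the ad hoc overlap check the
# abstract critiques: fit a propensity model, then eyeball whether the
# estimated scores stay away from 0 and 1 in both treatment groups.
import numpy as np

rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=(n, 2))                      # baseline covariates
logits = 0.8 * X[:, 0] - 0.5 * X[:, 1]
t = rng.binomial(1, 1 / (1 + np.exp(-logits)))   # treatment assignment

# Fit a logistic-regression propensity model by plain gradient ascent
# on the log-likelihood (gradient is X^T (t - p)).
Xb = np.column_stack([np.ones(n), X])
w = np.zeros(3)
for _ in range(500):
    p = 1 / (1 + np.exp(-Xb @ w))
    w += 0.1 * Xb.T @ (t - p) / n

e_hat = 1 / (1 + np.exp(-Xb @ w))                # estimated propensity scores

# Ad hoc diagnostic: ranges of estimated scores by treatment group.
print("treated range:", e_hat[t == 1].min().round(3), e_hat[t == 1].max().round(3))
print("control range:", e_hat[t == 0].min().round(3), e_hat[t == 0].max().round(3))
```

As the abstract notes, this diagnostic depends heavily on the specification of the propensity model and offers no uncertainty quantification, which is precisely the gap the proposed distribution-free bounds address.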

Daniel Malinsky
Columbia

Title: Causality, Interference, and Network Learning
Abstract: In some settings, especially in the study of infectious diseases and social-behavioral interventions, phenomena related to interference are of central importance. The basic issue is that the effects of interventions or causal determinants may "spread" or "spill-over" between units, but there are many distinct mechanisms that may give rise to interference generally. Typically, there is only limited information about social networks and social dynamics available, and most statistical techniques that accommodate interference make substantive assumptions about both which units are connected and how they may interact. We present an approach that combines techniques from causal graphical structure learning (causal discovery) and causal inference under interference, using the data to distinguish between possible interference models. Our approach is grounded in score-based model selection of chain graphs in a "partial interference" setting. We also discuss open problems and some possible approaches that might be explored further. This is based on joint work with Rohit Bhattacharya and Ilya Shpitser.

Lina Montoya
UNC Chapel Hill

Title: Optimal Dynamic Treatment Rule Estimation and Evaluation with Application to Criminal Justice Interventions in the United States
Abstract: The optimal dynamic treatment rule (ODTR) framework offers an approach for understanding which kinds of individuals respond best to specific interventions. Recently, there has been a proliferation of methods for estimating the ODTR. One such method extends the SuperLearner algorithm – an ensemble method, used extensively in prediction problems, that optimally combines candidate algorithms – to ODTRs. Following the "Causal Roadmap", in this talk we causally and statistically define the ODTR and different parameters for evaluating it. We show how to estimate the ODTR with SuperLearner and evaluate it using cross-validated targeted maximum likelihood estimation. We apply the ODTR SuperLearner to the "Interventions" study, a randomized trial currently underway aimed at reducing recidivism among justice-involved adults with mental illness in the United States. Specifically, we show preliminary results for the ODTR SuperLearner applied to these data, which aims to learn for whom Cognitive Behavioral Therapy (CBT) works best to reduce recidivism, relative to Treatment As Usual (TAU; psychiatric services). This is joint work with Drs. Maya Petersen, Mark van der Laan, and Jennifer Skeem.

Margarita Moreno-Betancur
University of Melbourne & Murdoch Children's Research Institute

Title: Causal machine learning methods for high-dimensional mediation analysis: Application to a randomised trial of tuberculosis vaccination
Abstract: Statistical methods for causal mediation analysis are useful for understanding the pathways by which a treatment or exposure impacts health outcomes. While there have been many methodological developments in the past decades, there is still a scarcity of feasible, flexible, data-adaptive methods for mediation analysis with high-dimensional mediators (e.g., biomarkers) and confounders. Existing methods necessitate modelling the distribution of the mediators, which quickly becomes infeasible when mediators are high-dimensional. To avoid such high-dimensional modelling, we propose causal machine learning methods for estimating the indirect effect of a randomised treatment that acts via a pathway represented by a high-dimensional set of measurements. The proposed methods are doubly robust, enabling (uniformly) valid statistical inference when machine learning algorithms are used for the two required models. This work was motivated by the Melbourne Infant Study: BCG for Allergy and Infection Reduction (MIS BAIR), a randomised controlled trial investigating the effect of neonatal Bacillus Calmette–Guérin (BCG) (tuberculosis) vaccination on allergy and infection outcomes in the first years of life. The hypothesis of the trial was that the heterologous effects of BCG on innate immunity benefit the developing immune system, resulting in improved outcomes. We illustrate the methods in the investigation of this hypothesis, where immune pathways are represented by a high-dimensional vector of cytokine responses under various stimulants. We also study the performance of the methods in an extensive simulation study closely based on this example, providing an empirical evaluation in a realistic setting.


Panel

Organizers

Laura Balzer
UMass Amherst

Alexander D'Amour
Google Brain

Lily Hu
Harvard

Niki Kilbertus
Helmholtz AI

Razieh Nabi
Emory University

Uri Shalit
Technion

Title photo by Alfonso Ninguno on Unsplash. Page logo created by Victoruler from Noun Project.