Tuesday, January 9, 2024
- Speaker: Elizabeth Tipton (Northwestern University)
- Discussant: Andrew Gelman (Columbia University)
- Title: Designing Randomized Trials to Predict Treatment Effects
- Abstract: Typically, a randomized experiment is designed to test a hypothesis about the average treatment effect and sometimes hypotheses about treatment effect variation. The results of such a study may then be used to inform policy and practice for units not in the study. In this paper, we argue that, given this use, randomized experiments should instead be designed to predict unit-specific treatment effects in a well-defined population. We then consider how different sampling processes and models affect the bias, variance, and mean squared prediction error of these predictions. To do so, we derive formulas, similar to those in a power analysis, based on parametric models. The results indicate, for example, that problems of generalizability (differences between samples and populations) can greatly affect bias, both in predictive models and in measures of error in these models. We also examine when the average treatment effect estimate outperforms unit-specific treatment effect predictive models, and the implications of this for planning studies.
[Video] [Slides]
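A toy numerical sketch of the abstract's central comparison (the data-generating process, effect model, and all numbers below are hypothetical illustrations, not from the paper): when treatment effects vary with a covariate and the target population is shifted relative to the trial sample, predicting every unit's effect with the ATE can incur much larger mean squared prediction error than a unit-specific model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data-generating process: unit-specific effect
# tau(x) = 1 + 2*x, outcome noise sd = 3.
n, N = 200, 10_000
x = rng.normal(0, 1, n)                  # trial sample covariate
t = rng.integers(0, 2, n)                # randomized assignment
tau = 1 + 2 * x
y = 0.5 * x + t * tau + rng.normal(0, 3, n)

# Model 1: constant ATE; Model 2: treatment-by-covariate interaction.
X_ate = np.column_stack([np.ones(n), t])
X_int = np.column_stack([np.ones(n), x, t, t * x])
b_ate, *_ = np.linalg.lstsq(X_ate, y, rcond=None)
b_int, *_ = np.linalg.lstsq(X_int, y, rcond=None)

# Target population, shifted relative to the trial sample.
x_pop = rng.normal(0.5, 1, N)
tau_pop = 1 + 2 * x_pop
mse_ate = np.mean((b_ate[1] - tau_pop) ** 2)             # one number for all units
mse_int = np.mean((b_int[2] + b_int[3] * x_pop - tau_pop) ** 2)
print(f"MSE, ATE as prediction: {mse_ate:.2f}")
print(f"MSE, unit-specific model: {mse_int:.2f}")
```

Shrinking the heterogeneity or the trial size can reverse the comparison, which mirrors the planning trade-off the abstract describes.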
Tuesday, January 16, 2024
- Speaker: Jonas Peters (ETH Zurich), joint work with Nicola Gnecco (UCB) and Sorawit Saengkyongam (ETH Zurich)
- Title: On Invariance-based Generalization and Extrapolation
- Abstract: Purely predictive methods may not perform well when the test distribution differs from the training distribution. Consider a setting where the variations of the distributions can be modeled by a variable $E$ and where we observe some (not necessarily all) values of this variable during training. One may then consider models that are invariant with respect to the different training values of $E$. If there are several such models, we may further choose an invariant model with optimal predictive performance, or one that trades off both of these objectives. Recently, several variations of such methods have been proposed, with the goal of achieving good performance in distribution generalization or extrapolation. In this talk, we (1) try to provide an overview of assumptions that can be considered either sufficient or necessary for obtaining guarantees (here, we focus on worst-case guarantees); (2) propose the method of boosted control functions; and (3) discuss the task of representation learning for extrapolation (Rep4Ex). Searching for a categorization of assumptions is joint work with Niklas Pfister.
[Video] [Slides]
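A minimal numerical sketch of the invariance idea the abstract builds on (the structural model, environment values, and all numbers below are illustrative assumptions, not from the talk): a regression on causal parents of Y has residuals whose behavior is stable across the observed values of E, while a regression that also uses a child of Y does not.

```python
import numpy as np

rng = np.random.default_rng(6)

# Illustrative SCM: E shifts X1; Y = 2*X1 + noise; X2 = Y + E (a child of Y).
# Regressing on {X1} is invariant across environments; {X1, X2} is not.
def sample(e_shift, n=3000):
    X1 = rng.normal(e_shift, 1, n)
    Y = 2 * X1 + rng.normal(0, 1, n)
    X2 = Y + e_shift + rng.normal(0, 1, n)
    return np.column_stack([X1, X2]), Y

envs = [sample(0.0), sample(2.0)]

def residual_means(cols):
    """Pool the environments, fit OLS on the chosen columns, and return
    the per-environment mean residual (near zero for an invariant model)."""
    X = np.vstack([Z[:, cols] for Z, _ in envs])
    y = np.concatenate([Y for _, Y in envs])
    b, *_ = np.linalg.lstsq(np.column_stack([np.ones(len(y)), X]), y, rcond=None)
    return [float(np.mean(Y - b[0] - Z[:, cols] @ b[1:])) for Z, Y in envs]

print("residual means using {X1}:    ", np.round(residual_means([0]), 2))
print("residual means using {X1, X2}:", np.round(residual_means([0, 1]), 2))
```

Invariance-based methods search for predictor sets whose residual behavior is stable across training values of E, then predict with those under new shifts.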
Tuesday, January 23, 2024
- Speaker: Victor Veitch (University of Chicago)
- Discussant: Francesco Locatello (Institute of Science and Technology Austria)
- Title: Linear Structure of High-Level Concepts in Text-Controlled Generative Models, and the role of Causality
- Abstract: Text-controlled generative models (such as large language models or text-to-image diffusion models) operate by embedding natural language into a vector representation, then using this representation to sample from the model's output space. This talk concerns how high-level semantics are encoded in the algebraic structure of representations. In particular, we look at the idea that such representations are "linear": what this means, why such structure emerges, and how it can be used for precise understanding and control of generative models.
[Video] [Slides]
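As a toy illustration of what a "linear" concept representation means (the Gaussian embeddings below are a stand-in assumption; the talk concerns real language and diffusion models): a concept encoded linearly can be estimated as a difference of class means, and adding that direction to an embedding shifts the concept's score.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy stand-in for a model's embedding space; coordinate 0 is the
# ground-truth "concept" axis in this synthetic example.
d = 16
concept_dir = np.zeros(d)
concept_dir[0] = 1.0
pos = rng.normal(0, 0.3, (100, d)) + 2 * concept_dir   # concept present
neg = rng.normal(0, 0.3, (100, d))                      # concept absent

# A linear representation of the concept: difference of class means.
direction = pos.mean(axis=0) - neg.mean(axis=0)
direction /= np.linalg.norm(direction)

# The estimated direction aligns with the true concept axis...
print("alignment:", abs(direction @ concept_dir))

# ...and "steering" an embedding along it raises the concept's score.
z = neg[0]
z_steered = z + 2 * direction
print("score before/after:", z @ concept_dir, z_steered @ concept_dir)
```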
Tuesday, January 30, 2024
- Speaker: Sarah Robertson (in place of Issa Dahabreh) (Harvard University)
- Discussant: Paul Zivich (University of North Carolina at Chapel Hill)
- Title: Transporting inferences about intention-to-treat effects and per-protocol effects when there is non-adherence
- Abstract: Transportability or generalizability analyses often attempt to estimate the effect of assignment (i.e., an analog of the intention-to-treat effect) in a target population. However, this effect may not be meaningful if there is imperfect adherence, due to trial engagement effects via adherence (e.g., when there are effects of trial participation on the outcome via treatment receipt and not just through treatment assignment) or selection for participation on the basis of unmeasured factors that influence adherence. When there is non-adherence, it may instead be more realistic to learn about other causal estimands of interest in the target population, such as the effect of joint interventions to scale up trial activities that affect adherence, or, when additional post-randomization variables are collected, the effect of treatment according to assignment (i.e., analogs of per-protocol effects). We provide examples of causal structures to examine when different causal estimands can be identified in the target population.
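As background for the estimands discussed above, a standard transportability estimator for the intention-to-treat effect reweights trial units by their inverse odds of trial selection. A self-contained sketch (the setup, models, and numbers are illustrative assumptions, not from the talk):

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical setup: a trial (S=1) and a target population (S=0) that
# differ in a covariate x, which modifies the ITT effect tau(x) = 1 + x.
n_trial, n_target = 2000, 2000
x_trial = rng.normal(1.0, 1, n_trial)
x_target = rng.normal(0.0, 1, n_target)
a = rng.integers(0, 2, n_trial)                       # randomized assignment
y = x_trial + a * (1 + x_trial) + rng.normal(0, 1, n_trial)

# Inverse-odds-of-selection weights from a logistic model of S on x,
# fit by Newton iterations.
X = np.concatenate([x_trial, x_target])
S = np.concatenate([np.ones(n_trial), np.zeros(n_target)])
Z = np.column_stack([np.ones_like(X), X])
beta = np.zeros(2)
for _ in range(25):
    p = 1 / (1 + np.exp(-Z @ beta))
    W = p * (1 - p)
    beta += np.linalg.solve(Z.T * W @ Z, Z.T @ (S - p))
p_trial = 1 / (1 + np.exp(-(beta[0] + beta[1] * x_trial)))
w = (1 - p_trial) / p_trial                           # odds of non-selection

# Weighted difference in means: the ITT effect transported to the target.
itt_transported = (np.sum(w * a * y) / np.sum(w * a)
                   - np.sum(w * (1 - a) * y) / np.sum(w * (1 - a)))
print(f"transported ITT estimate: {itt_transported:.2f} (target truth approx. 1.0)")
```

The talk's point is when such a transported quantity is the right estimand at all; under non-adherence and trial engagement effects, it may not be.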
Tuesday, February 6, 2024
- Speaker: Fan Yang (Tsinghua University)
- Discussant: Xiaohua (Andrew) Zhou (Peking University)
- Title: Mediation analysis with the mediator and outcome missing not at random
- Abstract: Mediation analysis is widely used for investigating direct and indirect causal pathways through which an effect arises. However, many mediation analysis studies are challenged by missingness in the mediator and outcome. In general, when the mediator and outcome are missing not at random, the direct and indirect effects are not identifiable without further assumptions. In this work, we study the identifiability of the direct and indirect effects under some interpretable mechanisms that allow the mediator and outcome to be missing not at random. We evaluate the performance of statistical inference under those mechanisms through simulation studies and illustrate the proposed methods via the National Job Corps Study.
[Video] [Slides]
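A toy sketch of why the missingness mechanism matters (the linear model, the particular missingness rule, and all numbers are illustrative assumptions, not from the talk): with complete data a linear mediation model recovers direct and indirect effects, but if observation depends on the outcome itself (a not-at-random mechanism), a complete-case analysis is biased.

```python
import numpy as np

rng = np.random.default_rng(7)

# Complete-data linear mediation model (illustrative only).
n = 20_000
A = rng.integers(0, 2, n)                 # treatment
M = 0.5 * A + rng.normal(0, 1, n)         # mediator
Y = 1.0 * A + 2.0 * M + rng.normal(0, 1, n)

# Product-of-coefficients estimates under this linear SCM.
a_path = np.linalg.lstsq(np.column_stack([np.ones(n), A]), M, rcond=None)[0][1]
coef = np.linalg.lstsq(np.column_stack([np.ones(n), A, M]), Y, rcond=None)[0]
direct, indirect = coef[1], a_path * coef[2]
print(f"direct {direct:.2f} (truth 1.0), indirect {indirect:.2f} (truth 1.0)")

# A missing-not-at-random mechanism: cases observed only when Y is small.
# Complete-case analysis then selects on the outcome and is biased.
obs = Y < np.quantile(Y, 0.7)
cc = np.linalg.lstsq(np.column_stack([np.ones(obs.sum()), A[obs], M[obs]]),
                     Y[obs], rcond=None)[0]
print(f"complete-case direct effect: {cc[1]:.2f} (biased)")
```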
Tuesday, February 13, 2024
- Speaker: Ting Ye (University of Washington)
- Title: Debiased Multivariable Mendelian Randomization
- Abstract: Multivariable Mendelian randomization (MVMR) uses genetic variants as instrumental variables to infer the direct effect of multiple exposures on an outcome. Compared to univariable Mendelian randomization, MVMR is less prone to horizontal pleiotropy and enables estimation of the direct effect of each exposure on the outcome. However, MVMR faces greater challenges with weak instruments, i.e., genetic variants that are weakly associated with some exposures conditional on the other exposures. This article focuses on MVMR using summary data from genome-wide association studies (GWAS). We provide a new asymptotic regime to analyze MVMR estimators with many weak instruments, allowing for linear combinations of exposures to have different degrees of instrument strength, and formally show that the popular multivariable inverse-variance weighted (MV-IVW) estimator's asymptotic behavior is highly sensitive to instruments' strength. We then propose a multivariable debiased IVW (MV-dIVW) estimator, which effectively reduces the asymptotic bias from weak instruments in MV-IVW, and introduce an adjusted version, MV-adIVW, to improve MV-dIVW's finite-sample robustness. We establish the theoretical properties of our proposed estimators and extend them to handle balanced horizontal pleiotropy. We conclude by demonstrating the performance of our proposed methods in simulated and real datasets. We implement the proposed methods in the R package mr.divw. This is joint work with Yinxiang Wu and Hyunseung Kang.
- Discussant: Neil Davies (UCL)
[Video] [Slides]
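The debiasing idea can be sketched in the univariable case (the simulated summary statistics are illustrative; the talk's MV-dIVW handles multiple exposures): the IVW denominator uses squared estimated SNP-exposure effects, whose expectation exceeds the true squared effects by the squared standard error, so subtracting that term removes the weak-instrument attenuation.

```python
import numpy as np

rng = np.random.default_rng(3)

# Simulated GWAS summary data with many weak instruments.
p, beta_true = 500, 0.4
gamma = rng.normal(0, 0.05, p)            # true SNP-exposure effects (weak)
se_x = np.full(p, 0.03)                   # SEs of exposure associations
se_y = np.full(p, 0.03)                   # SEs of outcome associations
gamma_hat = gamma + rng.normal(0, se_x)
Gamma_hat = beta_true * gamma + rng.normal(0, se_y)

# Standard IVW: attenuated because E[gamma_hat^2] = gamma^2 + se_x^2.
ivw = (np.sum(Gamma_hat * gamma_hat / se_y**2)
       / np.sum(gamma_hat**2 / se_y**2))

# Debiased IVW: subtract the measurement-error term in the denominator.
divw = (np.sum(Gamma_hat * gamma_hat / se_y**2)
        / np.sum((gamma_hat**2 - se_x**2) / se_y**2))

print(f"true {beta_true}, IVW {ivw:.3f}, dIVW {divw:.3f}")
```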
Tuesday, February 20, 2024
- Speaker: Iván Díaz (New York University)
- Title: Recanting twins: addressing intermediate confounding in mediation analysis
- Abstract: The presence of intermediate confounders, also called recanting witnesses, is a fundamental challenge to the investigation of causal mechanisms in mediation analysis, preventing the identification of natural path-specific effects. Proposed alternative parameters (such as randomized interventional effects) are problematic because they can be non-null even when there is no mediation for any individual in the population; i.e., they are not an average of underlying individual-level mechanisms. We develop a novel method for mediation analysis in settings with intermediate confounding, with guarantees that the causal parameters are summaries of the individual-level mechanisms of interest. The method is based on recently proposed ideas that view causality as the transfer of information, and thus replace recanting witnesses by draws from their conditional distribution, which we call "recanting twins". We show that, in the absence of intermediate confounding, recanting twin effects recover natural path-specific effects. We present the assumptions required for identification of recanting twin effects under a standard structural causal model, as well as the assumptions under which the recanting twin identification formulas can be interpreted in the context of the recently proposed separable effects models. To estimate recanting twin effects, we develop efficient semi-parametric estimators that allow the use of data-driven methods in the estimation of the nuisance parameters. We present numerical studies of the methods using synthetic data, as well as an application to evaluate the role of new-onset anxiety and depressive disorder in explaining the relationship between gabapentin/pregabalin prescription and incident opioid use disorder among Medicaid beneficiaries with chronic pain.
- Discussant: Daniel Malinsky (Columbia University)
[Video] [Slides] [Discussant slides]
Tuesday, February 27, 2024
- Speaker: Maria Glymour (Boston University)
- Title: Evidence triangulation in dementia research
- Abstract: Research on cognitive aging, including development of neurodegenerative diseases such as Alzheimer's, is fraught with causal inference challenges. This talk will briefly review why identifying the causes and potential prevention strategies for dementia is particularly challenging. I will discuss a framework for evidence triangulation and offer examples of promising approaches in dementia research.
- Discussant: George Davey Smith (University of Bristol)
[Video]
Tuesday, March 5, 2024 (young researcher seminar)
- Speaker 1: Xinwei Shen (ETH Zurich)
- Title: Causality-oriented robustness: exploiting data heterogeneity at different levels
- Abstract: Since distribution shifts are common in real-world applications, there is a pressing need for developing prediction models that are robust against such shifts. Unlike empirical risk minimization or distributionally robust optimization, causality offers a data-driven and structural perspective on robust prediction. In this talk, we discuss causality-oriented robust prediction by exploiting heterogeneity in multi-environment training data at different levels. Previous work such as anchor regression has mainly studied mean shifts, while we propose Distributional Robustness via Invariant Gradients (DRIG), a method that exploits variance shifts induced by general additive interventions for robust prediction against more diverse unseen interventions. Finally, we discuss an idea that goes beyond specific characteristics and exploits shifts in overall aspects of the distribution, leading to potentially more robust predictions. The proposed methods are validated on a single-cell data application.
[Video] [Slides]
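Since the abstract builds on anchor regression, here is a minimal sketch of that baseline (the SCM and all numbers are illustrative assumptions, not from the talk): the anchor-projected component of each variable is rescaled by sqrt(gamma) before ordinary least squares, interpolating between OLS (gamma = 1) and an instrumental-variable-like solution as gamma grows.

```python
import numpy as np

rng = np.random.default_rng(4)

# Illustrative SCM: exogenous anchor A shifts X; hidden H confounds X and Y.
n = 5000
A = rng.normal(0, 1, (n, 1))
H = rng.normal(0, 1, n)
X = A[:, 0] + H + rng.normal(0, 1, n)
Y = 1.0 * X + H + rng.normal(0, 1, n)     # true causal coefficient 1.0

def anchor_regression(X, Y, A, gamma):
    """OLS on anchor-transformed data: M -> M + (sqrt(gamma)-1) * P_A M,
    where P_A projects onto the column span of A."""
    def transform(M):
        coef, *_ = np.linalg.lstsq(A, M, rcond=None)
        return M + (np.sqrt(gamma) - 1) * (A @ coef)
    Xt, Yt = transform(X), transform(Y)
    b, *_ = np.linalg.lstsq(np.column_stack([np.ones(len(Yt)), Xt]), Yt,
                            rcond=None)
    return b[1]

# gamma = 1 is plain OLS (confounded upward here); because A is exogenous
# in this toy model, large gamma approaches the IV solution, near 1.0.
for g in (1, 10, 100):
    print(f"gamma={g}: {anchor_regression(X, Y, A, g):.3f}")
```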
- Speaker 2: Giulio Grossi (University of Florence)
- Title: SMaC: Spatial Matrix Completion method
- Abstract: Synthetic control methods are commonly used in panel data settings to evaluate the effect of an intervention. In many of these cases, the treated and control time series correspond to spatial areas such as regions or neighborhoods. We work in a setting where a treatment is applied at a given location and its effect can emanate across space. Then, an area of a certain size around the intervention point is considered to be the treated area. Synthetic control methods can be used to evaluate the effect that the treatment had in the treated area, but it is often unclear how far the treatment’s effect propagates. Therefore, researchers might consider treated areas of different sizes and apply synthetic control methods separately for each one of them. However, this approach ignores the spatial structure of the data and can lead to efficiency loss in spatial settings. We propose to deal with these issues by developing a Bayesian spatial matrix completion framework that allows us to predict the missing potential outcomes in the areas of different sizes around the intervention point while accounting for the spatial structure of the data. Specifically, the missing time series in the absence of treatment for the treated areas of all sizes are imputed using a weighted average of control time series, where the weights assigned to each control unit are assumed to vary smoothly over space according to a Gaussian process. Our motivating application is the construction of the first line of the Florentine tramway, which could have affected the prevalence of businesses in the neighbourhood of the construction site and at various distances from the tramway stops.
[Video] [Slides]
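The building block SMaC generalizes is the classical synthetic control fit: choose nonnegative weights summing to one so that a weighted average of control series matches the treated series in the pre-period. A minimal sketch via projected gradient descent (the panel below is simulated; SMaC itself is Bayesian, with weights varying smoothly in space via a Gaussian process):

```python
import numpy as np

rng = np.random.default_rng(5)

# Illustrative pre-period panel: 40 periods, 5 control areas; the treated
# area is (unknown to the method) a 0.6/0.4 mix of controls 0 and 1.
T0, J = 40, 5
Y0 = rng.normal(0, 1, (T0, J)).cumsum(axis=0)          # control series
w_true = np.array([0.6, 0.4, 0.0, 0.0, 0.0])
y1 = Y0 @ w_true + rng.normal(0, 0.1, T0)              # treated series

def project_simplex(v):
    """Euclidean projection onto {w : w >= 0, sum(w) = 1}."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u)
    rho = np.nonzero(u * np.arange(1, len(v) + 1) > css - 1)[0][-1]
    return np.maximum(v - (css[rho] - 1) / (rho + 1), 0)

# Minimize ||y1 - Y0 w||^2 over the simplex by projected gradient descent.
w = np.full(J, 1 / J)
step = 1 / np.linalg.eigvalsh(Y0.T @ Y0).max()
for _ in range(2000):
    w = project_simplex(w - step * Y0.T @ (Y0 @ w - y1))

print("estimated weights:", np.round(w, 2))
```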
Tuesday, March 12, 2024
- Speaker: David Lagnado (UCL)
- Title: Causality in Mind: Learning, Reasoning and Blaming
- Abstract: Knowledge of cause and effect is vital to our ability to predict, control and explain the world. It helps us diagnose diseases, build bridges and decide guilt. How do people learn and reason about causality? This talk will focus on three key areas of cognition: (1) Learning: how we construct causal models by actively exploring the world, using heuristics to overcome cognitive limitations; (2) Reasoning: how we use causal models to explain uncertain evidence, simplifying complex inferences; (3) Attribution: how we rely on causal counterfactuals to assign responsibility and blame.
- Discussant: Neil Bramley (University of Edinburgh)
[Video] [Slides]
Tuesday, March 19, 2024
- Speakers: Sara Magliacane (University of Amsterdam, MIT-IBM Watson AI Lab), Phillip Lippe (University of Amsterdam)
- Title: BISCUIT: Causal Representation Learning from Binary Interactions
- Abstract: Identifying the causal variables of an environment and how to intervene on them is of core value in applications such as robotics and embodied AI. While an agent can commonly interact with the environment and may implicitly perturb the behavior of some of these causal variables, often the targets it affects remain unknown. In this talk, we show that causal variables can still be identified for many common setups, e.g., additive Gaussian noise models, if the agent's interactions with a causal variable can be described by an unknown binary variable. This happens when each causal variable has two different mechanisms, e.g., an observational and an interventional one. Using this identifiability result, we propose BISCUIT, a method for simultaneously learning causal variables and their corresponding binary interaction variables. On three robotic-inspired datasets, BISCUIT accurately identifies causal variables and can even be scaled to complex, realistic environments for embodied AI.
- Discussant: Sébastien Lachapelle (Samsung SAIL)
[Video] [Slides] [Part-2 slides]
Tuesday, March 26, 2024
- Speaker: Krikamol Muandet (CISPA)
- Title: A Measure-Theoretic Axiomatisation of Causality
- Abstract: Causality is a central concept in a wide range of research areas, yet there is still no universally agreed axiomatisation of causality. We view causality both as an extension of probability theory and as a study of "what happens when one intervenes on a system", and argue in favour of taking Kolmogorov's measure-theoretic axiomatisation of probability as the starting point towards an axiomatisation of causality. To that end, we propose the notion of a "causal space", consisting of a probability space along with a collection of transition probability kernels, called "causal kernels", that encode the causal information of the space. Our proposed framework is not only rigorously grounded in measure theory, but it also sheds light on long-standing limitations of existing frameworks including, for example, cycles, latent variables and stochastic processes.
- Discussant: Ricardo Silva (UCL)
- Q&A moderator: Junhyung Park (Max Planck Institute)
[Video] [Slides] [Paper]
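The shape of the object described in the abstract can be compressed as follows (the indexing set T and all notation here are assumptions for exposition; the paper's formal axioms on the kernels are omitted):

```latex
% A causal space augments a probability space with causal kernels.
\[
  (\Omega, \mathcal{H}, \mathbb{P})
  \;\rightsquigarrow\;
  (\Omega, \mathcal{H}, \mathbb{P}, \mathbb{K}),
  \qquad
  \mathbb{K} = \{ K_S : S \subseteq T \},
\]
where each causal kernel \(K_S \colon \Omega \times \mathcal{H} \to [0,1]\)
is a transition probability kernel: \(K_S(\omega, \cdot)\) is a probability
measure on \((\Omega, \mathcal{H})\) for each \(\omega \in \Omega\), and
\(K_S(\cdot, A)\) is measurable for each \(A \in \mathcal{H}\). The measure
\(K_S(\omega, \cdot)\) plays the role of the system's distribution after an
intervention on the sub-system indexed by \(S\).
```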