Winter 2022 complete list with abstracts
Tuesday, March 15, 2022: Chengchun Shi (LSE)
- A reinforcement learning framework for dynamic causal effects evaluation in A/B testing
- Discussant: Will Wei Sun (Purdue University)
- Abstract: A/B testing, or online experimentation, is a standard business strategy for comparing a new product with an old one in the pharmaceutical, technological, and traditional industries. Major challenges arise in online experiments on two-sided marketplace platforms (e.g., Uber), where there is only one unit that receives a sequence of treatments over time. In those experiments, the treatment at a given time impacts current outcomes as well as future outcomes. In this talk, we introduce a reinforcement learning framework for carrying out A/B testing in these experiments while characterizing the long-term treatment effects. Our proposed testing procedure allows for sequential monitoring and online updating. It is generally applicable to a variety of treatment designs in different industries. In addition, we systematically investigate the theoretical properties of our testing procedure. Finally, we apply our framework to both simulated data and a real-world data example obtained from a ridesharing company to illustrate its advantage over current practice.
[Video] [Slides] [Discussant slides]
Tuesday, March 8, 2022: Yuansi Chen (Duke University)
- Domain adaptation under structural causal models
- Discussant: Biwei Huang (CMU)
- Abstract: Domain adaptation (DA) arises as an important problem in statistical machine learning when the source data used to train a model is different from the target data used to test the model. Recent advances in DA have mainly been application-driven and have largely relied on the idea of a common subspace for source and target data. To understand the empirical successes and failures of DA methods, we propose a theoretical framework via structural causal models that enables analysis and comparison of the prediction performance of DA methods. This framework also allows us to itemize the assumptions needed for the DA methods to have a low target error. Additionally, with insights from our theory, we propose a new DA method called CIRM that outperforms existing DA methods when both the covariates and label distributions are perturbed in the target data. We complement the theoretical analysis with extensive simulations to show the necessity of the devised assumptions. Reproducible synthetic and real data experiments are also provided to illustrate the strengths and weaknesses of DA methods when parts of the assumptions in our theory are violated.
[Video] [Slides] [Discussion slides] [Paper]
Tuesday, March 1, 2022: Kosuke Imai (Harvard University)
- Safe Policy Learning through Extrapolation: Application to Pre-trial Risk Assessment
- Discussant: Yifan Cui (National University of Singapore)
- Abstract: Algorithmic recommendations and decisions have become ubiquitous in today’s society. Many of these and other data-driven policies, especially in the realm of public policy, are based on known, deterministic rules to ensure their transparency and interpretability. For example, algorithmic pre-trial risk assessments, which serve as our motivating application, provide relatively simple, deterministic classification scores and recommendations to help judges make release decisions. How can we use data generated under existing deterministic policies to learn new and better policies? Unfortunately, prior methods for policy learning are not applicable because they require existing policies to be stochastic rather than deterministic. We develop a robust optimization approach that partially identifies the expected utility of a policy, and then finds an optimal policy by minimizing the worst-case regret. The resulting policy is conservative but has a statistical safety guarantee, allowing the policy-maker to limit the probability of producing a worse outcome than the existing policy. We extend this approach to common and important settings where humans make decisions with the aid of algorithmic recommendations. Lastly, we apply the proposed methodology to a unique field experiment on pre-trial risk assessment instruments. We derive new classification and recommendation rules that retain the transparency and interpretability of the existing instrument while potentially leading to better overall outcomes at a lower cost.
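The minimax-regret step described in the abstract can be illustrated with a toy sketch. All numbers and policy names below are hypothetical (this is not the authors' method or data): each candidate policy's expected utility is only partially identified as an interval [lo, hi], and we choose the policy whose worst-case regret against the alternatives is smallest.

```python
# Toy, hypothetical numbers illustrating minimax-regret policy choice under
# partial identification: each policy's expected utility is only known to lie
# in an interval [lo, hi].
def worst_case_regret(bounds, p):
    lo_p = bounds[p][0]
    # Regret vs. policy q is worst when q attains its upper bound
    # while p sits at its lower bound.
    return max(hi_q - lo_p for q, (_, hi_q) in bounds.items() if q != p)

def minimax_regret_policy(bounds):
    return min(bounds, key=lambda p: worst_case_regret(bounds, p))

bounds = {
    "status quo": (0.40, 0.55),
    "new rule A": (0.35, 0.70),
    "new rule B": (0.30, 0.45),
}
best = minimax_regret_policy(bounds)   # "new rule A" under these numbers
```

Note the conservatism: a policy with a high lower bound is favored even if another policy's upper bound is higher, which is what gives the safety guarantee described above its flavor.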
[Video] [Slides] [Discussant slides] [Paper]
Tuesday, February 22, 2022: Dominik Rothenhäusler (Stanford University)
- Calibrated inference: statistical inference that accounts for both sampling uncertainty and distributional uncertainty
- Discussant: Guido Imbens (Stanford University)
- Abstract: During data analysis, analysts often have to make seemingly arbitrary decisions. For example, during data pre-processing, there are a variety of options for dealing with outliers or inferring missing data. Similarly, many specifications and methods can be reasonable for addressing a certain domain question. This may be seen as a hindrance to reliable inference, since conclusions can change depending on the analyst's choices.
In this paper, we argue that this situation is an opportunity to construct confidence intervals that account not only for sampling uncertainty but also some type of distributional uncertainty. Distributional uncertainty is closely related to other issues in data analysis, ranging from dependence between observations to selection bias and confounding. We demonstrate the utility of the approach on simulated and real-world data.
This is joint work with Yujin Jeong.
[Video] [Slides]
Tuesday, February 15, 2022: Luke Keele (University of Pennsylvania)
- So Many Choices: The Comparative Performance of Statistical Adjustment Methods
- Discussant: Iván Díaz (Cornell University)
- Abstract: Much evidence in applied research is based on observational studies where investigators assume that there are no unobservable differences between the groups under comparison. Treatment effects are estimated after adjusting for observed confounders via statistical methods. However, even if the assumption of no unobserved confounding holds, bias from model misspecification may be significant. Traditionally, regression models of various kinds have been used to adjust for confounders. Such models impose strong functional form assumptions that are most prone to model misspecification. In the causal inference literature, there has been considerable effort on the development of more flexible adjustment methods. In fact, there has been an explosion in the number of methods that can be used to adjust for observed confounders. Now investigators can choose between many forms of matching, weighting, doubly robust methods, and a variety of machine learning-based estimators. The general trend has been to move toward flexible methods of estimation. Specifically, most recent work has sought to combine methods from machine learning with a doubly robust framework. While these methods have clear theoretical advantages, they see little use in the applied literature. Moreover, the development of guidelines for applied researchers has been limited. In this presentation, I review key concepts related to functional form assumptions and how those can contribute to bias from model misspecification. I also review the logic behind why machine learning methods have been so widely proposed for estimation and review the strengths and weaknesses of these methods. I present two case studies where I seek to recover experimental benchmarks using observational study data. In these case studies, I compare the performance of a wide variety of methods for statistical adjustment. I find that several widely used methods are subject to bias from model misspecification. I also find that while machine learning methods are among the strongest performers, they are not always reliable.
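Two of the adjustment strategies compared in the talk can be sketched side by side on simulated data. This is a hypothetical illustration (not the talk's case studies): outcome regression and inverse probability weighting (IPW), both applied to one dataset with a known treatment effect of 2.0.

```python
import numpy as np

# Hypothetical sketch contrasting two adjustment methods on simulated data
# where the true treatment effect is 2.0.
rng = np.random.default_rng(0)
n = 20_000
x = rng.normal(size=n)
ps = 1.0 / (1.0 + np.exp(-x))          # true propensity score P(T=1 | X)
t = rng.binomial(1, ps)
y = 2.0 * t + x + rng.normal(size=n)   # linear outcome model, effect = 2.0

# Outcome regression: fit E[Y | T, X] by least squares, read the T coefficient.
X = np.column_stack([np.ones(n), t, x])
ate_reg = np.linalg.lstsq(X, y, rcond=None)[0][1]

# IPW (Hajek form) with the true propensity score.
w1, w0 = t / ps, (1 - t) / (1 - ps)
ate_ipw = np.sum(w1 * y) / np.sum(w1) - np.sum(w0 * y) / np.sum(w0)
```

Both estimators recover a value near 2.0 here because the regression model and the propensity score are both correctly specified; the talk's point is precisely that their performance diverges once the functional form assumptions fail.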
[Video] [Slides] [Discussant slides]
Tuesday, February 8, 2022: Zhimei Ren (University of Chicago)
- Sensitivity Analysis of Individual Treatment Effects: A Robust Conformal Inference Approach
- Discussant: Stefan Wager (Stanford University)
- Abstract: We propose a model-free framework for sensitivity analysis of individual treatment effects (ITEs), building upon ideas from conformal inference. For any unit, our procedure reports the Gamma-value, a number which quantifies the minimum strength of confounding needed to explain away the evidence for ITE. Our approach rests on the reliable predictive inference of counterfactuals and ITEs in situations where the training data is confounded. Under the marginal sensitivity model of Tan (2006), we characterize the shift between the distribution of the observations and that of the counterfactuals. We first develop a general method for predictive inference of test samples from a shifted distribution; we then leverage this to construct covariate-dependent prediction sets for counterfactuals. No matter the value of the shift, these prediction sets achieve marginal coverage exactly (resp. approximately) if the propensity score is known exactly (resp. estimated). We describe a distinct procedure that also attains coverage, but conditional on the training data. In the latter case, we prove a sharpness result showing that for certain classes of prediction problems, the prediction intervals cannot possibly be tightened. We verify the validity and performance of the new methods via simulation studies and apply them to analyze real datasets.
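The quantile step behind weighted split conformal prediction under a distribution shift can be sketched as follows. The function name and interface are illustrative (not the paper's code): calibration scores are reweighted by likelihood ratios w(x), and the leftover mass, including the test point's, is placed at +infinity, which makes the quantile conservative.

```python
import numpy as np

# Minimal sketch of the quantile step of shift-weighted split conformal
# prediction: reweight calibration scores, then take the (1 - alpha)
# quantile of the resulting discrete distribution.
def weighted_conformal_quantile(scores, weights, w_test, alpha=0.1):
    order = np.argsort(scores)
    s = np.asarray(scores, dtype=float)[order]
    w = np.asarray(weights, dtype=float)[order]
    p = w / (w.sum() + w_test)        # point mass placed on each score
    cdf = np.cumsum(p)                # remaining mass sits at +infinity
    idx = np.searchsorted(cdf, 1.0 - alpha)
    return s[idx] if idx < len(s) else np.inf
```

A prediction set for a counterfactual would then be a fitted prediction plus or minus this quantile of the residual scores; with all weights equal, the construction reduces to standard split conformal prediction.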
[Video] [Slides] [Discussant slides] [Paper]
Tuesday, February 1, 2022: Sander Beckers (University of Tübingen)
- Causal Sufficiency and Actual Causation
- Discussant: Thomas Icard (Stanford University)
- Abstract: Pearl opened the door to formally defining actual causation using causal models. His approach rests on two strategies: first, capturing the widespread intuition that X = x causes Y = y iff X = x is a Necessary Element of a Sufficient Set for Y = y, and second, showing that his definition gives intuitive answers on a wide set of problem cases. This inspired dozens of variations of his definition of actual causation, the most prominent of which are due to Halpern & Pearl. Yet all of them ignore Pearl’s first strategy, and the second strategy taken by itself is unable to deliver a consensus. This paper offers a way out by going back to the first strategy: it offers six formal definitions of causal sufficiency and two interpretations of necessity. Combining the two gives twelve new definitions of actual causation. Several interesting results about these definitions and their relation to the various Halpern & Pearl definitions are presented. Afterwards, the second strategy is evaluated as well. In order to maximize neutrality, the paper relies mostly on the examples and intuitions of Halpern & Pearl. One definition comes out as being superior to all others, and is therefore suggested as a new definition of actual causation.
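The simplest necessity ("but-for") test in a deterministic structural causal model can be sketched in a few lines; this hypothetical example is one of the classic overdetermination cases that motivates the sufficient-set definitions discussed above.

```python
# Hypothetical sketch of the but-for (necessity) test in a deterministic
# structural causal model: X = x is a but-for cause of Y = y if intervening
# to change X changes Y. Overdetermination defeats this simple test.
def evaluate(equations, context, intervention=None):
    vals = dict(context)
    if intervention:
        vals.update(intervention)
    for var, f in equations.items():
        if not (intervention and var in intervention):
            vals[var] = f(vals)
    return vals

# Two shooters; either shot alone suffices for death D.
equations = {"D": lambda v: v["A"] or v["B"]}
actual = evaluate(equations, {"A": 1, "B": 1})
but_for_A = evaluate(equations, {"A": 1, "B": 1}, {"A": 0})["D"] != actual["D"]
# but_for_A is False: A's shot is not a but-for cause when B also shoots,
# even though intuitively A's shot is an actual cause of D.
```

This mismatch between the but-for verdict and intuition is exactly the kind of problem case on which the paper's twelve candidate definitions are compared.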
[Video] [Slides] [Discussant slides] [Paper]
Tuesday, January 25, 2022: Daniel McCaffrey (ETS)
- Nonrandom Samples and Causal Inference
- Discussant: Shu Yang (North Carolina State University)
- Abstract: Causal inferences, i.e., estimates of how a treatment or intervention affects outcomes, are of great interest in many fields. There are many causal modeling methods for estimating causal effects from observational data that attempt to adjust for potential biases due to the differences between individuals receiving different treatments in natural settings. These methods nearly universally make the implicit assumption that the data are a random sample from the population of interest. Nonrandom samples occur when using survey samples or because of nonresponse or study attrition. Weighting is commonly proposed to adjust the nonrandom samples to be representative of the population of interest. There are questions about how to use these sampling (or nonresponse or attrition) weights with causal modeling techniques. Authors have explored the issue but the advice is somewhat conflicting. In this talk, I will demonstrate that under certain assumptions combining sampling (or nonresponse or attrition) weights with inverse-probability-of-treatment weights can yield consistent estimates of causal effects for the entire population of interest. I show that different assumptions lead to different recommendations for how to use the sampling (or nonresponse or attrition) weights. I also show that using simulated data to study how to use such weights without a theoretical foundation can lead to confusing conclusions.
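The weight combination discussed in the abstract can be sketched in a few lines. This is a hypothetical illustration (names and data are mine, not the talk's): multiply the survey sampling (or nonresponse/attrition) weight by the inverse-probability-of-treatment weight, then compute a weighted difference in means.

```python
import numpy as np

# Hypothetical sketch: combined weight = sampling weight * IPTW weight,
# plugged into a weighted difference-in-means estimate of the ATE.
def weighted_ate(y, treat, samp_w, ps):
    y, t, sw, ps = map(lambda a: np.asarray(a, dtype=float), (y, treat, samp_w, ps))
    iptw = np.where(t == 1, 1.0 / ps, 1.0 / (1.0 - ps))
    w = sw * iptw                      # combined weight
    mu1 = np.sum(w * y * t) / np.sum(w * t)
    mu0 = np.sum(w * y * (1 - t)) / np.sum(w * (1 - t))
    return mu1 - mu0
```

With all sampling weights equal to one, this reduces to ordinary IPTW estimation on the sample; the talk's point is about when, and under which assumptions, the product weight validly targets the full population instead.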
[Video] [Slides] [Discussant slides]
Tuesday, January 18, 2022: Sach Mukherjee (University of Cambridge)
- A machine learning approach for causal structure estimation in high dimensions
- Discussant: Yuhao Wang (Tsinghua University)
- Abstract: Causal structure learning refers to the task of estimating graphical structures encoding causal relationships between variables. This remains challenging, especially under conditions of high dimensionality, latent variables and noisy, finite data, as seen in many real world applications. I will discuss our recent efforts to reframe specific aspects of causal structure learning from a machine learning perspective. The approaches I will discuss differ from classical structure learning tools in that rather than trying to establish a model of the data-generating process, they focus on minimizing a certain expected loss defined with respect to the causal structure of interest. The work is motivated by applications in high-dimensional molecular biology, and I will show empirical examples in which model-based predictions can be tested at large scale against experimental results.
[Discussant slides]
Tuesday, January 11, 2022: Interview with Guido Imbens (Stanford University)
[Video]