This one-day workshop will feature recent advances in causal inference broadly at the intersection of statistics, operations research, and engineering. The workshop aims to combine theory and applications and to bring together researchers from academia and industry. The event will be organized to promote meaningful interaction and discussion and to foster interdisciplinary collaboration through a mix of talks, poster presentations, and networking breaks.
As part of the workshop, we will be accepting short papers broadly at the intersection of causal inference, engineering, and operations research. Submissions are non-archival; accepted papers will be presented as posters, along with short spotlight talks. Papers should be submitted via the following Google Form. Authors are asked to submit extended abstracts of at most 4 pages. Please use the standard PER format (http://www.sigmetrics.org/sig-alternate-per.cls).
If you have any questions, please contact cleeyu at cornell dot edu.
9:00-9:15am: Opening remarks
9:15-9:45am: Speaker session
Christina Lee Yu: Considerations in Designing Experiments under Network Interference
9:50-10:30am: Poster flash talks
Dwaipayan Saha: Synthetic Blip Effects: Generalizing Synthetic Controls for the Dynamic Treatment Regime
Jacob Feitelberg: Distributional Matrix Completion via Nearest Neighbors in the Wasserstein Space
Yassir Jedra: K-SVD with Gradient Descent
Jia Wan: Exploiting Exogenous Structure for Sample-Efficient Reinforcement Learning
Sadegh Shirani: Experimentation under Unknown Network Interference
Vydhourie Thiyageswaran: Optimal Design under Interference, Homophily, and Robustness Tradeoffs
Su Jia: Clustered Switchback Experiments
Sahil Loomba: Off-Policy Causal Estimation under Network Interference
Jessy Xinyi Han: Fairness is More Than Algorithms: Racial Disparities in Recidivism
Kan Xu: Match Made with Matrix Completion: Efficient Offline and Online Learning in Matching Markets
10:30-11:00am: Coffee break
11:00am-12:30pm: Poster session
12:30-1:30pm: Lunch break
1:30-3:00pm: Speaker session
Jean Pouget-Abadie: Bootstrapping and Differentially Private Experimentation at Tech Companies
Nathan Kallus: Learning Surrogate Indices from Historical A/Bs: Adversarial ML for Debiased Inference on Functionals of Ill-Posed Inverses
Raaz Dwivedi: AI Agents for Root Cause Analysis
3:00-3:30pm: Coffee break
3:30-5:00pm: Speaker session
Hannah Li: Randomized Controlled Trials of Service Interventions: The Impact of Capacity Constraints
Abdullah Alomar: Understanding Price Elasticity with Causal Inference using Observational Data: A Case Study
Dennis Shen: The Ordinary Least Squares Interpolator for Econometrics
Jean Pouget-Abadie (Google NYC)
Bootstrapping and Differentially Private Experimentation at Tech Companies
We'll cover two experimentation topics motivated by real-life problems we have faced at a large tech company. We first propose a general integer programming approach to better estimate the uncertainty of an experimental design through bootstrapping. This is particularly relevant in geographical experiments, where populations are small, heterogeneous, and fixed. We then introduce a differentially private framework for digital experimentation that leverages non-private cluster information to improve the fundamental privacy-variance trade-off found in such approaches while maintaining guarantees of user privacy.
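As a rough illustration of the bootstrap idea in this setting (a toy sketch, not the integer-programming formulation from the talk; all data and parameters are hypothetical), the following Python snippet resamples geo units to estimate the standard error of a difference-in-means effect estimate:

```python
import numpy as np

rng = np.random.default_rng(0)

def diff_in_means(outcomes, treated):
    """Difference-in-means effect estimate over geo units."""
    return outcomes[treated].mean() - outcomes[~treated].mean()

# Toy data: 20 geo units, half assigned to treatment (hypothetical).
outcomes = rng.normal(loc=1.0, scale=2.0, size=20)
treated = np.arange(20) < 10
outcomes[treated] += 0.5  # true effect of 0.5

# Bootstrap over geo units to estimate the estimator's variability.
estimates = []
for _ in range(2000):
    idx = rng.integers(0, 20, size=20)  # resample units with replacement
    t = treated[idx]
    if t.all() or (~t).all():
        continue  # skip degenerate resamples where one arm is empty
    estimates.append(diff_in_means(outcomes[idx], t))

print("point estimate:", diff_in_means(outcomes, treated))
print("bootstrap std error:", np.std(estimates))
```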
Abdullah Alomar (Ikigai)
Understanding Price Elasticity with Causal Inference using Observational Data: A Case Study
Estimating price elasticity is critical for effective decision-making in dynamic markets, but observational data presents challenges due to confounding and the non-random nature of pricing and discounts. In this talk, we explore causal inference techniques for estimating price elasticity from time series data. Building on methods from policy evaluation—such as synthetic interventions, matrix estimation with MNAR data, and multivariate singular spectrum analysis—we develop an approach for counterfactual estimation under continuous interventions. Through a real-world case study, we show how these techniques can recover elasticity measures even amid complex temporal dependencies and limited experimentation.
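For intuition, here is a toy sketch in the spirit of synthetic-control-style counterfactual estimation (not the method from the talk; the panel, weights, and price change are all hypothetical): donor demand series are combined with weights fit on the pre-intervention window, and the resulting counterfactual is compared with observed demand after a price change to back out an elasticity.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy panel: demand for one treated product and 5 donor products over T weeks.
T, T0 = 104, 80                        # T0 = last pre-intervention week
donors = rng.normal(100, 5, (5, T))    # donor demand series (hypothetical)
w_true = np.array([0.4, 0.3, 0.1, 0.1, 0.1])
treated = w_true @ donors + rng.normal(0, 1, T)

# After week T0, price is cut by 10% and observed demand rises by 20%.
treated_obs = treated.copy()
treated_obs[T0:] *= 1.20

# Fit donor weights on the pre-intervention window (least squares).
w_hat, *_ = np.linalg.lstsq(donors[:, :T0].T, treated_obs[:T0], rcond=None)

# Counterfactual demand without the price change, and implied elasticity.
counterfactual = w_hat @ donors[:, T0:]
pct_demand = treated_obs[T0:].mean() / counterfactual.mean() - 1.0
elasticity = pct_demand / (-0.10)      # % change in demand per % change in price
print(f"estimated elasticity: {elasticity:.2f}")  # roughly -2.0
```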
Nathan Kallus (Cornell Tech & Netflix)
Learning Surrogate Indices from Historical A/Bs: Adversarial ML for Debiased Inference on Functionals of Ill-Posed Inverses
Experimentation on digital platforms often faces a dilemma: we want rapid innovation, but we also want to make decisions based on long-term impact. Usually one resorts to looking at indices (i.e., scalar-valued functions) that combine multiple short-term surrogate outcomes. Constructing indices by regressing long-term metrics on short-term ones is easy with off-the-shelf ML but suffers from bias due to confounding and direct (i.e., unmediated) effects. I will discuss how to instead leverage past experiments as instrumental variables (IVs) and some surrogates as negative-control outcomes, with real-world examples from Netflix. There are two key technical challenges to surmount to make this possible. First, past experiments characterize the right surrogate index as a solution to an ill-posed system of moment equations: it does not uniquely identify an index, and approximately solving it does not translate to approximating any solution. We tackle this by developing a novel debiasing method for inference on linear functionals of solutions to ill-posed problems (as average long-term effects are such functionals of the index) and adversarial ML estimators for the solution admitting flexible hypothesis classes, such as neural nets and reproducing kernel Hilbert spaces. Second, even as we observe more past experiments, we have non-vanishing bias in estimating the moment equation implied by each one, since each experiment has a bounded size that is often just barely powered to detect effects. We tackle this by incorporating an instrument-splitting technique into our estimators, leading to an ML analogue of the classic (linear) jackknife IV estimator (JIVE) with guarantees for flexible function classes in terms of generic complexity measures.
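As a rough sketch of the ill-posedness described above, in my own notation (the talk's exact formulation may differ): writing Y for the long-term outcome, S for the short-term surrogates, and Z_e for the randomized assignment in past experiment e, each experiment pins down the surrogate index h only through a conditional moment restriction, while the target is a linear functional of h:

```latex
\[
  \mathbb{E}\left[\, Y - h(S) \,\middle|\, Z_e \,\right] = 0
  \quad \text{for every past experiment } e,
\]
\[
  \tau(h) = \mathbb{E}\left[\, h(S) \mid Z = 1 \,\right]
          - \mathbb{E}\left[\, h(S) \mid Z = 0 \,\right].
\]
```

The system of restrictions generally admits many solutions h, yet the functional τ(h) (the average long-term effect) can still be debiased and estimated, which is the point of the approach.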
Raaz Dwivedi (Cornell Tech & Traversal)
AI Agents for Root Cause Analysis
Modern cloud-native applications have significantly grown in scale and complexity, increasing their susceptibility to outages and impacting reliability. In parallel, substantial advancements have improved the observability of these systems. Today, applications emit vast amounts of heterogeneous observability data—logs, traces, metrics, topology graphs, and event streams—that are captured across fragmented observability platforms like Datadog, Elastic, Grafana, and ServiceNow. When outages occur, on-call engineers face the daunting task of manually navigating through billions of logs and countless dashboards to piece together clues and pinpoint root causes. Delays in resolution erode customer trust, while faster remediation demands repeated heroic efforts, draining engineers' productivity and morale.
This talk introduces AI-Site Reliability Engineer (AI-SRE), an ambient multi-agent system developed at Traversal and designed to autonomously diagnose and remediate complex production outages. Leveraging semantic and statistical signals (the golden signals), AI-SRE harnesses large language models and causal inference techniques to efficiently identify root causes within fragmented observability data streams.
Hannah Li (Columbia)
Randomized Controlled Trials of Service Interventions: The Impact of Capacity Constraints
Randomized controlled trials (RCTs), or experiments, are the gold standard for intervention evaluation. However, the main appeal of RCTs, the clean identification of causal effects, can be compromised by interference, where one subject's treatment assignment influences another subject's behavior or outcomes. In this paper, we formalize and study a type of interference stemming from the operational implementation of a subclass of interventions we term Service Interventions (SIs): interventions that include an on-demand service component provided by a costly and limited resource (e.g., healthcare providers or teachers).
We show that in such a system, the capacity constraints induce dependencies across experiment subjects, where an individual may need to wait before receiving the intervention. By modeling these dependencies using a queueing system, we show how increasing the number of subjects without increasing the capacity of the system can lead to a nonlinear decrease in the treatment effect size. This has implications for conventional power analysis and recruitment strategies: increasing the sample size of an RCT without appropriately expanding capacity can decrease the study's power. To address this issue, we propose a method to jointly select the system capacity and number of users using the square root staffing rule from queueing theory. In addition, our analysis of congestion-driven interference provides one concrete mechanism to explain why similar protocols can result in different RCT outcomes and why promising interventions at the RCT stage may not perform well at scale.
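For intuition, here is a minimal sketch of the square root staffing rule referenced above, with illustrative parameters (the arrival and service rates are hypothetical, not from the paper): with arrival rate λ and service rate μ per provider, the offered load is R = λ/μ and capacity is set to R plus a safety term proportional to √R.

```python
import math

def sqrt_staffing(arrival_rate, service_rate, beta=1.0):
    """Square-root staffing: capacity = offered load + beta * sqrt(offered load)."""
    offered_load = arrival_rate / service_rate
    return math.ceil(offered_load + beta * math.sqrt(offered_load))

# Hypothetical RCT: scaling up recruitment without expanding capacity
# congests the treatment arm; the rule says how capacity should grow.
for subjects_per_week in (50, 100, 200):
    capacity = sqrt_staffing(arrival_rate=subjects_per_week,
                             service_rate=10)  # sessions per provider per week
    print(subjects_per_week, "subjects/week ->", capacity, "providers")
```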
Dennis Shen (USC)
The Ordinary Least Squares Interpolator for Econometrics
Deep learning research has uncovered the benign overfitting phenomenon for overparameterized statistical models, which has drawn significant interest in recent years. Given its simplicity and practicality, the ordinary least squares (OLS) interpolator has become essential to gain foundational insights into this phenomenon. While the properties of OLS are well established in underparameterized settings, its behavior in overparameterized regimes is less explored (unlike for ridge or lasso). However, significant progress has been made of late. We contribute to this literature by providing fundamental algebraic and statistical results for the minimum ℓ2-norm OLS interpolator. Whereas recent research primarily focuses on the OLS interpolator’s prediction accuracy, our results primarily focus on parameter estimation, which is often the central object in statistics, especially in causal inference, as these parameters are endowed with causal meanings. In particular, we provide algebraic equivalents of (i) the leave-k-out formula, (ii) the omitted variable bias formula, and (iii) the Frisch-Waugh-Lovell theorem in the overparameterized regime. Under the Gauss-Markov model, we present statistical results such as an extension of the Gauss-Markov theorem and an analysis of variance estimation under homoskedasticity for the overparameterized regime.
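As a quick, self-contained illustration (my sketch, not code from the paper): in the overparameterized regime (p > n), the minimum ℓ2-norm OLS interpolator is given by the Moore-Penrose pseudoinverse, and it fits the training data exactly.

```python
import numpy as np

rng = np.random.default_rng(0)

# Overparameterized regime: more covariates (p) than samples (n).
n, p = 20, 100
X = rng.normal(size=(n, p))
beta = rng.normal(size=p)
y = X @ beta + 0.1 * rng.normal(size=n)

# Minimum l2-norm OLS interpolator: beta_hat = X^T (X X^T)^{-1} y = pinv(X) @ y.
beta_hat = np.linalg.pinv(X) @ y

print("max interpolation error:", np.abs(X @ beta_hat - y).max())  # ~0: interpolates
print("norm of interpolator:", np.linalg.norm(beta_hat))           # smallest among all solutions
```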
Christina Lee Yu (Cornell)
Considerations in Designing Experiments under Network Interference
Estimating causal effects under network interference is relevant to many real-world settings but technically challenging due to correlations in the measured outcomes. In this talk we give an overview of some key considerations that arise when designing a randomized experiment under network interference. First, there is a natural tension between solutions that rely heavily on outcome modeling and less on the structure of the interference network, and solutions that rely on network-specific randomized designs to reduce dependence on outcome modeling. Second, when it comes to optimizing the randomized design, we show that there may be multiple objectives in tension due to contributions to the MSE from the outcome model and the interference effects. We illustrate this in a specific example involving two-stage cluster randomization in a staggered rollout setting. We show that bias increases with the number of edges cut in the clustering of the interference network, but variance depends on qualities of the clustering that relate to homophily and covariate balance, in addition to the traditional objective of edges cut.
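To make the edges-cut quantity concrete, here is a toy sketch (my construction, not the talk's; graph, clustering, and saturations are all hypothetical): generate a random interference graph, assign units to clusters, count the edges cut by the clustering (the quantity the bias grows with), and form a simple two-stage randomized-saturation assignment.

```python
import numpy as np

rng = np.random.default_rng(0)

# Random interference graph over n units (Erdos-Renyi, hypothetical).
n, p_edge, n_clusters = 200, 0.05, 10
adj = rng.random((n, n)) < p_edge
adj = np.triu(adj, k=1)  # keep each undirected edge once

# Assign units to clusters uniformly at random (a real design would
# cluster the graph to minimize cut edges while balancing covariates).
cluster = rng.integers(0, n_clusters, size=n)

# Count edges whose endpoints land in different clusters ("edges cut").
i, j = np.nonzero(adj)
cut = (cluster[i] != cluster[j]).sum()
print(f"{cut} of {adj.sum()} edges cut ({cut / adj.sum():.1%})")

# Two-stage design: each cluster draws a treatment saturation, then
# units within the cluster are treated independently at that saturation.
saturations = rng.choice([0.2, 0.8], size=n_clusters)
z = (rng.random(n) < saturations[cluster]).astype(int)
print("overall treated fraction:", z.mean())
```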
Paper and poster submissions: March 31st, 2025
Author notification: April 15th, 2025
Workshop date: June 13, 2025 (full day)