This one-day workshop will feature recent advances in causal inference at the intersection of statistics, operations research, and engineering. The workshop aims to combine theory and applications and to bring together researchers from academia and industry. The event is organized to promote meaningful interactions and discussions and to foster interdisciplinary collaboration through a mix of talks, poster presentations, and networking breaks.
As part of the workshop, we will accept short papers broadly at the intersection of causal inference, engineering, and operations research. The track is non-archival, and accepted papers will be presented as posters along with short spotlight talks. Additionally, the top three papers will be put on a fast-track submission to a top journal [authors will be provided options]. Papers should be submitted via the following Google form. Authors are asked to submit extended abstracts that are at most 8 pages in length. Please use the standard PER format.
If there are any questions, please contact dennis [dot] shen [at] marshall [dot] usc [dot] edu.
9:00-9:15am: Opening remarks
9:15-10:00am: Spotlight talks
10:00-10:30am: Coffee break (Idea Hub)
10:30am-12:00pm: Speaker session
Angela Zhou: Structured Offline RL via Reward Filtering and Orthogonal Q-Contrasts
Kyra Gan: Causal Deep RL: Activating Minimal Markovian Representation via Multi-Order State Exposure
12:00pm-1:30pm: Lunch break (Rogel Ballroom)
1:30-3:00pm: Speaker session
George Chen: Measuring the Impact of Medication Non-adherence on When Adverse Outcomes Happen in Schizophrenia
Sarah Cen: Large-Scale, Longitudinal Study of Large Language Models During the 2024 US Election Season
3:00-3:30pm: Coffee break (Idea Hub)
3:30-5:00pm: Speaker session
Dogyoon Song: Regression Adjustment in High-Dimensions: A Design-Based Finite-Sample View
Colin Fogarty: Sample Splitting and Two-player Games in Observational Studies with Hidden Bias
Sarah Cen (CMU)
The 2024 US presidential election is the first major contest to occur in the US since the popularization of large language models (LLMs). Building on lessons from earlier shifts in media (most notably social media's well-studied role in targeted messaging and political polarization), this moment raises urgent questions about how LLMs may shape the information ecosystem and influence political discourse. While platforms have announced some election safeguards, how well they work in practice remains unclear. Against this backdrop, we conduct a large-scale, longitudinal study of 12 models, queried using a structured survey with over 12,000 questions on a near-daily cadence from July through November 2024. Our design systematically varies content and format, resulting in a rich dataset that enables analyses of the models' behavior over time (e.g., across model updates), sensitivity to steering, responsiveness to instructions, and election-related knowledge and "beliefs." In the latter half of our work, we perform four analyses of the dataset that (i) study the longitudinal variation of model behavior during election season, (ii) illustrate the sensitivity of election-related responses to demographic steering, (iii) interrogate the models' beliefs about candidates' attributes, and (iv) reveal the models' implicit predictions of the election outcome. To facilitate future evaluations of LLMs in electoral contexts, we detail our methodology, from question generation to the querying pipeline and third-party tooling.
George Chen (CMU)
This study quantifies the association between non-adherence to medications and adverse outcomes in schizophrenia patients. We frame the problem using survival analysis, focusing on the time to the earliest of several adverse events (early death, involuntary hospitalization, jail stay). We combine standard causal inference methods (T-learner, S-learner, nearest neighbor matching) with various survival models to estimate individual and average treatment effects, where treatment corresponds to medication non-adherence. Analyses are repeated using different amounts of longitudinal information (3-12 months). Using data from Allegheny County, PA, we find strong evidence that non-adherence advances adverse outcomes by approximately 1 to 4 months. Ablation studies confirm that county-provided risk scores adjust for key confounders, as their removal amplifies the estimated effects. Subgroup analyses by medication formulation (injectable vs. oral) and medication type consistently show that non-adherence is associated with earlier adverse events. We caution that although we apply causal inference, we only make associative claims and discuss assumptions needed for causal interpretation.
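The T-learner mentioned in the abstract fits a separate outcome model on each treatment arm and contrasts the two models' predictions. A minimal sketch on synthetic data (linear outcome models stand in for the talk's survival models; the covariates, coefficients, and the true effect of -2 months are all made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=(n, 3))                  # hypothetical covariates
t = rng.binomial(1, 0.5, size=n)             # "treatment" = non-adherence (synthetic)
# synthetic time-to-adverse-event (months): non-adherence shortens it by 2 months
y = 12 + x @ np.array([1.0, -0.5, 0.3]) - 2.0 * t + rng.normal(scale=0.5, size=n)

def fit_linear(X, y):
    """Least-squares fit with an intercept; returns the coefficient vector."""
    Xb = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(Xb, y, rcond=None)
    return beta

def predict(beta, X):
    return np.column_stack([np.ones(len(X)), X]) @ beta

# T-learner: one outcome model per arm, then contrast predictions on everyone.
beta1 = fit_linear(x[t == 1], y[t == 1])     # treated-arm model
beta0 = fit_linear(x[t == 0], y[t == 0])     # control-arm model
cate = predict(beta1, x) - predict(beta0, x) # individual treatment effects
ate = cate.mean()                            # average treatment effect
print(round(ate, 1))                         # close to the true effect of -2.0
```

The S-learner differs only in fitting a single model with treatment as a feature; matching replaces the regressions with nearest-neighbor comparisons.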
Colin Fogarty (UMich Ann Arbor)
In observational studies, design decisions are known to strongly impact the reported robustness of a study’s findings to unmeasured confounding. As the optimal choices depend on properties of the data themselves, splitting one’s data into planning and analysis samples appears particularly appealing: one can use a planning sample to inform the subsequent analysis of the observational study, targeting choices which improve performance in a sensitivity analysis. When viewed through the lens of a two-player game, however, sample splitting may put the practitioner at a disadvantage relative to approaches which use the whole data to inform design choices: the practitioner plays first, making decisions using the planning sample, and then imagines nature’s worst-case response to that decision in the analysis sample, whereas in reality the hidden bias has been realized before the practitioner analyzes the data. We characterize decision sets under which sample splitting is innocuous in terms of the limiting power of a sensitivity analysis. We provide a novel minimax theorem, while highlighting the potential breakdown when our theorem’s conditions are violated. We apply our method to investigate the effects of poverty on the emergence of cardiovascular disease risk factors in children and adolescents. We discover adverse consequences on outcomes related to body composition, physical activity, and tobacco exposure.
Kyra Gan (Cornell Tech)
Reinforcement learning (RL) relies on the Markov property for guaranteed performance, but real-world applications often lack well-defined states given raw observed variables. While causal RL has attracted growing interest, existing work typically assumes Markovian states are already given and focuses on using causality to accelerate learning, leaving a fundamental gap: given a longitudinal causal graph over observed variables, how does one construct MDP states that provably satisfy the Markov property? We address this by providing a procedure that constructs a minimal state representation and proving its correctness. The significance of this construction, however, depends on the learning setting. In deep RL, we observe that the minimal representation alone empirically fails to improve performance, indicating that neural networks cannot directly exploit Markovian minimality. To address this, we propose MOSE (Multi-Order State Exposure), which feeds multi-order historical state constructions (orders 1 through $W$) into the same Q-function. MOSE consistently outperforms both the minimal state construction and single-window policies across common benchmarks and synthetic datasets. Adding the minimal representation in MOSE can further improve performance. Our results establish a core principle for causal deep RL: minimal sufficiency is not enough; controlled redundancy is necessary to unlock the benefit of causal state information.
Dogyoon Song (UC Davis)
Regression adjustment is a classical way to improve precision in randomized experiments, but its finite-sample behavior is poorly understood when covariates are high-dimensional and p may exceed n. This talk presents a design-based, non-asymptotic analysis of regression-adjusted average treatment effect estimation under complete randomization. The framework yields oracle confidence intervals with finite-sample validity and explicit, instance-adaptive widths, without requiring a correctly specified outcome model and while allowing p>n. The key idea is a swap sensitivity analysis that separates stochastic fluctuation from design bias: the former is controlled by a variance-adaptive martingale argument and Freedman’s inequality, while the latter is bounded using Stein’s method of exchangeable pairs. The resulting bounds make explicit how covariate geometry affects concentration, bias, and the usefulness of adjustment. Time permitting, I will also discuss ongoing work to derive data-driven confidence envelopes and broader prospects for design-based concentration methods in causal inference.
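The regression-adjusted estimator discussed in the abstract can be illustrated in the classical low-dimensional case (a Lin-style fully interacted regression with a centered covariate; the data-generating numbers below, including a true average treatment effect of 1.0, are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1000
x = rng.normal(size=n)                       # a single hypothetical covariate
y0 = 2.0 + 1.5 * x + rng.normal(scale=0.3, size=n)  # control potential outcomes
y1 = y0 + 1.0                                # constant treatment effect: true ATE = 1.0
t = np.zeros(n, dtype=int)
t[rng.permutation(n)[: n // 2]] = 1          # complete randomization: exactly n/2 treated
y = np.where(t == 1, y1, y0)                 # observed outcomes

# Unadjusted difference-in-means estimator
dim = y[t == 1].mean() - y[t == 0].mean()

# Regression adjustment: regress y on treatment, the centered covariate, and
# their interaction; with a centered covariate, the coefficient on treatment
# is the covariate-adjusted ATE estimate.
xc = x - x.mean()
X = np.column_stack([np.ones(n), t, xc, t * xc])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
adj = beta[1]

print(round(dim, 2), round(adj, 2))          # both near 1.0; adj is more precise
```

Here the covariate explains most of the outcome variance, so adjustment shrinks the estimator's noise substantially; the talk's design-based analysis asks what can still be guaranteed when the number of covariates grows and may exceed n.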
Angela Zhou (USC)
We study offline reinforcement learning under structural conditions where the dynamics may depend on many state variables, but optimal decisions depend only on a sparse, reward-relevant subset of the state. This “decision-theoretic sparsity” means that optimal policy and value functions admit lower-dimensional structure, although full-state transition estimation can be difficult. First, we develop a reward-relevance-filtered approach for linear function approximation that modifies thresholded Lasso within least-squares policy evaluation and fitted Q-iteration to focus estimation on reward-relevant components. Second, to improve robustness, we propose a structured difference-of-Q framework via orthogonal learning: a dynamic generalization of R-learning that targets Q-function contrasts sufficient for policy optimization, accommodates black-box nuisance estimators of Q and the behavior policy, and yields robust policy optimization guarantees under a margin condition. Together, these methods formalize and exploit reward-relevant structure to improve statistical efficiency and robustness in offline RL.
Paper and poster submissions: May 8, 2026
Author notification: May 15, 2026
Workshop date: June 12, 2026 (full day)