Invited Speakers

The workshop will feature a great line-up of invited talks by:

  • Elias Bareinboim: Professor of Computer Science at Columbia University. His research focuses on causal inference, especially data combination and the intersection of causal inference and reinforcement learning.

  • Mark van der Laan: Jiann-Ping Hsu/Karl E. Peace Professor of Biostatistics and Statistics at the University of California, Berkeley. His research areas are survival analysis, semiparametric statistics, and causal inference.

  • Claire Vernade: Research Scientist at DeepMind in London. Her research areas are bandit algorithms and reinforcement learning.

  • Razieh Nabi: Rollins Assistant Professor of Biostatistics and Bioinformatics at Emory University. Her research interests include causal inference, algorithmic fairness, missing data, and graphical models.

  • Rui Song: Professor of Statistics at North Carolina State University. Her interests include causal inference, precision health, and financial economics.

  • Susan Athey: Economics of Technology Professor at Stanford Graduate School of Business. Her research is in the areas of industrial organization and econometrics. Her current research focuses on the design of auction-based marketplaces and the economics of the internet, primarily on online advertising.


Line-up of Talks


  • Sequential Adaptive Designs for Learning Optimal Individualized Treatment Rules with Formal Inference (Mark van der Laan)

In this talk we review several of our contributions on sequential adaptive designs.

Firstly, we review sequential targeted adaptive designs in which one adapts to complete data records of previously enrolled subjects, thereby relying on short-term clinical outcomes. In particular, we show how TMLE and online super-learning can be used to preserve unbiased formal inference in such adaptive designs. We demonstrate the power of this type of design for optimizing treatment for sepsis patients.

Secondly, we discuss the natural extension of these sequential adaptive designs to continuous time, in which case one adapts to all previously collected data, including the many partially observed data structures of previously enrolled subjects. In this setting, we emphasize the use of time-specific surrogate outcomes as a way to retain the power to learn optimal rules, while adapting the choice of surrogate over time so as to best approximate the optimal treatment rule with respect to the final clinical outcome. We show simulation results demonstrating the performance of such adaptive designs in continuous time.

Finally, we present sequential adaptive designs for a single time series in which one learns the optimal rule for maximizing the next short-term outcome, again using TMLE and online super-learning to allow for formal statistical inference.

We show how all formal results rely on the theory of TMLE and martingale processes.

This is joint work with Antoine Chambaz, Wenjing Zheng, Ivana Malenica, Aurélien Bibaut, Aaron Hudson, and Wenxin Zhang.
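To fix ideas, the batch-adaptive loop at the heart of these designs can be sketched in a deliberately stripped-down form. In the sketch below, a one-parameter linear working model stands in for TMLE and online super-learning, the outcome model (true effect beta = 1) is invented for illustration, and epsilon-greedy randomization keeps every treatment probability bounded away from zero, which the martingale-based inference requires:

```python
import numpy as np

rng = np.random.default_rng(1)

# Stripped-down batch-adaptive design. A one-parameter working model,
# E[Y | x, a] = beta * a * x, stands in for TMLE / online super-learning;
# the true outcome model (beta = 1) is invented for illustration.
eps = 0.2          # exploration rate: keeps treatment probabilities positive
beta_hat = 0.0
X, A, Y = [], [], []
for batch in range(20):
    x = rng.normal(size=25)                        # new batch of subjects
    rule = (beta_hat * x > 0).astype(float)        # current estimated rule
    explore = rng.random(25) < eps
    a = np.where(explore, rng.integers(0, 2, 25), rule).astype(float)
    y = a * x + rng.normal(scale=0.5, size=25)     # observed outcomes
    X.append(x); A.append(a); Y.append(y)
    xs, as_, ys = (np.concatenate(v) for v in (X, A, Y))
    z = as_ * xs                                   # refit the working model
    beta_hat = float(z @ ys / (z @ z)) if z @ z > 0 else 0.0
```

After the last batch the estimated rule treats roughly those subjects with x > 0, matching the truth; in the designs discussed in the talk, TMLE replaces the naive refit, and the martingale structure of the adaptively collected data is what licenses valid confidence intervals despite the adaptivity.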

  • Confident Off-Policy Evaluation and Selection through Self-Normalized Importance Weighting (Claire Vernade)

We consider off-policy evaluation in the contextual bandit setting for the purpose of obtaining a robust off-policy selection strategy, where the selection strategy is evaluated based on the value of the chosen policy in a set of proposal (target) policies. We propose a new method to compute a lower bound on the value of an arbitrary target policy given some logged data in contextual bandits for a desired coverage. The lower bound is built around the so-called Self-normalized Importance Weighting (SN) estimator. It combines the use of a semi-empirical Efron-Stein tail inequality to control the concentration and Harris' inequality to control the bias. The new approach is evaluated on a number of synthetic and real datasets and is found to be superior to its main competitors, both in terms of tightness of the confidence intervals and the quality of the policies chosen.
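For readers unfamiliar with the SN estimator that the lower bound is built around, a minimal sketch of the point estimator itself (not the confidence bound, which requires the Efron-Stein and Harris arguments above) looks like this; the function name and array layout are our own:

```python
import numpy as np

def sn_estimate(rewards, logged_probs, target_probs):
    """Self-normalized importance weighting (SN) estimate of a target
    policy's value from logged contextual-bandit data.

    rewards[i]      : reward observed for the logged action a_i
    logged_probs[i] : probability the logging policy gave to a_i (given x_i)
    target_probs[i] : probability the target policy gives to a_i (given x_i)
    """
    w = np.asarray(target_probs, float) / np.asarray(logged_probs, float)
    # Dividing by the sum of the weights (rather than by n, as ordinary
    # importance weighting does) introduces a small bias but greatly
    # reduces variance when the weights are heavy-tailed.
    return float(np.sum(w * np.asarray(rewards, float)) / np.sum(w))
```

Note that the estimate is invariant to rescaling all the weights, which is part of why the self-normalized form behaves well when the logging and target policies differ sharply.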

  • (Un)fairness in Sequential Decision Making as a Challenge (Razieh Nabi)

In this talk, we focus on automated sequential decision making, which is increasingly used in socially impactful settings such as social welfare, hiring, and criminal justice. A particular challenge in the context of automated sequential decision making is to avoid the “perpetuation of injustice,” i.e., when maximizing utility maintains, reinforces, or even introduces unfair dependencies between sensitive features, decisions, and outcomes. In this talk, we show how to use methods from causal inference and constrained optimization to make optimal but fair decisions that would “break the cycle of injustice” by correcting for the unfair dependence of both decisions and outcomes on sensitive features.
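The constrained-optimization idea can be illustrated with a toy example. Everything below is invented for illustration: the talk's actual method constrains causal path-specific effects, whereas this sketch uses a cruder stand-in, maximizing expected utility over unit-norm linear decision scores subject to the score being empirically uncorrelated with the sensitive feature, which has a closed-form solution by projection:

```python
import numpy as np

rng = np.random.default_rng(0)

# Invented toy data: sensitive feature s, legitimate feature x, and a
# utility u that rewards decisions correlated with both.
n = 2000
s = rng.integers(0, 2, n).astype(float)
x = rng.normal(size=n)
u = x + 0.8 * s

def cov0(a, b):
    return float(((a - a.mean()) * (b - b.mean())).mean())

# Decision score d = F @ theta with features F = (x, s). Expected utility
# mean(d * u) is linear in theta, so over unit-norm theta the
# unconstrained optimum points along the "utility gradient" g.
F = np.column_stack([x, s])
g = F.T @ u / n
theta_unfair = g / np.linalg.norm(g)

# The fairness constraint cov(d, s) = 0 is also linear in theta
# (it equals theta @ c), so the constrained optimum is the projection
# of g onto the subspace orthogonal to c, renormalized.
c = np.array([cov0(x, s), cov0(s, s)])
g_proj = g - (g @ c) / (c @ c) * c
theta_fair = g_proj / np.linalg.norm(g_proj)

def utility(theta):
    return float((F @ theta) @ u / n)
```

The fair score is exactly uncorrelated with s by construction and gives up some utility relative to the unconstrained optimum, which leans on the sensitive feature; the talk's causal formulation makes the analogous trade-off against path-specific effects rather than a raw covariance.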

  • Off-Policy Confidence Interval Estimation with Confounded Markov Decision Process (Rui Song)

In this talk, we consider constructing a confidence interval for a target policy’s value offline based on pre-collected observational data in infinite horizon settings. Most of the existing works assume no unmeasured variables exist that confound the observed actions. This assumption, however, is likely to be violated in real applications such as healthcare and technological industries. We show that with some auxiliary variables that mediate the effect of actions on the system dynamics, the target policy’s value is identifiable in a confounded Markov decision process. Based on this result, we develop an efficient off-policy value estimator that is robust to potential model misspecification and provides rigorous uncertainty quantification.
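The role of action-mediating auxiliary variables can be illustrated in the simplest static analogue, the classic front-door model: an unmeasured U confounds action A and outcome Y, but a mediator M carries the entire effect of A on Y, so the interventional mean is recoverable from observational quantities alone. All probabilities below are invented; the sketch checks the front-door formula against direct intervention in the toy model:

```python
# Classic static front-door model: unmeasured U confounds A and Y, while
# M mediates the entire effect of A on Y. All probabilities are invented.
P_U = {0: 0.5, 1: 0.5}                       # P(U = u)
P_A1_given_U = {0: 0.2, 1: 0.8}              # P(A = 1 | u): U confounds A
P_M1_given_A = {0: 0.3, 1: 0.7}              # P(M = 1 | a): M depends on A only
def p_y1_mu(m, u):                           # P(Y = 1 | m, u): U confounds Y
    return 0.2 + 0.5 * m + 0.2 * u

# Observational quantities implied by the model.
P_A1 = sum(P_U[u] * P_A1_given_U[u] for u in (0, 1))
P_A = {1: P_A1, 0: 1.0 - P_A1}

def p_u_given_a(u, a):                       # Bayes (M independent of U given A)
    pa_u = P_A1_given_U[u] if a == 1 else 1.0 - P_A1_given_U[u]
    return pa_u * P_U[u] / P_A[a]

def p_y1_ma(m, a):                           # observational P(Y = 1 | m, a)
    return sum(p_u_given_a(u, a) * p_y1_mu(m, u) for u in (0, 1))

def p_m_given_a(m, a):
    return P_M1_given_A[a] if m == 1 else 1.0 - P_M1_given_A[a]

def front_door(a):
    # sum_m P(m | a) * sum_{a'} P(a') * P(Y = 1 | m, a')
    return sum(p_m_given_a(m, a)
               * sum(P_A[ap] * p_y1_ma(m, ap) for ap in (0, 1))
               for m in (0, 1))

def truth(a):
    # E[Y | do(A = a)] computed by intervening in the structural model.
    return sum(p_m_given_a(m, a)
               * sum(P_U[u] * p_y1_mu(m, u) for u in (0, 1))
               for m in (0, 1))
```

Here front_door(a) and truth(a) agree exactly, even though a naive comparison of Y across observed values of A would be biased by U; the identifiability result in the talk plays the analogous role for mediators of the action in the sequential, infinite-horizon setting.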