2024 IMS International Conference on Statistics and Data Science (ICSDS)
December 16-19, 2024, Nice, France
Algorithmic Stability for Regression and Classification
Rina Foygel Barber University of Chicago
Monday, December 16th, 2024
Abstract:
In a supervised learning setting, a model fitting algorithm is unstable if small perturbations to the input (the training data) can often lead to large perturbations in the output (say, predictions returned by the fitted model). Algorithmic stability is a desirable property with many important implications such as generalization and robustness, but testing the stability property empirically is known to be impossible in the setting of complex black-box models. In this work, we establish that bagging any black-box regression algorithm automatically ensures that stability holds, with no assumptions on the algorithm or the data. Furthermore, we construct a new framework for defining stability in the context of classification, and show that using bagging to estimate our uncertainty about the output label will again allow stability guarantees for any black-box model. This work is joint with Jake Soloff and Rebecca Willett.
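To make the bagging mechanism concrete, here is a minimal Python sketch (not code from the talk or the underlying paper) of subbagging an arbitrary black-box regression procedure: the returned prediction averages models refit on random subsamples of the training data, which is the averaging step the abstract credits with guaranteeing stability. The function names, subsample fraction, and number of bags are illustrative assumptions.

```python
import numpy as np

def bagged_predict(fit, X_train, y_train, X_test, n_bags=100, subsample=0.5, seed=None):
    """Average the predictions of a black-box regression algorithm over random subsamples.

    `fit(X, y)` may be any procedure returning a callable model; all defaults here
    are illustrative choices, not parameters prescribed by the talk.
    """
    rng = np.random.default_rng(seed)
    n = len(y_train)
    m = max(1, int(subsample * n))                   # size of each subsample
    preds = np.zeros(len(X_test))
    for _ in range(n_bags):
        idx = rng.choice(n, size=m, replace=False)   # draw a random subsample
        model = fit(X_train[idx], y_train[idx])      # refit the black box
        preds += model(X_test)                       # accumulate its predictions
    return preds / n_bags                            # bagged (averaged) prediction

# Usage sketch: ordinary least squares as the black box.
fit_ols = lambda X, y: (lambda X_new: X_new @ np.linalg.lstsq(X, y, rcond=None)[0])
```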
Short Bio:

Rina Foygel Barber is the Louis Block Professor of Statistics at the University of Chicago, where she has been on the faculty since January 2014. Prior to joining the faculty, she was an NSF postdoctoral fellow at Stanford University advised by Emmanuel Candès, and she received her PhD in Statistics from the University of Chicago in 2012, advised by Mathias Drton and Nathan Srebro. Rina's research focuses on developing theory and methodology for statistical problems in challenging modern settings, including distribution-free inference, high-dimensional multiple testing, and sparse and low-rank estimation, as well as nonconvex optimization with applications in medical imaging. Her research has been recognized by awards including the COPSS Presidents' Award (2020), the Peter Gavin Hall IMS Early Career Prize (2020), the IMS Medallion Lecture and Award (2022), and a MacArthur Fellowship (2023). She was elected as a Fellow of the Institute of Mathematical Statistics (IMS) in 2023.
Outcome Indistinguishability and its Diverse Applications
Cynthia Dwork Harvard University
Tuesday, December 17th, 2024
Abstract:
Outcome Indistinguishability, a notion from algorithmic fairness with roots in complexity theory, frames learning not as loss minimization (the dominant paradigm in supervised machine learning) but as the satisfaction of a collection of “indistinguishability” constraints. Outcome Indistinguishability considers two alternate worlds of individual-outcome pairs: in the natural world, individuals’ outcomes are generated by Real-Life’s true distribution; in the simulated world, individuals’ outcomes are sampled according to a predictive model. Outcome Indistinguishability requires the learner to produce a predictor under which the two worlds are computationally indistinguishable. The notion has provided a generous springboard, first and foremost in machine learning, but also in complexity theory.
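As background for readers new to the notion, the requirement can be written informally as follows; the notation is a standard formulation from the literature rather than anything quoted from the talk.

```latex
% Two worlds over individual-outcome pairs (x, y), given a learned predictor \tilde{p}:
%   natural world:   y^{*}     \sim \mathrm{Bernoulli}(p^{*}(x))      (Real-Life's true distribution)
%   simulated world: \tilde{y} \sim \mathrm{Bernoulli}(\tilde{p}(x))  (the predictive model)
% Outcome Indistinguishability asks that every distinguisher A from a fixed class
% \mathcal{A} accept the two worlds with nearly the same probability:
\[
  \bigl|\,\Pr[A(x, y^{*}) = 1] - \Pr[A(x, \tilde{y}) = 1]\,\bigr| \le \varepsilon
  \qquad \text{for all } A \in \mathcal{A}.
\]
```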
Short Bio:
Cynthia Dwork, Gordon McKay Professor of Computer Science at Harvard and Affiliated Faculty at Harvard Law School and the Department of Statistics, is renowned for placing privacy-preserving data analysis on a mathematically rigorous foundation. She has also made seminal contributions in cryptography and distributed computing, and she spearheaded the investigation of the theory of algorithmic fairness. Dwork is the recipient of numerous awards, including the IEEE Hamming Medal, the RSA Award for Excellence in Mathematics, the Dijkstra, Gödel, and Knuth Prizes, and the ACM Paris Kanellakis Theory and Practice Award. She is a member of the US National Academy of Sciences and the US National Academy of Engineering, and a Fellow of the American Academy of Arts and Sciences and the American Philosophical Society.
Challenges with Covariate Shift: From Prediction to Causal Inference
Martin Wainwright Massachusetts Institute of Technology
Wednesday, December 18th, 2024
Abstract:
In many modern uses of predictive methods, the distribution of the training data can differ from that of the test data. Such mismatches can cause dramatic reductions in accuracy that remain poorly understood. How can we find practical procedures that mitigate such effects in an optimal way? In this talk, we discuss the fundamental limits of problems with covariate shift, along with simple procedures that achieve these fundamental limits. The talk covers the challenges of covariate shift both in non-parametric regression and in semi-parametric problems that arise from causal inference and off-policy evaluation.
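For context, covariate shift in its standard formulation (a textbook definition, not a summary of the talk's results) means that the covariate distribution changes between training and test while the conditional law of the response is shared:

```latex
% Covariate shift: the marginal distribution of the covariates differs between
% training and test, but the regression relationship is unchanged.
\[
  P_{\mathrm{train}}(X) \neq P_{\mathrm{test}}(X),
  \qquad
  P_{\mathrm{train}}(Y \mid X) = P_{\mathrm{test}}(Y \mid X).
\]
```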
Short Bio:
Martin Wainwright is the Cecil H. Green Professor in Electrical Engineering and Computer Science and in Mathematics at MIT, affiliated with the Laboratory for Information and Decision Systems and the Statistics and Data Science Center. He is broadly interested in statistics, machine learning, information theory, and algorithms. His awards and recognition include a John Simon Guggenheim Fellowship, an Alfred P. Sloan Foundation Fellowship, the COPSS Presidents' Award, a Section Lecture at the International Congress of Mathematicians in 2014, and the Blackwell Lectureship and Award from the Institute of Mathematical Statistics in 2017. He has co-authored several books, including one on graphical models with Michael Jordan and one on sparse statistical modeling with Trevor Hastie and Rob Tibshirani, as well as a solo-authored book on high-dimensional statistics.
Perturbation Data Science
Peter Bühlmann ETH Zürich
Thursday, December 19th, 2024
Abstract:
'Perturbation Data Science' refers to the development of data science methods and algorithms that leverage the effects of perturbations, often unspecific ones, within data. In this presentation, we will focus on a key aspect of this framework, exploring the links between invariance learning, robustness, and causality. We will highlight how these concepts have been applied to medical domain adaptation and discuss their potential, along with initial results, in drug combination discovery.
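One standard way to make the invariance-causality link precise, offered here only as background and not as a statement of the talk's results, comes from the invariant causal prediction literature:

```latex
% If S^{*} indexes the direct causes of the response Y, then the conditional
% distribution of Y given X_{S^{*}} is the same under every environment or
% perturbation regime e:
\[
  P^{e}\bigl(Y \mid X_{S^{*}}\bigr) = P^{e'}\bigl(Y \mid X_{S^{*}}\bigr)
  \qquad \text{for all environments } e, e'.
\]
```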
Short Bio:
Peter Bühlmann is Professor of Mathematics and Statistics and Director of Foundations of Data Science at ETH Zürich. He received his Ph.D. from ETH Zürich in 1993 and, after spending three years as a postdoctoral fellow and Neyman Assistant Professor at UC Berkeley, returned to ETH Zürich as a faculty member in 1997. His research interests include high-dimensional statistics, causality, and interdisciplinary applications in the biomedical sciences. He is a Fellow of the Institute of Mathematical Statistics (IMS), serving as IMS President in 2022-2023, and a Fellow of the American Statistical Association, and he was Co-Editor of the Annals of Statistics from 2010 to 2012. He received a Doctor Honoris Causa from the Université Catholique de Louvain in 2017, the Neyman Lectureship and Award in 2018 and the Wald Lectureship and Award in 2024 from the Institute of Mathematical Statistics, and the Guy Medal in Silver from the Royal Statistical Society in 2018, and he has been an elected Member of the German National Academy of Sciences Leopoldina since 2022.