Past Talks - Spring 2022

May 11, 2022: Sébastien Bubeck (Microsoft Research):
"Set Chasing, with an application to online shortest path"

Abstract: Since the late 19th century, mathematicians have realized the importance and generality of selection problems: given a collection of sets, select an element in each set, possibly in a "nice" way. Of particular importance in computer science is the scenario where the ground set is a metric space, in which case it is natural to ask for a Lipschitz selection. In this talk I will describe a far-reaching extension of this classical Lipschitz selection problem to an online setting, where sets are streaming to the selector. I will show how Riemannian gradient descent (a.k.a. mirror descent) can be used to approach problems of this type. I will illustrate the power of the framework by solving a long-standing problem in online shortest path known as layered graph traversal (introduced by Papadimitriou and Yannakakis in 1989).
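As a toy illustration of the selection-with-movement-cost viewpoint, here is a minimal Python sketch of greedy Euclidean set chasing: project the current point onto each arriving set and pay the distance moved. The halfspace stream is made up, and the greedy projection merely stands in for the Riemannian/mirror-descent geometry the talk actually develops.

import numpy as np

def project_halfspace(x, a, b):
    # Closed-form projection of x onto the halfspace {y : a @ y <= b}.
    slack = a @ x - b
    if slack <= 0:
        return x                    # already inside the set
    return x - slack * a / (a @ a)  # move along the normal direction

def chase(x0, halfspaces):
    # Greedy chasing: jump to the nearest point of each streamed set.
    # The selector's cost is the total distance moved; mirror descent
    # would replace the Euclidean projection with a Bregman projection.
    x, cost = np.asarray(x0, dtype=float), 0.0
    for a, b in halfspaces:
        x_new = project_halfspace(x, np.asarray(a, dtype=float), b)
        cost += np.linalg.norm(x_new - x)
        x = x_new
    return x, cost

# A made-up stream of three halfspaces in R^2.
stream = [(np.array([1.0, 0.0]), -1.0),
          (np.array([0.0, 1.0]), -1.0),
          (np.array([-1.0, -1.0]), -1.0)]
x_final, movement = chase(np.zeros(2), stream)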

April 20, 2022: Gabriel Peyré (CNRS and Ecole Normale Supérieure):
"Scaling Optimal Transport for High dimensional Learning"

Abstract: Optimal transport (OT) has recently gained a lot of interest in machine learning. It is a natural tool to compare probability distributions in a geometrically faithful way. It finds applications in both supervised learning (using geometric loss functions) and unsupervised learning (to perform generative model fitting). OT is, however, plagued by the curse of dimensionality, since it might require a number of samples that grows exponentially with the dimension. In this talk, I will explain how to leverage entropic regularization methods to define computationally efficient loss functions, approximating OT with a better sample complexity. More information and references can be found on the website of our book "Computational Optimal Transport": optimaltransport.github.io/
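As a concrete instance of the entropic approach, here is a minimal NumPy sketch of the Sinkhorn iteration for the entropic-regularized OT plan between two histograms; the cost matrix, marginals, and regularization strength below are toy inputs, and the book linked above treats the method properly.

import numpy as np

def sinkhorn(a, b, C, eps=0.05, iters=500):
    # Entropic-regularized OT between histograms a and b with cost C:
    # alternately rescale rows and columns of K = exp(-C/eps) until the
    # plan P = diag(u) K diag(v) matches the prescribed marginals.
    K = np.exp(-C / eps)
    u = np.ones_like(a)
    for _ in range(iters):
        v = b / (K.T @ u)   # enforce column marginals
        u = a / (K @ v)     # enforce row marginals
    P = u[:, None] * K * v[None, :]
    return P, np.sum(P * C)          # plan and its transport cost

# Toy example: two histograms on [0, 1] with squared-distance cost.
x = np.linspace(0.0, 1.0, 50)
a = np.exp(-((x - 0.2) ** 2) / 0.01); a /= a.sum()
b = np.exp(-((x - 0.7) ** 2) / 0.02); b /= b.sum()
C = (x[:, None] - x[None, :]) ** 2
P, cost = sinkhorn(a, b, C)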

Bio: Gabriel Peyré is a CNRS senior researcher and a professor at the Ecole Normale Supérieure, Paris. He works at the interface between applied mathematics, imaging, and machine learning. He has obtained two ERC grants (a Starting Grant in 2010 and a Consolidator Grant in 2017), the Blaise Pascal prize from the French Academy of Sciences in 2017, the Magenes Prize from the Italian Mathematical Union in 2019, and the silver medal of CNRS in 2021. He was an invited speaker at the European Congress of Mathematics in 2020. He is the deputy director of the Prairie Institute for artificial intelligence, the director of the ENS center for data science, and the former director of the GdR CNRS MIA. He is the head of the ELLIS (European Lab for Learning & Intelligent Systems) Paris unit (https://ellis-paris.github.io/). He is engaged in reproducible research and code education, in particular through the platform www.numerical-tours.com.

optimaltransport.pdf

April 6, 2022: Shuchi Chawla (UT Austin):
"Pandora's Box with Correlations: Learning and Approximation"

Abstract: In the Pandora's Box problem, the algorithm is presented with a number of boxes, each containing an unknown (stochastic) reward. The algorithm can open any box at some cost, discover the reward inside, and, based on these observations, choose one box and keep the reward contained in it. Given the distributions from which the rewards are drawn, the algorithm must determine an order in which to open the boxes as well as when to stop and accept the best reward found so far. In general, an optimal algorithm may make both decisions adaptively, based on instantiations observed previously. The Pandora's Box problem and its extensions capture many kinds of optimization problems with stochastic input where the algorithm can obtain instantiations of input random variables at some cost. Previous work on these problems assumes that the different random variables in the input are distributed independently; as such, it does not capture many real-world settings. In this work, we provide the first algorithms for Pandora's Box-type problems with correlations. In the independent setting, optimal algorithms are non-adaptive and based on the notion of the Gittins index. These techniques fail to extend to the correlated case. We assume that the algorithm has access to samples drawn from the joint distribution on the input, and we provide solutions that require few samples, are computationally efficient, and guarantee approximate optimality.
This is joint work with Evangelia Gergatsouli, Yifeng Teng, Christos Tzamos, and Ruimin Zhang and appeared in FOCS'20.
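For contrast with the correlated setting of the talk, here is a minimal Python sketch of the index policy that is optimal under independence (Weitzman's reservation values, a form of the Gittins index): each box gets an index sigma solving E[(v - sigma)^+] = cost, boxes are opened in decreasing index order, and the search stops once the best reward seen beats every remaining index. The distributions and costs below are made up.

import numpy as np

def reservation_value(samples, cost, lo=-1e6, hi=1e6, tol=1e-6):
    # Solve E[(v - sigma)^+] = cost for sigma by bisection,
    # using a sample-based estimate of the expectation.
    def excess(sigma):
        return np.mean(np.maximum(samples - sigma, 0.0)) - cost
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if excess(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def pandora(boxes, rng):
    # Weitzman's policy: open boxes in decreasing index order and stop
    # once the best reward seen exceeds every remaining index.
    order = sorted(boxes, key=lambda box: -box["sigma"])
    best, paid = 0.0, 0.0
    for box in order:
        if best >= box["sigma"]:
            break  # no remaining box is worth opening
        paid += box["cost"]
        best = max(best, box["draw"](rng))
    return best - paid

rng = np.random.default_rng(0)
boxes = []
for cost, mean in [(0.1, 1.0), (0.3, 2.0), (0.5, 0.5)]:
    samples = rng.exponential(mean, 10_000)  # made-up reward distribution
    boxes.append({"cost": cost,
                  "sigma": reservation_value(samples, cost),
                  "draw": lambda r, m=mean: r.exponential(m)})
net_value = pandora(boxes, rng)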

Bio: Shuchi Chawla holds an Endowed Professorship in Computer Science at UT-Austin and is an Amazon Scholar. Shuchi is a theoretical computer scientist specializing in algorithm design and in economics and computation. Shuchi received a Ph.D. from Carnegie Mellon University and a B.Tech. from the Indian Institute of Technology, Delhi. Prior to joining UT-Austin, she spent 15 years as a professor of CS at the University of Wisconsin-Madison. She has also previously held visiting positions at the University of Washington and Microsoft Research. Shuchi is the recipient of an NSF CAREER award, a Sloan Foundation fellowship, and several awards for her research and teaching at UW-Madison. Shuchi recently served as the PC chair of SODA'20 and EC'21, and currently serves on the editorial boards of the ACM Transactions on Algorithms and the ACM Transactions on Economics and Computation. She is a member and the current chair of CATCS.

pandoras-box-correlations-DataScience-4-6-22.pdf

March 16, 2022: Eric Tchetgen Tchetgen (Penn):
"An Introduction to Proximal Causal Learning"

Abstract: A standard assumption for causal inference from observational data is that one has measured a sufficiently rich set of covariates to ensure that within covariates strata, subjects are exchangeable across observed treatment values. Skepticism about the exchangeability assumption in observational studies is often warranted because it hinges on one’s ability to accurately measure covariates capturing all potential sources of confounding. Realistically, confounding mechanisms can rarely if ever, be learned with certainty from measured covariates. One can therefore only ever hope that covariate measurements are at best proxies of true underlying confounding mechanisms operating in an observational study, thus invalidating causal claims made on basis of standard exchangeability conditions.
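In the potential-outcome notation used below, with $A$ the treatment, $Y(a)$ the counterfactual outcome under treatment value $a$, and $X$ the measured covariates, the exchangeability assumption in question reads

$$ Y(a) \perp\!\!\!\perp A \mid X \quad \text{for every treatment value } a. $$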

Causal learning from proxies is a challenging inverse problem which has to date remained unresolved. In this work, we introduce a formal potential outcome framework for proximal causal learning, which, while explicitly acknowledging covariate measurements as imperfect proxies of confounding mechanisms, offers an opportunity to learn about causal effects in settings where exchangeability on the basis of measured covariates fails. Sufficient conditions for nonparametric identification are given, leading to the proximal g-formula and the corresponding proximal g-computation algorithm for estimation, both generalizations of Robins' foundational g-formula and g-computation algorithm that explicitly account for bias due to unmeasured confounding. Both point-treatment and time-varying-treatment settings are considered, and an application of proximal g-computation of causal effects is given for illustration.
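In symbols, a standard statement of the proximal g-formula for the point-treatment case (with $Z$ a treatment-inducing proxy, $W$ an outcome-inducing proxy, $X$ the measured covariates, and $A$ the treatment): one posits an outcome confounding bridge function $h$ solving

$$ \mathbb{E}[Y \mid Z, A, X] \;=\; \int h(w, A, X)\, dF(w \mid Z, A, X), $$

and, under the identification conditions of the talk, the counterfactual mean is given by the proximal g-formula

$$ \mathbb{E}[Y(a)] \;=\; \iint h(w, a, x)\, dF(w, x). $$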

Bio: Eric Tchetgen Tchetgen’s primary area of interest is in semi-parametric efficiency theory with application to causal inference and missing data problems. In general, he works on the development and application of statistical and epidemiologic methods that make efficient use of the information in data collected by scientific investigators, while avoiding unnecessary assumptions about underlying data generating mechanisms.

In 2018, Eric Tchetgen Tchetgen joined The Wharton School, University of Pennsylvania, as the Luddy Family President's Distinguished Professor and Professor of Statistics. Prior to that, he was Professor of Biostatistics and Epidemiologic Methods at Harvard University. He completed his PhD in Biostatistics at Harvard University in 2006 and received his B.S. in Electrical Engineering from Yale University in 1999.

Jan 13, 2022: Piotr Indyk (MIT):
"Learning-Based Sampling and Streaming"

Abstract: Classical algorithms typically provide "one size fits all" performance, and do not leverage properties or patterns in their inputs. A recent line of work aims to address this issue by developing algorithms that use machine learning predictions to improve their performance. In this talk I will present two examples of this type, in the context of streaming and sampling algorithms. In particular, I will show how to use machine learning predictions to improve the performance of (a) low-memory streaming algorithms for frequency estimation (ICLR’19), and (b) sampling algorithms for estimating the support size of a distribution (ICLR’21). Both algorithms use an ML-based predictor that, given a data item, estimates the number of times the item occurs in the input data set.
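To make the frequency-estimation example concrete, here is a minimal Python sketch of a learned Count-Min sketch in the spirit of the ICLR'19 setting: a predictor (here a made-up threshold oracle standing in for the learned model) flags likely heavy hitters, which get exact counters, while all remaining items share a small Count-Min table.

import numpy as np

class LearnedCountMin:
    # Exact counters for predicted heavy hitters; a shared Count-Min
    # sketch for the remaining (light) items.

    def __init__(self, is_heavy, width=1024, depth=4, seed=0):
        self.is_heavy = is_heavy  # predictor: item -> bool
        self.exact = {}           # exact counts for predicted heavy items
        rng = np.random.default_rng(seed)
        self.salts = rng.integers(1, 2**31, size=depth)
        self.table = np.zeros((depth, width), dtype=np.int64)

    def _cells(self, item):
        # One hashed column per row of the table.
        return [hash((int(salt), item)) % self.table.shape[1]
                for salt in self.salts]

    def update(self, item, count=1):
        if self.is_heavy(item):
            self.exact[item] = self.exact.get(item, 0) + count
        else:
            for row, col in enumerate(self._cells(item)):
                self.table[row, col] += count

    def estimate(self, item):
        if self.is_heavy(item):
            return self.exact.get(item, 0)
        # Count-Min: the min over rows upper-bounds the true count.
        return min(self.table[row, col]
                   for row, col in enumerate(self._cells(item)))

# Toy usage with a made-up predictor: items below 10 count as "heavy".
sketch = LearnedCountMin(is_heavy=lambda item: item < 10)
stream = np.random.default_rng(1).zipf(1.5, 100_000)
for item in stream:
    sketch.update(int(item))
estimate = sketch.estimate(3)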

The talk will cover material from papers co-authored with T. Eden, C.-Y. Hsu, D. Katabi, S. Narayanan, R. Rubinfeld, S. Silwal, T. Wagner, and A. Vakilian.

Bio: Piotr Indyk is the Thomas D. and Virginia W. Cabot Professor in the Department of Electrical Engineering and Computer Science at MIT. He joined MIT in 2000, after earning his PhD from Stanford University. Earlier, he received a Magister degree from Uniwersytet Warszawski in 1995. Piotr's research interests are in the design of efficient algorithms for massive data analysis. His specific interests include nearest neighbor search in high-dimensional spaces, streaming algorithms, sparse recovery, fine-grained complexity, and learning-based algorithms. He has received a Sloan Fellowship (2003), a Packard Fellowship (2003), a Simons Investigator Award (2013), and an ICML Best Paper Award (2015). His work on sparse Fourier sampling was named to the Technology Review TR10 list in 2012, while his work on locality-sensitive hashing received the 2012 ACM Kanellakis Theory and Practice Award. He is a co-director of the Foundations of Data Science Institute (fodsi.us), an NSF-funded project focused on the foundations of data science.

FDS-learning.pdf