# Stanford WTF

Women in Theory Forum

Welcome to the Stanford Women in Theory Forum (WTF). We are group of women-identifying CS-theorist-identifying (and theory-adjacent) folks who meet monthly-ish for socializing, snacks, and a research talk. If you'd like to join our mailing list, please email the organizers, or sign up here: https://mailman.stanford.edu/mailman/listinfo/womens-theory-forum

Organizers: Tselil Schramm (tselil-at-stanford-dot-edu) and Mary Wootters (marykw-at-stanford-dot-edu)

## Schedule

AY 2022-2023

If the weather is nice, we will meet outside in the engineering quad (in the treepit with the whiteboards). If the weather is not so nice, we'll meet inside, probably Gates 415. Join the mailing list for announcements about locations.

Wednesday October 19, 3:30pm: Ellen Vitercik

Title: Theoretical Foundations of Machine Learning for Cutting Plane Selection

Abstract: Cutting-plane methods have enabled remarkable successes in integer programming over the last few decades. State-of-the-art solvers integrate a myriad of cutting-plane techniques to speed up the underlying tree search algorithm used to find optimal solutions. In this talk, we provide sample complexity guarantees for learning high-performing cut-selection policies tailored to the instance distribution at hand. This talk is based on joint work with Nina Balcan, Siddharth Prasad, and Tuomas Sandholm from NeurIPS'21.

Wednesday November 16, 3:30pm: Sumegha Garg

Title: Memory requirements for detecting bias of a coin

Abstract: Given a sequence of n independent tosses of a coin biased towards either heads or tails with probability 1/2 + β, the aim is to determine which way the coin is biased. Let's call this the coin problem. We study the memory required to detect the bias in the streaming setting. Note that the coin problem becomes more difficult as β becomes smaller. Statistically, it can be solved whenever β = Ω(1/√n), using counting. We first show that for β = O(1/√n), counting is essentially optimal, that is, \Omega(log n) memory is necessary, using an information-theoretic proof [BGW’20]. Then, we will see how O(log log n) memory suffices to solve the coin problem for β = O(1/n^{-1/3}) [BGZ'21]. Based on joint works with Mark Braverman, David Woodruff and Or Zamir.

Tuesday January 24, 3pm: Charlotte Gates Peale

Title: Outcome Indistinguishability and its Sample Complexity

Abstract: Prediction algorithms based on machine learning are becoming increasingly influential on major decisions that affect individual’s lives in settings such as medical diagnoses, loan applications, and educational admissions processes. It is critical that the predictions of these algorithms can provide good guarantees with respect to many desiderata such as accuracy, non-discrimination against certain populations, and robustness to distribution shifts.

Unfortunately, the standard measure of predictor quality, overall classification error, may fail to address these concerns beyond a guarantee of overall accuracy. In this talk, I will provide an overview of Outcome Indistinguishability, a recent learning paradigm introduced by Dwork, Kim, Reingold, Rothblum, and Yona that allows for more flexible tools for reasoning about the quality of predictors beyond their overall accuracy. I will also explore recent results on the sample complexity of this new learning task.

Based on joint work with Lunjia Hu and Omer Reingold (https://arxiv.org/abs/2203.04536, https://arxiv.org/abs/2211.09101).

Tuesday April 18, 1:30pm: Shuangping Li

Title: Spectral clustering in the geometric block model

Abstract: Gaussian mixture block models are distributions over graphs that strive to model modern networks: to generate a graph from such a model, we associate each vertex with a latent feature vector sampled from a mixture of Gaussians, and we add edge if and only if the feature vectors are sufficiently similar. The different components of the Gaussian mixture represent the fact that there may be different types of nodes with different distributions over features---for example, in a social network each component represents the different attributes of a distinct community. Natural algorithmic tasks associated with these networks are embedding (recovering the latent feature vectors) and clustering (grouping nodes by their mixture component).

In this talk, we focus on clustering and embedding graphs sampled from high-dimensional Gaussian mixture block models, where the dimension of the latent feature vectors goes to infinity as the size of the network goes to infinity. This high-dimensional setting is most appropriate in the context of modern networks, in which we think of the latent feature space as being high-dimensional. We analyze the performance of canonical spectral clustering and embedding algorithms for such graphs in the case of 2-component spherical Gaussian mixtures and begin to sketch out the information-computation landscape for clustering and embedding in these models.

This is based on joint work with Tselil Schramm.

Tuesday May 23, 1:30pm: June Vuong

Title: Learning to Generate Multimodal Distributions via Early-Stopped Langevin Diffusions

Abstract: There has been a recent explosion of interest in generative modeling using a score matching approach, i.e. attempting to learn the gradient of the log-likelihood of the true distribution. It is by now well-known that vanilla score matching has significant difficulties learning multimodal distributions, and a number of modifications of score matching have been proposed to overcome this difficulty --- for example, by attempting to additionally learn the score function for noised versions of the ground truth (a potentially more difficult learning task). Is there a natural way to sample multimodal distributions using just the vanilla score? Inspired by a long line of related experimental work, we prove that the Langevin diffusion with early stopping, initialized at the empirical distribution, and run on a score function estimated from data can successfully learn natural multimodal distributions (mixtures of log-concave distributions from parametric families) with sample complexity polynomial in the dimension. Joint work with Frederic Koehler.

AY 2021-2022

The schedule has been lost to time.