Past Talks - Spring 2020

May 15, 2020: Amin Karbasi (Yale University):
"User-Friendly Submodular Maximization"

Abstract: Submodular functions model the intuitive notion of diminishing returns. Due to their far-reaching applications, they have been rediscovered in many fields such as information theory, operations research, statistical physics, economics, and machine learning. They also enjoy computational tractability as they can be minimized exactly or maximized approximately.

The goal of this talk is simple. We will see how a little bit of randomness, a little bit of greediness, and the right combination of the two can lead to pretty good methods for offline, streaming, and distributed settings. I will not assume any background on submodularity and will try to explain all the required details during the talk.
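To make the "a little bit of greediness" concrete, here is a minimal sketch (my own illustration, not material from the talk) of the classic greedy algorithm for maximizing a monotone submodular function under a cardinality constraint, using a set-coverage objective as the submodular function. The names `coverage` and `greedy_max` are mine; the "randomness" the abstract alludes to corresponds to variants such as stochastic greedy, which evaluate only a random subset of candidates at each step.

```python
# Minimal sketch: greedy maximization of a monotone submodular function
# (here, set coverage) under a cardinality constraint k.

def coverage(selected, sets):
    """f(S) = number of elements covered by the chosen sets (a submodular function)."""
    covered = set()
    for i in selected:
        covered |= sets[i]
    return len(covered)

def greedy_max(sets, k):
    """Pick k sets, each time adding the one with the largest marginal gain.
    For monotone submodular objectives this achieves a (1 - 1/e) approximation."""
    selected = []
    for _ in range(k):
        best_gain, best_i = -1, None
        for i in range(len(sets)):
            if i in selected:
                continue
            gain = coverage(selected + [i], sets) - coverage(selected, sets)
            if gain > best_gain:
                best_gain, best_i = gain, i
        selected.append(best_i)
    return selected

# Toy usage: cover the elements {0..5} using at most 2 of the given sets.
sets = [{0, 1, 2}, {2, 3}, {3, 4, 5}, {0, 5}]
print(greedy_max(sets, k=2))  # -> [0, 2], covering all six elements
```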

Bio: Amin Karbasi is currently an assistant professor of Electrical Engineering, Computer Science, and Statistics at Yale University. He is also a senior visiting scientist at Google NY. He has been the recipient of the National Science Foundation (NSF) CAREER Award 2019, Office of Naval Research (ONR) Young Investigator Award 2019, Air Force Office of Scientific Research (AFOSR) Young Investigator Award 2018, DARPA Young Faculty Award 2016, National Academy of Engineering Grainger Award 2017, Amazon Research Award 2018, Google Faculty Research Award 2016, Microsoft Azure Research Award 2016, Simons Research Fellowship 2017, and ETH Research Fellowship 2013. His work has also been recognized with a number of paper awards, including Medical Image Computing and Computer Assisted Interventions Conference (MICCAI) 2017, International Conference on Artificial Intelligence and Statistics (AISTATS) 2015, IEEE ComSoc Data Storage 2013, International Conference on Acoustics, Speech, and Signal Processing (ICASSP) 2011, ACM SIGMETRICS 2010, and IEEE International Symposium on Information Theory (ISIT) 2010 (runner-up). His Ph.D. thesis received the Patrick Denantes Memorial Prize 2013 from the School of Computer and Communication Sciences at EPFL, Switzerland.

user-friendly submodular max.pdf

Apr 17, 2020: Shachar Lovett (UC San Diego):
"The power of asking more informative questions about the data"

Abstract: Many supervised learning algorithms (such as deep learning) need a large collection of labeled data points in order to perform well. However, what is easy to get is a large amount of unlabeled data. Labeling data is an expensive procedure, as it usually needs to be done manually, often by a domain expert. Active learning provides a mechanism to bridge this gap. Active learning algorithms are given a large collection of unlabeled data points and need to smartly choose a few data points whose labels to query. The goal is then to automatically infer the labels of many other data points.

In this talk, we will explore the option of giving active learning algorithms additional power, by allowing them to have richer interaction with the data. We will see how allowing for even simple types of queries, such as comparing two data points, can exponentially reduce the number of queries needed in various settings. Along the way, we will see interesting connections to both geometry and combinatorics, and a surprising application to fine-grained complexity.

Based on joint works with Daniel Kane, Shay Moran and Jiapeng Zhang.
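As a toy illustration of the interaction model (mine, not the algorithms from these papers): besides label queries, the learner may ask comparison queries of the form "which of two points does the hidden model score higher?". In the 1-D sketch below, comparisons are used to sort the unlabeled pool, after which a binary search spends only a handful of label queries to locate the decision boundary. The hidden linear scorer and the function names are assumptions for the demo; the exponential savings discussed in the talk arise in richer settings (e.g., learning halfspaces), not in this 1-D toy, where label queries alone would also suffice.

```python
# Toy sketch: active learning of a 1-D threshold with two oracles:
# an expensive label query and a cheaper pairwise comparison query.

import functools
import random

hidden_w, hidden_b = 2.0, -1.0            # hidden linear scorer (assumed for the demo)
score = lambda x: hidden_w * x + hidden_b

def label_query(x):                        # expensive oracle: sign of the hidden score
    return 1 if score(x) >= 0 else -1

def compare_query(x, y):                   # cheap oracle: which point scores higher
    return -1 if score(x) < score(y) else 1

pool = [random.uniform(-3, 3) for _ in range(1000)]

# Sort the pool using only comparison queries (no labels spent here).
ordered = sorted(pool, key=functools.cmp_to_key(compare_query))

# Binary search with label queries for the first positive point.
lo, hi, labels_used = 0, len(ordered), 0
while lo < hi:
    mid = (lo + hi) // 2
    labels_used += 1
    if label_query(ordered[mid]) == 1:
        hi = mid
    else:
        lo = mid + 1

if lo < len(ordered):
    print(f"boundary near x = {ordered[lo]:.3f} after {labels_used} label queries")
```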

Bio: Shachar Lovett is an Associate Professor at the UC San Diego Computer Science department. He obtained his PhD from the Weizmann Institute in 2010. He is interested in the role that structure and randomness play in computation and mathematics, in particular in computational complexity, optimization, machine learning and coding theory, as well as pseudo-randomness, explicit constructions and additive combinatorics. He is a recipient of an NSF CAREER award and a Sloan fellowship.

data-sci-lovett.pptx

Mar 27, 2020: Sujay Sanghavi (University of Texas at Austin):
"Towards Model Agnostic Robustness"

Abstract: It is now common practice to try and solve machine learning problems by starting with a complex existing model or architecture, and fine-tuning/adapting it to the task at hand. However, outliers, errors or even just sloppiness in training data often lead to drastic drops in performance.

We investigate a simple generic approach to correct for this, motivated by a classic statistical idea: trimmed loss. This advocates jointly (a) selecting which training samples to ignore, and (b) fitting a model on the remaining samples. Stated as a joint optimization, this is computationally infeasible even for linear regression. We propose and study the natural iterative variant that alternates between these two steps (a) and (b), each of which individually can be easily accomplished in pretty much any statistical setting. We also study the batch-SGD variant of this idea. We demonstrate both theoretically (for generalized linear models) and empirically (for vision and NLP neural network models) that this effectively recovers accuracy in the presence of bad training data.

This work is joint with Yanyao Shen and Vatsal Shah and appears in NeurIPS 2019, ICML 2019 and AISTATS 2020.
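As a rough sketch of the alternating procedure the abstract describes (my reading of the abstract, not the authors' code), here is the trimmed-loss heuristic for linear regression: each round fits least squares on the currently kept samples (step b), then keeps the fraction of samples with the smallest residuals under that fit (step a). The function name and the keep fraction are illustrative choices.

```python
# Sketch: alternating trimmed-loss minimization for linear regression.

import numpy as np

def iterative_trimmed_least_squares(X, y, keep_frac=0.8, n_iters=20):
    n = len(y)
    k = int(keep_frac * n)                    # how many samples to trust
    kept = np.arange(n)                       # start by keeping everything
    for _ in range(n_iters):
        # step (b): fit on the currently kept samples
        w, *_ = np.linalg.lstsq(X[kept], y[kept], rcond=None)
        # step (a): keep the k samples with the smallest residuals
        residuals = (X @ w - y) ** 2
        kept = np.argsort(residuals)[:k]
    return w, kept

# Toy usage: clean linear data with 20% corrupted labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true + 0.01 * rng.normal(size=200)
y[:40] += 10 * rng.normal(size=40)            # outliers in the training labels
w_hat, _ = iterative_trimmed_least_squares(X, y)
print(np.round(w_hat, 2))                      # close to [ 1.  -2.   0.5]
```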

Bio: Sujay Sanghavi is an Associate Professor at the University of Texas at Austin. He received a PhD in ECE and an MS in Math and ECE from the University of Illinois, and was a postdoc at LIDS at MIT. Sujay’s research focus is rigorous methodological innovation in machine learning, using ideas from optimization, statistics and graph theory. He has received early career awards from the NSF and DoD, and currently leads the NSF TRIPODS Institute on the Foundations of Data Science at UT.

Sujay is also interested in learning from and applying his ideas in industry. He has been a Visiting Scientist at Google Research, a senior quant at Engineers Gate and is currently a Principal Scientist and Amazon Scholar at Amazon.

SujayUMass2020.pdf

Feb 28, 2020: Jon Kleinberg (Cornell University):
"Fairness and Bias in Algorithmic Decision-Making"

Abstract: As data science has broadened its scope in recent years, a number of domains have applied computational methods for classification and prediction to evaluate individuals in high-stakes settings. These developments have led to an active line of recent discussion in the public sphere about the consequences of algorithmic prediction for notions of fairness and equity. In part, this discussion has involved a basic tension between competing notions of what it means for such classifications to be fair to different groups. We consider several of the key fairness conditions that lie at the heart of these debates, and in particular how these properties operate when the goal is to rank-order a set of applicants by some criterion of interest, and then to select the top-ranking applicants.

The talk will be based on joint work with Sendhil Mullainathan and Manish Raghavan.
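As a small illustration of the kind of tension the abstract refers to (my own toy example, not material from the talk): when applicants are rank-ordered by a single score and the top k are selected, groups whose score distributions differ will generally be selected at different rates, so ranking purely by score and equalizing selection rates across groups cannot both hold. The distributions and numbers below are assumptions for the demo.

```python
# Toy demo: group-blind top-k selection by score yields unequal selection rates
# when the groups' score distributions differ.

import numpy as np

rng = np.random.default_rng(0)
scores_a = rng.normal(0.6, 0.1, size=500)      # group A's score distribution
scores_b = rng.normal(0.5, 0.1, size=500)      # group B's (shifted) distribution

scores = np.concatenate([scores_a, scores_b])
groups = np.array(["A"] * 500 + ["B"] * 500)

k = 200
selected = np.argsort(scores)[-k:]              # top-k by score, group-blind
for g in ("A", "B"):
    group_idx = np.where(groups == g)[0]
    rate = np.mean(np.isin(group_idx, selected))
    print(f"group {g}: selection rate {rate:.2f}")
# Group A is selected far more often than group B, even though both groups
# were ranked by exactly the same score.
```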

data-sci-kleinberg.pdf