Foundations of Data Science - Virtual Talk Series
Friday April 17, 2020, 11 am PT, 2 pm ET
Shachar Lovett (UC San Diego)
Shachar Lovett is an Associate Professor in the Computer Science department at UC San Diego. He obtained his PhD from the Weizmann Institute in 2010. He is interested in the role that structure and randomness play in computation and mathematics, in particular in computational complexity, optimization, machine learning, and coding theory, as well as pseudorandomness, explicit constructions, and additive combinatorics. He is a recipient of an NSF CAREER award and a Sloan Research Fellowship.
Title: The power of asking more informative questions about the data
Abstract: Many supervised learning algorithms (such as deep learning) need a large collection of labelled data points in order to perform well. However, unlabelled data is what is easy to obtain in large quantities. Labelling data is an expensive procedure, as it usually must be done manually, often by a domain expert. Active learning provides a mechanism to bridge this gap. Active learning algorithms are given a large collection of unlabelled data points and must smartly choose a few points whose labels to query. The goal is then to automatically infer the labels of many other data points.
In this talk, we will explore the option of giving active learning algorithms additional power by allowing them richer interaction with the data. We will see how allowing even simple types of queries, such as comparing two data points, can exponentially reduce the number of queries needed in various settings. Along the way, we will see interesting connections to both geometry and combinatorics, and a surprising application to fine-grained complexity.
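To make the query-saving idea concrete, here is a toy sketch (not the speaker's algorithm) of active learning in the simplest setting: a one-dimensional threshold classifier. Because the unlabelled points can be sorted, binary search recovers the decision boundary with O(log n) label queries instead of the n labels passive learning would need; the hidden threshold value 0.637 and the oracle below are illustrative assumptions.

```python
# Toy active learning for a 1-D threshold classifier.
# Assumed setup: points left of an unknown threshold are labelled 0,
# points at or right of it are labelled 1. Binary search finds the
# boundary with O(log n) label queries -- the kind of exponential
# savings over labelling all n points that the abstract describes.

def active_learn_threshold(points, label_oracle):
    """Return (index of first 1-labelled point, number of label queries)."""
    points = sorted(points)
    lo, hi = 0, len(points)      # invariant: first 1-labelled point lies in [lo, hi]
    queries = 0
    while lo < hi:
        mid = (lo + hi) // 2
        queries += 1
        if label_oracle(points[mid]) == 1:
            hi = mid             # boundary is at or before mid
        else:
            lo = mid + 1         # boundary is strictly after mid
    return lo, queries

# Usage: 1000 unlabelled points, hidden threshold at 0.637 (illustrative).
points = [i / 1000 for i in range(1000)]
oracle = lambda x: int(x >= 0.637)
idx, q = active_learn_threshold(points, oracle)
print(points[idx], q)   # → 0.637 10  (10 label queries instead of 1000)
```

In higher dimensions the points cannot simply be sorted, which is where richer queries such as pairwise comparisons become useful, as the talk discusses.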
Based on joint works with Daniel Kane, Shay Moran and Jiapeng Zhang.