Invited Speakers

Steve Hanneke

Title: Recent Advances in the Theory of Active Learning (Slides ppt)

Abstract: The past year has seen exciting advances in the theory of active learning, including new general active learning strategies yielding significantly sharper label complexity guarantees. Some of these new results overturn widely-held conceptions of when and how active learning can improve over passive learning. In this talk, I will highlight some of these recent advances, and discuss a few of the open problems currently ripe for exploration.

Adam Kalai (Microsoft Research)

Title: Crowdsourcing Feature Discovery via Adaptively Chosen Comparisons

Abstract: We introduce an unsupervised approach to efficiently discover the underlying features in a data set via crowdsourcing. Our queries ask crowd members to articulate a feature common to two out of three displayed examples. We also ask the crowd to provide binary labels for the remaining examples based on the discovered features. The triples are chosen actively, based on the labels of the previously discovered features on the data set. In two natural models of features, hierarchical and independent, we show that a simple adaptive algorithm using such "two-out-of-three" similarity queries recovers all features with less labor than any non-adaptive algorithm. Experimental results validate the theoretical findings. Joint work with James Zou and Kamalika Chaudhuri.
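The abstract does not spell out the selection rule, but one plausible reading of "triples chosen actively based on the labels of previously discovered features" can be sketched as follows. All names here (`choose_triple`, `feature_labels`, the preference for feature-indistinguishable triples) are illustrative assumptions, not the authors' algorithm:

```python
import itertools
import random

def choose_triple(examples, feature_labels):
    """Hypothetical adaptive triple selection for "two-out-of-three" queries.

    feature_labels: dict mapping feature name -> {example: 0/1 label}.
    Assumed heuristic: prefer triples whose members share the same label
    vector under every discovered feature, so that a crowd-supplied
    distinguishing feature is most likely to be genuinely new.
    """
    def signature(x):
        # The example's label vector under all discovered features.
        return tuple(labels[x] for labels in feature_labels.values())

    candidates = [t for t in itertools.combinations(examples, 3)
                  if len({signature(x) for x in t}) == 1]
    # Fall back to an arbitrary triple if every triple is already separated.
    pool = candidates or list(itertools.combinations(examples, 3))
    return random.choice(pool)
```

For instance, with one discovered feature that separates example "d" from "a", "b", and "c", the only triple the current features cannot distinguish is ("a", "b", "c"), so it is selected for the next crowd query.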

Andreas Krause (ETH Zurich)

John Langford and Tzu-Kuo Huang (Microsoft Research)

Title: Efficient Agnostic Active Learning (Slides pdf)

Abstract: Can you use active learning with the same assurance that something reasonable will happen as in normal supervised learning? The canonical example for active learning (equivalent to binary search) is neither robust nor efficient on complex representations. Can we fix that?

The claim is 'yes', by synthesizing agnostic active learning and learning reductions. We present a new _efficient_ agnostic active learning algorithm, Active Cover, which operates by reduction to a learning oracle while satisfying, and qualitatively improving on, the theoretical guarantees of all prior efficient general approaches. We also conduct an empirical study of such algorithms, finding that Active Cover yields a substantial improvement in label complexity over previous approaches.
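To illustrate the "binary search" intuition the abstract alludes to, here is a minimal realizable-case sketch of disagreement-based active learning for 1-D threshold classifiers. This is not the Active Cover algorithm; it only shows the basic mechanism of querying labels solely in the region where the current version space disagrees:

```python
def threshold_active_learn(stream, labeler):
    """Disagreement-based active learning for thresholds on [0, 1].

    Assumes the realizable setting: label(x) = 1 iff x >= t for some
    unknown threshold t. The version space of plausible thresholds is an
    interval [lo, hi]; a label is queried only when x falls inside it,
    since all surviving hypotheses agree outside.
    Returns (estimated threshold, number of labels queried).
    """
    lo, hi = 0.0, 1.0
    queries = 0
    for x in stream:
        if lo <= x < hi:          # disagreement region: must ask
            queries += 1
            if labeler(x) == 1:
                hi = x            # true threshold is at most x
            else:
                lo = x            # true threshold exceeds x
        # otherwise the label is inferred for free
    return (lo + hi) / 2, queries
```

As the interval shrinks, most stream points fall outside it and cost nothing, which is the source of the exponential label savings in this idealized case; the agnostic setting the talk addresses is precisely where this naive scheme breaks down.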

Maja Temerinac-Ott (Univ. of Freiburg)

Title: Active learning for multidimensional experimental spaces of biological responses to perturbagens (Slides pdf)

Abstract: The scale and complexity of biological systems makes biological research a fertile domain for active learning, because time and cost constraints make it impossible to perform all possible experiments on complex biological systems. Previous applications of active learning in biology have been limited, and can be divided into retrospective studies, whose goal is simply to demonstrate the usefulness of active learning algorithms, and prospective studies, in which active learning actually drives experimentation. The latter are rare. Furthermore, past studies mostly considered unidimensional active learning, where only a single variable is explored (e.g., choosing which drugs to test in order to find active drugs for a single target). However, most biological systems have multiple, interacting components, and thus require multidimensional active learning (e.g., choosing which pairs of drugs and targets to test in order to model the possible effects of multiple drugs on multiple targets). This is far more challenging, but holds the promise of making it feasible to study systems for which exhaustive experimentation is not possible. In this talk, we will describe various applications of multidimensional active learning to biological systems. Considerations discussed will include the choice of the modeling method, incorporation of prior information using similarity matrices and, most importantly, how to know when a model is good enough to stop doing experiments.
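The drug-by-target example above can be made concrete with a toy selection step. Everything here (`next_experiment`, the "support count" uncertainty proxy) is a hypothetical illustration of multidimensional experiment selection, not the method from the talk:

```python
def next_experiment(drugs, targets, observed):
    """Pick the next (drug, target) pair to test.

    observed: dict {(drug, target): measured response}.
    Assumed heuristic: the model is least informed about pairs sharing
    the fewest observations with any tested drug or target, so choose
    the unobserved pair with the smallest such support count.
    """
    def support(d, t):
        # How many completed experiments involve this drug or target.
        return sum(1 for (od, ot) in observed if od == d or ot == t)

    candidates = [(d, t) for d in drugs for t in targets
                  if (d, t) not in observed]
    return min(candidates, key=lambda p: support(*p))
```

Note the contrast with unidimensional selection: scoring full (drug, target) pairs lets the learner trade off coverage across both axes at once, which is what makes the multidimensional setting harder and more informative.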

Jeff Schneider (Carnegie Mellon University)

Burr Settles (Duolingo, FAWM)

Title: Inter-Active Learning with Queries on Instances and Features (Slides pdf)

Abstract: In this talk, I will discuss a few projects aimed at "closing the loop" for interactive natural language annotation. In particular, I describe two systems that combine active and semi-supervised learning by asking humans to label both instance queries (e.g., passages of text) and feature queries (e.g., advice about words and the class labels they imply). Empirical results from real user studies show that these systems are better than state-of-the-art passive learning and even instance-only active learning, in terms of accuracy given a fixed budget of annotation time. The results are quite replicable and also provide insight into human annotator behavior, suggesting how human factors can and should be taken into account for interactive machine learning.