ICML 2016 Workshop on

Data-Efficient Machine Learning

24 June 2016, Marriott Marquis (Astor Room), New York

Recent efforts in machine learning have addressed the problem of learning from massive amounts of data. We now have highly scalable solutions for problems in object detection and recognition, machine translation, text-to-speech, recommender systems, and information retrieval, all of which attain state-of-the-art performance when trained with large amounts of data. In these domains, the challenge we now face is how to attain the same performance in less time and with less data. Other problem domains, such as personalized healthcare, robot reinforcement learning, sentiment analysis, and community detection, are either small-data problems or big-data problems that are collections of small-data problems. The ability to learn in a sample-efficient manner is a necessity in these data-limited domains. Collectively, these problems highlight the increasing need for data-efficient machine learning: the ability to learn in complex domains without requiring large quantities of data.

This workshop will discuss the diversity of approaches that exist for data-efficient machine learning, and the practical challenges that we face. There are many approaches that demonstrate that data-efficient machine learning is possible, including methods that

  • consider trade-offs between incorporating explicit domain knowledge and more general-purpose approaches,

  • exploit structural knowledge of our data, such as symmetry and other invariance properties,

  • apply bootstrapping and data augmentation techniques that make statistically efficient reuse of available data,

  • use semi-supervised learning techniques, e.g., where we can use generative models to better guide the training of discriminative models,

  • generalize knowledge across domains (transfer learning),

  • use active learning and Bayesian optimization for experimental design and data-efficient black-box optimization,

  • apply non-parametric methods, one-shot learning, and Bayesian deep learning.

The objective of this interdisciplinary workshop is to provide a platform for researchers from a variety of areas, spanning transfer learning, Bayesian optimization, bandits, deep learning, approximate inference, robot learning, healthcare, computational neuroscience, active learning, reinforcement learning, and social network analysis, to share insights and perspectives on the problem of data-efficient machine learning, to discuss challenges, and to debate the roadmap towards more data-efficient machine learning.

Invited Speakers