COST : NIPS
2011
Workshop on
Computational Tradeoffs in Statistical Learning
Sierra Nevada, Spain
Dec 16, 2011 Montebajo Basketball Court

Overview
Since its early days, the field of Machine Learning has focused on
developing computationally tractable algorithms with good learning
guarantees. The vast literature on statistical learning theory has led
to a good understanding of how the predictive performance of different
algorithms improves as a function of the number of training
samples. By the same token, the welldeveloped theories of
optimization and sampling methods have yielded efficient computational
techniques at the core of most modern learning methods. The separate
developements in these fields mean that given an algorithm we have a
sound understanding of its statistical and computational
beahvior. However, there hasn't been much joint study of the
computational and statistical complexities of learinng, as a
consequence of which, little is known about the interaction and
tradeoffs between statistical accuracy and computational
complexity. Indeed a systematic joint treatment can answer some very
interesting questions: what is the best attainable statistical error
given a finite computational budget? What is the best learning method
to use given different computational constraints and desired
statistical yardsticks? Is it the case that simple methods outperform
complex ones in computationally impoverished scenarios?
At its core, the PAC framework aims to study learning through the lens
of computation. However, the thrust is on separating polynomialtime
from computationally intractable algorithms. However all
polynomialtime computations are hardly equivalent, and the difference
between linear vs quadratic dependence on problem parameters can have
a profound effect on the applicability of an algorithm. Understanding
the tradeoffs between statistical accuracy and computational demands
in this situation is of paramount importance.
The need for such a theory is more compelling now than ever before
since we routinely face training corpuses with billions of examples,
and often, an even larger number of parameters to be estimated. The
emergence of web and mechanical turk as sources of training data often
stretches learning algorithms to the point that the bottleneck is no
longer the number of examples, but the amount of computation available
to process the examples. A theory to principally choose from a
multitude of learning methods based on the properties of training
examples as well as the computational resources available would be of
clear interest. Another way to pose the same problem would be to
design algorithms that can take as input a computational constraint
and try to learn the best hypothesis they can based on the available
budget and data.
There have been some works that try to address different facets of the above problem. Researchers working on massive datasets in the CS theory community look at streaming methods that aim to impose constraints on both the computational and storage requirements of the algorithms. Online learning presents one particular way of dealing with a computational budget, by processing as many samples as possible with the computational budget.
There have been some more relevant works in the machine learning community in the last few years. Bottou and Bousquet (2008)
compare the amount of computation needed to attain a certain
statistical error for a few routinely used optimization
algorithms. ShalevShwartz and Srebro (2009) show how stochastic gradient descent applied to SVM optimization can experience an inverse dependence on number of training sample in the regime of large
datasets. In some more recent works, ShalevShwartz and coauthors have also used cryptographic conjectures to establish the computational hardness of certain learning problems. On the algorithmic front, coarsetofine learning provides a nice framework to systematically incorporate computational considerations, using computational cost as a regularization term in the objective of the learning method. Other budgeted algorithms such as budgeted SVMs and budgeted perceptrons try to admit hard budget constraints on the running time and storage of the algorithm.
The goals of our workshop are:
 To draw the attention of machine learning researchers to this rich and emerging area of problems and to establish a community of researchers that are interested in understanding these tradeoffs.
 To define a number of common problems in this area and to encourage future research that is comparable and compatible.
 To expose the learning community to relevant work in fields such as CS theory and convex optimization.

We would like to welcome highquality submissions on topics including but not limited to:
 Fundamental statistical limits with bounded computation
 Tradeoffs between statistical accuracy and computational costs
 Computationpreserving reductions between statistical problems
 Algorithms to learn under budget constraints
 Budget constraints on other resources (e.g. bounded memory)
 Computationally aware approaches such as coarsetofine learning
Interesting submissions in other relevant topics not listed above are welcome too. Due to the time constraints, most accepted submissions will be presented as poster spotlights.

Keynote Speakers

Organizers

Program Committee
 Léon Bottou
 Olivier Chapelle
 John Duchi
 Claudio Gentile
 John Langford
 Maxim Raginsky
 Pradeep Ravikumar
 Ohad Shamir
 Karthik Sridharan
 David Weiss
 Nati Srebro



