ICML workshop on Learning with Test-Time Budgets

Location: Rooms L506 & L507 (near ICML registration)

June 20, 2013, Atlanta, USA

The aim of this workshop is to bring together researchers working on emerging topics related to learning and decision-making under budget constraints in uncertain and dynamic environments. This problem has gained attention in several communities, including machine learning (ML), signal processing (SP) and information theory (IT). In ML it has also been referred to as test-time cost-sensitive learning and has often been addressed with classifier cascades (e.g. Viola and Jones, 2004). In IT and SP the problem has been referred to as controlled sensing, where the goal is to adaptively manage and control multiple degrees of freedom in an information-gathering system to serve different inference tasks.

Learning and decision-making problems under budget constraints arise in a number of large-scale real-world industrial applications, ranging from medical diagnosis to search engines and surveillance. In these applications budget constraints arise from limits on computational cost, delay, throughput, power and monetary cost. For instance, in search engines, CPU cost at test time must be budgeted and accounted for. Modern passenger screening systems impose constraints on throughput.

Learning under test-time budgets departs from the traditional machine learning setting and introduces exciting new challenges. For instance, features are accompanied by costs (e.g. extraction time in search engines or true monetary cost in medical diagnosis), and their amortized sum is constrained at test time. In other settings, a system must maintain a throughput constraint to keep pace with arriving traffic.

All of these settings share a new tradeoff between accuracy and cost. Studying this tradeoff between classifier cost and accuracy is an inherent challenge that should be investigated in a principled fashion. The problem lies at the intersection of ML, statistics, stochastic control and information theory. We aim to draw researchers working on foundational, algorithmic and application problems within these areas. In addition, we aim to supply benchmark datasets (which are typically hard to obtain) and encourage participants to evaluate and compare their novel algorithms.

Call for Abstracts

We invite submissions of extended abstracts on algorithms, optimization and theory related to learning under test-time budgets. Topics of interest include, but are not limited to:
  • Real-time object recognition and detection
  • Anytime prediction
  • Model compression
  • Cost-sensitive feature selection
  • Multi-stage classifier design
Submissions must be in ICML format, with a maximum of 2 pages (excluding citations). The deadline for submission is May 15th, 2013. Submissions do not have to be anonymous. The workshop committee members will review abstracts based on topicality and potential impact. Accepted abstracts will be presented as posters during the workshop.

Please email extended abstracts to: icml2013budgetedlearning@gmail.com

Important Dates

Deadline for submission: May 15th, 2013
Author notification: June 15th, 2013


Invited Speakers

Rich Caruana - Microsoft Research

Sergey Karayev - University of California Berkeley

Hal Daume III - University of Maryland

Balazs Kegl - Centre National de la Recherche Scientifique

Dragos D. Margineantu - Boeing Research

Manik Varma - Microsoft Research India, IIT


Schedule

 9:00 AM   Kilian Weinberger (Washington University in St. Louis)
 9:20 AM   Dragos D. Margineantu (Boeing Research)
10:00 AM   Coffee Break
10:20 AM   Hal Daume III (University of Maryland)
11:00 AM   Rich Caruana (Microsoft Research)
11:40 AM   Matthew Kusner (Washington University in St. Louis)
12:00 PM   Joe Wang (Boston University)
12:20 PM   Lunch Break
 2:30 PM   Balazs Kegl (Centre National de la Recherche Scientifique)
 3:10 PM   Sergey Karayev (University of California Berkeley)
 3:50 PM   Coffee Break
 4:30 PM   Venkatesh Saligrama (Boston University)
 5:00 PM   Manik Varma (Microsoft Research India, IIT)
 5:20 PM   Poster Session

Talk Descriptions

Rich Caruana

Title: Model Compression

Often the best performing supervised learning models are large, expensive models such as ensembles of hundreds or thousands of base-level classifiers that may be too large or expensive to use at run-time.  For example, the space required to store these models, and the time and power required to execute them at run-time, can prohibit their use in applications where test sets are extremely large (e.g., web scale apps such as Bing and Google), where storage space is at a premium (e.g., smart phones or commodity servers), and where computational power is limited (e.g., smart phones,  hearing aids, cameras, and satellites). We present a method for compressing large, complex models such as ensembles into much smaller and faster models, usually without significant loss in accuracy.
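The idea can be illustrated with a toy teacher-student sketch (the models, data and names below are invented for illustration, not the talk's actual method): a large ensemble labels a pool of inputs, and a much smaller student model is fit to mimic those labels.

```python
import random

random.seed(0)

# Toy "large" model: an ensemble of 500 noisy threshold classifiers.
# Each member votes on whether x exceeds its own threshold; the
# ensemble takes a majority vote.
ensemble = [random.gauss(0.5, 0.05) for _ in range(500)]  # per-member thresholds

def ensemble_predict(x):
    votes = sum(1 for t in ensemble if x > t)
    return 1 if votes > len(ensemble) / 2 else 0

# Compression: label a pool of inputs with the ensemble's predictions,
# then fit a single threshold (a one-node "student") to mimic them.
pool = [i / 1000 for i in range(1000)]
labels = [ensemble_predict(x) for x in pool]

# Majority votes are monotone in x here, so the best single threshold is
# the smallest input the teacher labels positive.
student_threshold = min(x for x, y in zip(pool, labels) if y == 1)

def student_predict(x):
    return 1 if x >= student_threshold else 0

# The student costs one comparison instead of 500 at test time.
agreement = sum(student_predict(x) == ensemble_predict(x) for x in pool) / len(pool)
print(agreement)  # 1.0: the student matches the 500-member teacher on this pool
```

Real compressed models are of course richer than a single threshold, but the mechanism is the same: the small model is trained on the large model's outputs rather than on the original labels.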

Sergey Karayev

Title: Dynamic Recognition on a Budget


A recognition system performs actions such as feature extraction or object detection, and combines the results to give the final answer. The actions vary in cost, and different test instances benefit from different actions. For this reason, the selection of the optimal subset of actions must be dynamic (closed-loop) when a test-time cost budget is given. We focus on the Anytime evaluation setting, where the process may be stopped even before the budget is depleted. Action selection is formulated as a Markov Decision Process, solved with reinforcement learning methods. The state of the MDP contains the results (computed features, or output of object detectors). We present different ways of combining these values for the purpose of selecting the next action, and for giving the final answer. Results are given on an illustrative synthetic problem, visual scene recognition and object detection tasks, and a hierarchically-structured classification task. On the latter, we show that Anytime answers can be given for any desired cost budget and level of accuracy.
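The dynamic selection loop can be sketched as follows. The talk solves an MDP with reinforcement learning; this self-contained toy substitutes a simple greedy policy (expected gain per unit cost), and the action names, costs and gains are invented for illustration.

```python
# Hypothetical feature-extraction actions with costs and expected gains.
actions = {
    "cheap_edge": {"cost": 1, "gain": 0.10},
    "color_hist": {"cost": 2, "gain": 0.25},
    "detector":   {"cost": 5, "gain": 0.50},
}

def select_actions(budget):
    """Greedily run actions by gain/cost until the budget is depleted.

    Actions are returned in execution order, so the answer improves
    monotonically and the process can be stopped Anytime.
    """
    remaining = dict(actions)
    spent, executed = 0, []
    while remaining:
        # restrict to actions that still fit in the budget
        feasible = {a: v for a, v in remaining.items()
                    if spent + v["cost"] <= budget}
        if not feasible:
            break
        best = max(feasible, key=lambda a: feasible[a]["gain"] / feasible[a]["cost"])
        executed.append(best)
        spent += remaining.pop(best)["cost"]
    return executed, spent

print(select_actions(budget=3))  # small budget: only cheap actions fit
print(select_actions(budget=8))  # larger budget: the detector becomes affordable
```

A learned closed-loop policy improves on this sketch by conditioning the next action on the results observed so far, which is exactly what the MDP state captures.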

Balázs Kegl

Title: Sequential prediction: from budgeted learning to sparse coding

Constructing predictors in a sequential decision process is one of the possible ways to formalize budgeted learning. In the general setup, we train an agent which builds the prediction iteratively, making decisions on outputting the current prediction or gathering further information for improving it, based on rewards that mix predictive accuracy and cost of information. After sketching our motivating application of trigger design in experimental physics, we describe the MDDAG algorithm which is one of the possible instantiations of the sequential prediction paradigm. Making the predictor as lean as possible based on information gathered on a _given_ instance leads to data-dependent feature selection similar to sparse coding. In the last part of the talk we explore this connection and argue that budgeted learning is not only a practical solution for real-time prediction, but may also be considered as a natural way to regularize predictors.


Alexander Grubb
Daniel Munoz
J. Andrew Bagnell
Martial Hebert

Structured prediction is becoming increasingly relevant in a variety of applications. At the same time, such prediction is often expensive, possibly exceeding the available computational resources. This work presents a technique for structured prediction in which classifier training can be interrupted at any time and will provide a model that is optimal under a budget constraint. As the budget is increased, the technique approaches state-of-the-art performance in scene understanding.

Rodrigo Verschae
Javier Ruiz-del-Solar

In 2001, Viola and Jones presented a seminal work on real-time face detection. This work extends the Viola-Jones cascade to a tree structure for multiclass object detection. An instance traverses the tree through increasingly specific classifiers that gradually extract more features, and feature evaluations and parameters are shared efficiently across different regions of the tree. Results are detailed for three techniques for selecting multiclass weak classifiers within the tree structure.
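The cascade principle being extended here can be shown in a minimal sketch (the stages, scores and thresholds below are invented; real stages are boosted classifiers over image features):

```python
# Cascade-style evaluation in the spirit of Viola-Jones: each stage is
# more specific than the last, and an instance is rejected as soon as
# any stage is not confident enough, so easy negatives exit cheaply.

def make_stage(threshold):
    """A stage scores an instance; below-threshold scores reject it."""
    def stage(x):
        return x >= threshold
    return stage

# Later stages use more features and impose stricter thresholds.
cascade = [make_stage(0.2), make_stage(0.5), make_stage(0.8)]

def classify(x):
    """Return (is_positive, stages_evaluated)."""
    for i, stage in enumerate(cascade, start=1):
        if not stage(x):
            return False, i       # early rejection
    return True, len(cascade)

print(classify(0.1))  # rejected by the first stage: one evaluation
print(classify(0.9))  # passes all stages
```

In the tree extension described above, passing a stage routes the instance toward one of several increasingly specific class-specific branches rather than along a single chain.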

Devorah Kletenik
Lisa Hellerstein

At test time, SVM evaluation can be expensive if the features are time-consuming to extract. This work uses an approximation algorithm for the submodular set cover problem to select a subset of features that achieves the same accuracy at test time as using all features. On a synthetic dataset and a congressional-voting dataset, this approach shows reductions in prediction cost while maintaining prediction accuracy.
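The flavor of the underlying subroutine can be seen in the classic greedy approximation for set cover (the abstract's algorithm addresses the more general submodular set cover; the feature names, costs and coverage sets below are invented for illustration):

```python
# Each feature "covers" the test instances it suffices to classify;
# the goal is a cheap subset of features that covers every instance.
features = {
    "f1": {"covers": {1, 2, 3}, "cost": 1.0},
    "f2": {"covers": {3, 4},    "cost": 0.5},
    "f3": {"covers": {4, 5, 6}, "cost": 1.0},
    "f4": {"covers": {1, 5},    "cost": 2.0},
}
universe = {1, 2, 3, 4, 5, 6}

def greedy_cover(features, universe):
    """Repeatedly pick the feature covering the most uncovered instances
    per unit cost; the classic greedy gives a ln(n)-factor approximation."""
    uncovered, chosen = set(universe), []
    while uncovered:
        def ratio(f):
            return len(features[f]["covers"] & uncovered) / features[f]["cost"]
        best = max((f for f in features if f not in chosen), key=ratio)
        if ratio(best) == 0:
            break  # remaining instances cannot be covered
        chosen.append(best)
        uncovered -= features[best]["covers"]
    return chosen

subset = greedy_cover(features, universe)
print(subset)  # a cheap feature subset that still covers every instance
```

On this toy example the greedy pass skips the expensive redundant feature ("f4") entirely, which mirrors the cost reductions the abstract reports.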

Rahul Agrawal
Archit Gupta
Yashoteja Prabhu
Manik Varma

Much research has been done on recommending advertising phrases that pair websites with search queries. Previously, such research has centered on NLP and ranking methods for gathering these phrases from websites. This work reformulates the problem as a multi-label task and uses random forests to train a model on millions of independent queries. Results on four large-scale datasets demonstrate improvements over previous phrase-recommendation techniques.

Jay Pujara
Hui Miao
Lise Getoor

Consider the problem of classifying a group of related instances, such as detecting similar objects in a scene or filtering spam across multiple like-minded users. Previously, for the majority of applications, the most time-demanding portion of inference was feature extraction. When multiple predictions must be made jointly, however, the computational burden lies in the dependencies between predictions on similar instances. This work develops a model for the cost of joint prediction, resulting in a technique that reduces inference time on a large-scale dataset.


Principal Investigators
Venkatesh Saligrama - Boston University
Kilian Weinberger - Washington University in St. Louis

Doctoral Candidates
Minmin Chen - Washington University in St. Louis
Matthew Kusner - Washington University in St. Louis
Kirill Trapeznikov - Boston University
Zhixiang (Eddie) Xu - Washington University in St. Louis