About this Workshop
A growing number of machine learning problems involve finding subsets of data points. Examples range from selecting subsets of labeled or unlabeled data points, to selecting subsets of features or model parameters, to selecting subsets of pixels, keypoints, or sentences in image segmentation, correspondence, and summarization problems. The workshop will encompass a wide variety of topics, from theoretical aspects of subset selection (e.g., coresets, submodularity, and determinantal point processes) to practical applications (e.g., time- and energy-efficient learning, learning under resource constraints, active learning, human-assisted learning, feature selection, model compression, and feature induction).
Call For Papers
Subset selection is growing in importance across an increasing number of machine learning applications. It is a naturally emerging topic that has often been considered in isolation within individual applications. We invite original contributions (especially early research work) on the following topics:
Determinantal Point Processes
Submodular functions and their optimization
Applications of Subset Selection
Compute-efficient training (training time and energy efficiency)
Active Learning and selecting subsets of unlabelled data for labelling
Human-assisted learning
Feature selection and dimensionality reduction
Cost-sensitive feature selection
Rule augmentation and Data programming
Image segmentation, image correspondence, and MAP inference in graphical models
Data Summarization (e.g. video, image collection, document, news summarization)
Peptide Matching, Proteomics, etc.
Learning of neural set functions
The above are just a few of the potential applications and theoretical directions. If you are working on anything related to subset selection in ML, AI, and deep learning, please consider submitting to and attending our workshop!
Submissions should be extended abstracts of at most 6 pages in the ICML format. The page limit excludes references and supplemental material; supplemental material may be of any length, but reviewers are not required to take it into account. We accept submissions of work recently published or currently under review. Submissions should be anonymized. The workshop will not have formal proceedings, but authors of accepted abstracts can choose to have either a link to an arXiv version of their paper or a PDF published on the workshop webpage. If the authors give us an arXiv link, we will link to it from the list of accepted papers on this webpage.
Tuesday, June 8th, 23:59 AOE
Wednesday, June 16th
Deadline for SlidesLive for selected talks: Sunday, June 27th 2021
July 10th 2021
Workshop date: Saturday, 24th July 2021
We will be using CMT to handle paper submissions (https://cmt3.research.microsoft.com/SUBSETML2021). Please submit papers before the deadline above.
We received a number of high-quality submissions to SubSetML 2021. We have accepted 33 papers as spotlight presentations and 12 papers as posters, making the total number of accepted papers 45. The full list of accepted papers is here.
We are very excited to have an amazing set of speakers whose expertise ranges from discrete optimization, submodularity, and coresets to applications of subset selection such as time- and energy-efficient training, model compression, active learning, human-assisted AI, feature and column selection, explainability, and rule induction.
Amin Karbasi (Yale University)
Andreas Krause (ETH Zurich)
Baharan Mirzasoleiman (UCLA)
Cody Coleman (Stanford)
Dan Feldman (Haifa University)
Luc De Raedt (KU Leuven)
Manuel Gomez Rodriguez (MPI-SWS)
Rajiv Khanna (UC Berkeley)
In addition to the above, Jeff Bilmes, Rishabh Iyer, and Ganesh Ramakrishnan will also be speaking. Jeff will talk about recent work done at Summary Analytics (smr.ai), while Rishabh and Ganesh will talk about DECILE (Data efficient Learning), an open-source platform. Both talks are very relevant to this workshop.
Workshop Schedule and Plan
The workshop will consist of ten talks, a spotlight session, a poster session, and a panel discussion, with enough time for scientific discussions throughout a full-day schedule. Each talk will be roughly 30 minutes including questions: 25 minutes for the talk and 5 minutes for Q&A. We plan to have the talks pre-recorded, each followed by a live Q&A session. The introductory remarks and the panel discussion will be live.
Below is the detailed schedule. All times are in PDT.
15:00 - 16:10: Spotlight Session II
Rishabh Iyer (UT Dallas)
Abir De (IIT Bombay)
Ganesh Ramakrishnan (IIT Bombay)
Jeff Bilmes (University of Washington, Seattle)