News and Dates
- October 21 -- Abstract Submission Deadline (deadline has passed)
- November 7 -- Notification of Acceptance
- Friday, December 16 -- Workshop date (Sierra Nevada)
Venue: Melia Sierra Nevada: Guejar
Tentative program
Overview

The notion of similarity (or distance) is central to many problems in machine learning: information retrieval, nearest-neighbor prediction, visualization of high-dimensional data, etc. Using statistical learning methods to learn similarity functions, rather than relying on a fixed metric, is appealing, and over the last decade this problem has attracted much attention in the community, with publications in NIPS, ICML, AISTATS, CVPR, etc. Much of this work, however, has focused on a specific, restricted approach: learning a Mahalanobis distance under a variety of objectives and constraints. This effectively limits the setup to learning a linear embedding of the data. In this workshop, we hope to look beyond this setup, and
consider methods that learn non-linear embeddings of the data, either
explicitly via non-linear mappings or implicitly via kernels. We will
especially encourage discussion of methods that are suitable for large-scale
problems increasingly facing practitioners of learning methods: large numbers of
examples, high dimensionality of the original space, and/or massively multi-class
problems (e.g., classification with 10,000+ categories, or the 10,000,000 images of the
ImageNet dataset). More broadly, we
invite submissions on all similarity learning techniques. Our goals are to build a comprehensive picture of the state of the art in similarity learning through presentations of recent work, and to initiate an in-depth discussion of the major open questions raised by research in this area. Among these questions:
- Are there gains to be made from introducing non-linearity into similarity models?
- When the underlying task is prediction (classification or regression), are similarity functions worth learning, rather than attacking the prediction task directly?
- A closely related question: when is it beneficial to use nearest-neighbor-based methods with a learned similarity?
- What is the right loss (or objective) function to minimize in similarity learning?
- It is often claimed that inherent structure in
real data (e.g. low-dimensional manifolds) makes learning easier. How,
if at all, does this affect similarity learning?
- What are the similarities and distinctions between learning similarity functions and learning to hash?
- What is the relationship between unsupervised
similarity learning (often framed as dimensionality reduction) and supervised similarity learning?
- Are there models of learning nonlinear similarities for which bounds (e.g., generalization error, regret bounds) can be proven?
- What algorithmic techniques must be employed or developed to scale nonlinear similarity learning to extremely large data sets?
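To make the linear-vs-non-linear distinction above concrete, here is a minimal sketch contrasting a Mahalanobis distance (equivalent to squared Euclidean distance after a learned linear map) with one simple non-linear alternative, an RBF kernel similarity. The function names, the map `L`, and the toy points are all hypothetical, chosen for illustration; they are not from any specific method discussed at the workshop.

```python
import numpy as np

def mahalanobis_distance(x, y, L):
    """Mahalanobis-style distance d(x, y) = ||Lx - Ly||^2.
    Equivalent to embedding points linearly via x -> Lx and then
    taking squared Euclidean distance, which is why Mahalanobis
    learning is limited to linear embeddings of the data."""
    diff = L @ (x - y)
    return float(diff @ diff)

def rbf_similarity(x, y, gamma=1.0):
    """One simple non-linear similarity: the RBF kernel
    k(x, y) = exp(-gamma * ||x - y||^2)."""
    diff = x - y
    return float(np.exp(-gamma * (diff @ diff)))

# Toy example: x is equidistant from y and z in the original
# 2-D space, but a learned linear map L can stretch one axis
# and shrink the other, changing which neighbor is "closer".
x = np.array([0.0, 0.0])
y = np.array([1.0, 0.0])
z = np.array([0.0, 1.0])
L = np.diag([2.0, 0.5])  # hypothetical learned linear map

d_xy = mahalanobis_distance(x, y, L)  # (2.0 * 1)^2  = 4.0
d_xz = mahalanobis_distance(x, z, L)  # (0.5 * 1)^2  = 0.25
```

No diagonal (or any) Mahalanobis map can separate points that are not linearly separable in the input space, which is the motivation for the non-linear embeddings and kernel-based similarities this workshop focuses on.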
We invite submissions of 2-page extended abstracts. Please
include author name(s), affiliation(s), and contact information. Style files are available here. Please email a PDF to nips11simworkshop@ttic.edu with the subject line “Abstract Submission”. Note: abstracts may describe novel work, previously published/presented work, or work being presented at the main conference.
Confirmed Speakers
- Prateek Jain (Microsoft)
- Alex Berg (Stony Brook)
- Ruslan Salakhutdinov (Toronto)
- Gert Lanckriet (UCSD)
- Samy Bengio (Google)
- Maya Gupta (UW)
- Adam Kalai (Microsoft)
Organizers