Beyond Mahalanobis: Supervised Large-Scale Learning of Similarity

News and Dates

  • October 21 -- Abstract Submission Deadline (deadline has passed)
  • November 7 -- Notification of Acceptance
  • Friday December 16 -- Workshop date (Sierra Nevada)

Melia Sierra Nevada: Guejar

Tentative program
 7:30 - 7:40am Opening remarks: organizers
 7:40 - 8:10am Invited talk: Samy Bengio (Google), "Online Similarity Learning: From Images to Texts"
 8:10 - 8:40am Invited talk: Prateek Jain (Microsoft Research), "Inductive Regularized Learning of Kernel Functions"
 8:40 - 9:00am Poster spotlights
 9:00am - 8:00pm Poster viewing during breaks:
  "Mirror Descent for Metric Learning", Gautam Kunapuli and Jude Shavlik
  "Good Similarity Learning for Structured Data", Aurelien Bellet, Amaury Habrard, Marc Sebban
  "Learning Cross-Lingual Similarities", Jan Rupnik, Andrej Muhic, Primoz Skraba
  "A metric learning perspective of SVM: On the relation of LMNN and SVM", Huyen Do, Alexandros Kalousis, Jun Wang, Adam Woznica
  "Learning sequence neighbourhood metrics", Justin Bayer, Christian Osendorfer, Patrick van der Smagt
  "Ground Metric Learning", Marco Cuturi, David Avis
  "Adaptive Image Similarity: The Sharpening Match", Erik Learned-Miller
  "Similarity Sensitive Nonlinear Embeddings", Dhruv Batra and Greg Shakhnarovich
 9:00 - 9:30am Coffee break, poster viewing
 9:30 - 10:00am Invited talk: Ruslan Salakhutdinov (U. of Toronto), "Learning class-sensitive similarity metric from few examples"
 10:00 - 10:30am Invited talk: Maya R. Gupta (U. of Washington), "Estimating Similarities Between Tasks for Multi-Task Learning"
 10:30am - 4:00pm Break (skiing etc.)
 4:00 - 4:30pm Invited talk: Gert Lanckriet (UCSD), "Learning multi-modal similarity: a novel multiple kernel learning technique"
 4:30 - 5:00pm Invited talk: Adam Kalai (Microsoft Research), "Actively learning similarity from the crowd"
 5:00 - 5:20pm Talk: Daniel Lee (UPenn) / Fei Sha (USC), "Learning Discriminative Metrics via Generative Models and Kernel Learning"
 5:20 - 5:40pm Talk: Bert Huang (U. of Maryland), "Learning a Degree-Augmented Distance Metric From a Network"
 5:40 - 6:10pm Coffee break, poster viewing
 6:10 - 6:30pm Talk: Erik Learned-Miller (U. of Massachusetts), "Adaptive Image Similarity: The Sharpening Match"
 6:30 - 6:50pm Talk: Kilian Weinberger (Washington University at St. Louis), "Gradient Boosting for Large Margin Nearest Neighbors"
 6:50 - 7:20pm Invited talk: Alex Berg (Stony Brook University), "Learning similarity for recognition is best solved by first learning to recognize"
 7:20 - 7:50pm Discussion. Tentative topics:
  • Why do similarity learning? In most prediction tasks, k-NN seems to be bested in the end by a more sophisticated model learned specifically to predict. So, is there a use for similarity learning beyond nearest-neighbor classification/regression?
  • In light of the previous question: Can we all come up with a good benchmark setup to evaluate similarity learning, not tied to prediction performance?
 7:50 - 8:00pm Closing remarks (organizers)


The notion of similarity (or distance) is central to many problems in machine learning: information retrieval, nearest-neighbor based prediction, visualization of high-dimensional data, etc. Learning similarity functions with statistical methods, rather than fixing them in advance, is appealing, and over the last decade this problem has attracted much attention in the community, with several publications in NIPS, ICML, AISTATS, CVPR, etc. Much of this work, however, has focused on a specific, restricted approach: learning a Mahalanobis distance under a variety of objectives and constraints. This effectively limits the setup to learning a linear embedding of the data.
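The last point can be checked numerically. The sketch below is an illustration rather than anything taken from the workshop material: it assumes the Mahalanobis matrix is factored as M = L^T L, and the matrix L, the points x and y, and all helper names are hypothetical. Under that assumption, the Mahalanobis distance between x and y equals the ordinary Euclidean distance between the linearly mapped points Lx and Ly.

```python
import math

def matvec(A, v):
    # Multiply matrix A (list of rows) by vector v.
    return [sum(a * b for a, b in zip(row, v)) for row in A]

def gram(L):
    # Build M = L^T L, which is positive semidefinite by construction.
    n = len(L[0])
    return [[sum(row[i] * row[j] for row in L) for j in range(n)]
            for i in range(n)]

def mahalanobis(x, y, M):
    # d_M(x, y) = sqrt((x - y)^T M (x - y))
    d = [a - b for a, b in zip(x, y)]
    Md = matvec(M, d)
    return math.sqrt(sum(di * mi for di, mi in zip(d, Md)))

def euclidean(u, v):
    # Standard Euclidean distance between two vectors.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

# Hypothetical linear map and points, chosen only for illustration.
L = [[2.0, 0.0],
     [1.0, 1.0]]
M = gram(L)
x, y = [1.0, 2.0], [3.0, -1.0]

d_m = mahalanobis(x, y, M)                     # distance under M
d_e = euclidean(matvec(L, x), matvec(L, y))    # distance after mapping by L
print(d_m, d_e)
```

Both printed values agree up to floating-point rounding, which is why learning M amounts to learning the linear map L.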

In this workshop, we hope to look beyond this setup, and consider methods that learn non-linear embeddings of the data, either explicitly via non-linear mappings or implicitly via kernels. We will especially encourage discussion of methods that are suitable for the large-scale problems increasingly facing practitioners of learning methods: large numbers of examples, high dimensionality of the original space, and/or massively multi-class problems (e.g., classification with 10,000+ categories and the 10,000,000 images of the ImageNet dataset). More broadly, we invite submissions on all similarity learning techniques.

Our goals are to create a comprehensive understanding of the state of the art in similarity learning via presentation of recent work, and to initiate an in-depth discussion on major open questions brought up by research in this area. Among these questions:

  • Are there gains to be made from introducing non-linearity into similarity models?
  • When the underlying task is prediction (classification or regression), are similarity functions worth learning, instead of attacking the prediction task directly?
  • A closely related question: when is it beneficial to use nearest-neighbor based methods with a learned similarity?
  • What is the right loss (or objective) function to minimize in similarity learning?
  • It is often claimed that inherent structure in real data (e.g. low-dimensional manifolds) makes learning easier. How, if at all, does this affect similarity learning?
  • What are the similarities and distinctions between learning similarity functions and learning hash functions?
  • What is the relationship between unsupervised similarity learning (often framed as dimensionality reduction) and supervised similarity learning?
  • Are there models of learning nonlinear similarities for which bounds (e.g., generalization error, regret bounds) can be proven? What algorithmic techniques must be employed or developed to scale nonlinear similarity learning to extremely large data sets?

Abstract Submission (Deadline Passed)

We invite submissions of 2-page extended abstracts. Please include author name(s), affiliation(s) and contact information. Style files are available here.

Please email a PDF with the subject line “Abstract Submission”. Note: Citations may spill over the two-page limit, but please keep the rest of the abstract to two pages!

Abstracts may describe novel work, previously published or presented work, or work being presented at the main conference.

Confirmed Speakers