CVPR 2014 Tutorial on Large-Scale Visual Recognition

Saturday, June 28th - Full day - Grand Ballroom 2

Speakers:

Ondrej Chum, Czech Technical University
Zaid Harchaoui, INRIA (primary organizer and contact)
Herve Jegou, INRIA (organizer)
Florent Perronnin, Xerox Research (organizer)
Marc'Aurelio Ranzato, Facebook AI Research
Andrea Vedaldi, Oxford University

Tutorial goals

This tutorial addresses Large-Scale Visual Recognition (LSVR), the problem of understanding visual content (e.g. photos or videos) at large scale. The topic has received much attention in the computer vision community in the last few years: as larger datasets have become available [TFF08, DDS09], handling millions of images and thousands of label classes has become the norm rather than the exception [DBL10, WBU10, LRM12, DCM12, JPD12]. Since LSVR is a vast topic, we focus mainly on two tasks: image retrieval and image classification.

The goals of this tutorial are three-fold:

Provide the audience with the "tools" to process such large datasets.
Show the convergence between large-scale retrieval and large-scale classification, two problems which have been traditionally addressed separately.
Show that LSVR does not necessarily require massive computational resources (although such resources can help, of course).

The tutorial is complemented by free, publicly available software:

VLFeat: http://www.vlfeat.org/
INRIA's Fisher vector implementation: http://lear.inrialpes.fr/src/inria_fisher
VGG's encoding methods evaluation toolkit: http://www.robots.ox.ac.uk/~vgg/software/enceval_toolkit/
Yael library for exact matching: http://gforge.inria.fr/projects/yael
PQ-codes toy Matlab implementation: http://people.rennes.inria.fr/Herve.Jegou/projects/ann.html
J-SGD for large-scale learning: http://lear.inrialpes.fr/src/jsgd/

Schedule

The tutorial consists of short talks (1h or less), each covering a specific topic and given by a recognized expert in the field.

morning:

8:30am - 8:40am: Introduction (Zaid Harchaoui)
8:40am - 9:25am: Part I: Efficient matching (Herve Jegou)
9:25am - 10:15am: Part II: Geometry for large-scale retrieval (Ondrej Chum)
10:15am - 10:45am: COFFEE BREAK
10:45am - 11:50am: Part III: Large-scale machine learning (Zaid Harchaoui)

afternoon:

1:30pm - 2:30pm: Part IV: Large-scale visual recognition with deep learning (Marc'Aurelio Ranzato)
2:30pm - 3:25pm: Part V: Input embeddings, from shallow to deep (Andrea Vedaldi)
3:25pm - 3:55pm: COFFEE BREAK
3:55pm - 4:55pm: Part VI: Output embedding for large-scale visual recognition (Florent Perronnin)

Here is a list of references.

Sponsors and financial support

The tutorial is supported by the MSR-INRIA Joint Centre, the "Gargantua" project (CNRS-Mastodons), the "Khronos" project (Labex Persyval-Lab, ANR-11-LABX-0025), and the Fire-ID project (ANR-12-CORD-016).
