This tutorial is organized around a collection of hands-on MATLAB lab exercises that introduce fundamental concepts in visual recognition.

This tutorial has been held in conjunction with:
  • 22nd International Conference on Pattern Recognition (ICPR) - August 24, 2014 - Stockholm, Sweden
  • 17th International Conference on Image Analysis and Processing (ICIAP) - September 9, 2013 - Naples, Italy
A similar (usually shorter) tutorial is also regularly held at:

Instructors:
  • Lamberto Ballan, Ph.D. - University of Padova, Italy
  • Lorenzo Seidenari, Ph.D. - University of Florence, Italy

Automatic image annotation is an important task in which the goal is to determine the relevance of annotation terms for images. In recent years, considerable effort has gone into designing effective and efficient algorithms for visual recognition and retrieval. A common and successful approach is to quantize local visual features (e.g., SIFT) following the well-known bag-of-visual-words paradigm. A classifier (e.g., an SVM) can then be learned from a collection of images manually labeled as belonging to an object category or not. The goal of this tutorial is to provide basic practical experience with image classification. Participants will be guided through implementing a system in MATLAB based on the bag-of-visual-words image representation and will apply it to image classification. The emphasis is on the important general concepts rather than on in-depth coverage of contemporary papers.
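
To make the quantization step concrete, here is a minimal sketch of how local descriptors can be turned into a bag-of-visual-words histogram. It is illustrative only and is not taken from the lab material: the labs themselves use MATLAB, while this sketch uses plain Python. The function names, the 2-D toy descriptors, and the 3-word vocabulary are all assumptions; in practice the descriptors would be, for example, 128-dimensional SIFT vectors, and the vocabulary would typically be learned by clustering (e.g., k-means) rather than written by hand.

```python
import math

def nearest_word(descriptor, vocabulary):
    """Return the index of the visual word closest to the descriptor (squared Euclidean distance)."""
    best_index, best_dist = 0, math.inf
    for index, word in enumerate(vocabulary):
        dist = sum((d - w) ** 2 for d, w in zip(descriptor, word))
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index

def bow_histogram(descriptors, vocabulary):
    """Quantize each descriptor to its nearest word and return an L1-normalized histogram."""
    counts = [0] * len(vocabulary)
    for descriptor in descriptors:
        counts[nearest_word(descriptor, vocabulary)] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(vocabulary)

# Hypothetical toy data: a 3-word vocabulary and four 2-D "descriptors" from one image.
vocabulary = [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
descriptors = [(0.1, 0.0), (0.9, 1.1), (1.0, 0.8), (0.1, 0.9)]

histogram = bow_histogram(descriptors, vocabulary)
print(histogram)  # [0.25, 0.5, 0.25]
```

The resulting fixed-length histogram is what a classifier consumes, regardless of how many local features each image produced.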

Intended audience and expected knowledge to be transferred
  • This is an introductory/intermediate tutorial on visual classification. The intended audience is first- or second-year PhD candidates in computer vision, as well as experts from other areas of computer science and pattern recognition who want in-depth knowledge of what is currently the standard architecture of state-of-the-art visual classification systems.
  • Attendees will get a full overview of a bag-of-visual-words recognition pipeline, from feature computation to learning the statistical model of visual concepts. The approach will be decomposed into several steps, and each step will be inspected in detail. Attendees will come away with the tools to debug each step of a visual recognition system.
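
The last step of such a pipeline, learning a model for a visual concept from labeled histograms, can be sketched as follows. This is a simplified stand-in under stated assumptions, not the method used in the labs: it trains a linear classifier by subgradient descent on the regularized hinge loss (the loss behind a linear SVM) instead of calling a dedicated SVM solver, and it uses plain Python although the labs use MATLAB. The function names, hyperparameters, and the 3-bin toy histograms are all hypothetical.

```python
def score(weights, bias, x):
    """Linear decision value w.x + b for a histogram x."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias

def train_linear_classifier(samples, labels, epochs=200, learning_rate=0.1, reg=0.01):
    """Fit weights and bias by subgradient descent on the L2-regularized hinge loss."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            violates_margin = y * score(weights, bias, x) < 1.0
            # The regularization term shrinks the weights on every step.
            weights = [w * (1.0 - learning_rate * reg) for w in weights]
            if violates_margin:
                # Hinge-loss subgradient step: push the score of x towards its label.
                weights = [w + learning_rate * y * xi for w, xi in zip(weights, x)]
                bias += learning_rate * y
    return weights, bias

def predict(weights, bias, x):
    """Return +1 (concept present) or -1 (concept absent)."""
    return 1 if score(weights, bias, x) >= 0.0 else -1

# Hypothetical toy training set: 3-bin histograms labeled +1 / -1.
train_histograms = [
    (0.7, 0.2, 0.1), (0.1, 0.2, 0.7), (0.6, 0.3, 0.1), (0.2, 0.2, 0.6),
]
train_labels = [1, -1, 1, -1]

weights, bias = train_linear_classifier(train_histograms, train_labels)
predictions = [predict(weights, bias, h) for h in train_histograms]
print(predictions == train_labels)
```

Checking the predictions on the training histograms, as in the last two lines, is one simple way to sanity-check this step in isolation before evaluating on held-out images.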