Statistical and Structural Recognition of Human Actions

ECCV 2010 Tutorial
Ivan Laptev (INRIA / École Normale Supérieure)
Greg Mori (Simon Fraser University)


Sunday 5 September 2010 AM
Creta Maris Hotel, Hersonissos, Crete


Resources

ECCV Tutorial slides part 1 [ppt with videos tar.gz] [pdf]
ECCV Tutorial slides part 2 [ppt with videos tar.gz] [pdf]

Code

    Datasets




    Overview


    Automatic recognition of human actions and gestures is an important topic in computer vision. Solving this problem is essential for a number of emerging applications, including indexing of professional and user-generated video archives, automatic video surveillance, and human-computer interaction. Moreover, understanding the function and meaning of many object and scene classes is intertwined with understanding human actions, which highlights the importance of action recognition for solving other computer vision problems.

    The field of human action recognition has evolved considerably in recent years. Local video representations are now used extensively in combination with statistical recognition methods. At the same time, powerful new structural methods have emerged, presenting solutions to action recognition based on recent advances in structured learning. This course will give an introduction to novel trends in statistical and structural action recognition and will illustrate the ideas with examples of successful methods from the recent literature. In particular, we will cover bag-of-features action recognition and will discuss alternative local feature representations and their extensions. We will consider current issues in human action datasets and will address weakly supervised and unsupervised approaches to recognizing human actions. We will then present advances in structural modeling of human poses and cover recent structured learning methods for action recognition. While this course will mostly cover action recognition in video, we will also discuss action recognition from still images, as in the Action Classification Taster Competition of PASCAL VOC 2010.


    Lecture Topics


    1. Introduction
      1. Human actions in science and applications
      2. Historical overview
      3. Problem definitions
      4. Datasets
    2. Statistical methods
      1. Early silhouette and tracking-based methods
      2. Motion-based similarity measures
      3. Template-based methods
      4. Local space-time features
      5. Bag-of-Features action recognition
      6. Weakly-supervised methods
    3. Structural methods
      1. Pose estimation and action recognition
      2. Action recognition in still images  
      3. Human interactions and dynamic scene models
      4. Conclusions and future directions

    Lecturer Biographies

    Ivan Laptev is currently a full-time researcher in the WILLOW team at INRIA – Paris and École Normale Supérieure (ENS). He received his PhD in Computer Science from the Royal Institute of Technology (KTH) in 2004 and his Master of Science degree from the same institute in 1997. He was a research assistant at the Technical University of Munich (TUM) from 1997 to 1999 and joined INRIA in 2004. Ivan’s main research interests concern visual understanding of dynamic scenes, including recognition of human actions, scenes, and object categories. Ivan has published over 30 papers in international conferences and journals on computer vision. He serves as an associate editor of the Image and Vision Computing journal and as an area chair of CVPR 2010, and he is a regular member of the program committees of major international conferences on computer vision. Ivan was awarded the “Prime d’Excellence Scientifique” in 2010.
     
    Greg Mori is currently an assistant professor in the School of Computing Science at Simon Fraser University. He received his Ph.D. in Computer Science from the University of California, Berkeley in 2004 and an Hon. B.Sc. in Computer Science and Mathematics with High Distinction from the University of Toronto in 1999. He spent one year (1997-1998) as an intern at Advanced Telecommunications Research (ATR) in Kyoto, Japan. Dr. Mori’s research interests are in computer vision and include object recognition, human activity recognition, and human body pose estimation. He serves on the program committees of major computer vision conferences (CVPR, ECCV, ICCV) and was the program co-chair of the Canadian Conference on Computer and Robot Vision (CRV) in 2006 and 2007. He is an Associate Editor for IEEE Transactions on Pattern Analysis and Machine Intelligence (T-PAMI). Dr. Mori received the Excellence in Undergraduate Teaching Award from the SFU Computing Science Student Society in 2006 and the Canadian Image Processing and Pattern Recognition Society (CIPPRS) Award for Research Excellence and Service in 2008.