Vision Papers To Read

Last update: May 21, 2015.

This page lists important papers to read in several research directions. Together they should form a good base covering different research methodologies; you can find further work through each paper's reference section. Papers in bold are good starting points.

Historical Papers

Some older research work shaped the progress of computer vision. Reading these papers is encouraged.

  • Normalized Cut, Shi & Malik, 1997
  • SIFT, David Lowe, 1999
  • Face Detection, Viola & Jones, 2001
  • Histograms of Oriented Gradients (HOG), Dalal & Triggs, 2005
  • Spatial Pyramid Matching, Lazebnik, Schmid & Ponce, 2006
  • Deformable Part Model, Felzenszwalb, McAllester, Ramanan, 2009

Action/Activity Classification

  • Large-scale Video Classification with Convolutional Neural Networks
  • Two-Stream Convolutional Networks for Action Recognition in Video
  • 3D Convolutional Neural Networks for Human Action Recognition
  • Sequential Deep Learning for Human Action Recognition
  • BMVC 13 - Spatio-temporal convolutional sparse autoencoder for sequence classification
    • Extracts features with auto-encoders (unsupervised learning), then feeds them to an LSTM that learns the temporal sequence and classifies the actions in the video.

Action Detection

  • Finding Action Tubes. CVPR 2015.

Image Classification

Deep Learning Trend:

  • ImageNet Classification with Deep Convolutional Neural Networks
  • Visualizing and Understanding Convolutional Networks
  • Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
  • Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification
  • Going Deeper with Convolutions

Object Detection and Recognition

Deep Learning Trend:

  • Rich feature hierarchies for accurate object detection and semantic segmentation, CVPR 2014.
  • OverFeat: Integrated Recognition, Localization and Detection using Convolutional Networks, ICLR 2014.

Hand Crafted Features Trend:

  • Object Detection with Discriminatively Trained Part-Based Models, PAMI 2009
  • Poselets: Body Part Detectors Trained Using 3D Human Pose Annotations
  • Beyond sliding windows: Object localization by efficient subwindow search. CVPR 2008.
  • Selective Search for Object Recognition. IJCV 2013.
  • Free-shape subwindow search for object localization. CVPR 2010.
  • Combining efficient object localization and image classification. ICCV 2009.
  • An exemplar model for learning object classes. CVPR 2007.
  • Histograms of oriented gradients for human detection. CVPR 2005.

Image Segmentation

Deep Learning Trend:

  • Fully Convolutional Networks for Semantic Segmentation

Hand Crafted Features Trend:

  • Constrained Parametric Min-Cuts for Automatic Object Segmentation, CVPR 2010.
  • Class segmentation and object localization with superpixel neighborhoods, ICCV 2009.
  • TextonBoost for image understanding: Multi-class object recognition and segmentation by jointly modeling texture, layout, and context, IJCV 2009.
  • Robust higher order potentials for enforcing label consistency, CVPR 2008.
  • Semantic texton forests for image categorization and segmentation, CVPR 2008.
  • Object recognition by integrating multiple image segmentations, ECCV 2008.
  • Efficient graph-based image segmentation, IJCV 2004.