Background readings:
A. Oliva and A. Torralba, "Modeling the shape of the scene: A holistic representation of the spatial envelope," International Journal of Computer Vision, vol. 42, no. 3, pp. 145-175, May 2001. http://dx.doi.org/10.1023/A:1011139631724
A. Efros, A. C. Berg, G. Mori, and J. Malik, "Recognizing action at a distance," ICCV 2003, vol. 2, pp. 726-733. http://dx.doi.org/10.1109/ICCV.2003.1238420
N. Dalal and B. Triggs, "Histograms of oriented gradients for human detection," CVPR 2005, pp. 886-893. http://dx.doi.org/10.1109/CVPR.2005.177
Contemporary readings:
P. F. Felzenszwalb, R. B. Girshick, and D. McAllester, "Cascade object detection with deformable part models," CVPR 2010. http://dx.doi.org/10.1109/CVPR.2010.5539906 Video Lecture: http://videolectures.net/cvpr2010_girshick_codd/ [Allie]
T. Deselaers and V. Ferrari, "Global and efficient self-similarity for object classification and detection," CVPR 2010. http://dx.doi.org/10.1109/CVPR.2010.5539775 Video Lecture: http://videolectures.net/cvpr2010_deselaers_gess/ [Allie]