Computer Vision - CSE440 (Spring 2018)

Indian Institute of Information Technology, Sri City (IIITS)

Instructor: Dr. Shiv Ram Dubey

Class time: Tuesday, Thursday 12.00 noon-1.30 pm

Class location: 304, IFMR Building

Overview: The goal of computer vision is to compute properties of the three-dimensional world from digital images. Problems in this field include reconstructing the 3D shape of an environment, determining how things are moving, and recognizing people, objects, and their activities, all through analysis of images and videos. This course provides an introduction to computer vision, with topics including image formation, feature detection, motion estimation, image mosaics, and object and face detection and recognition. Applications of these techniques include building 3D maps, creating virtual characters, organizing photo and video databases, human-computer interaction, video surveillance, automatic vehicle navigation, and mobile computer vision. Students will implement several computer vision algorithms throughout the semester.

Detailed Syllabus:

Feature Detection, Description, Correspondence and Alignment: Introduction and Overview, Light, Image Formation, Filtering, Edge Detection, Feature Detection, Harris Corner Detection, Invariance and Blob Detection, Feature Descriptors and Matching, Image Transformations, Image Alignment, RANSAC, Hough Transform

Recognition and Learning: Intro to Recognition, Viola-Jones Face Detection, Bag-of-Words Model, Classifiers, Neural Networks, Convolutional Neural Networks (CNN)

Perspective and 3D Geometry: Camera Models, Single-view Geometry and Calibration, Image Stitching, Epipolar Geometry, Stereo, Structure from Motion

Advanced Topics: CNN Applied to Computer Vision such as Detection, Classification, Recognition, Segmentation, etc.

Prerequisites: Data Structures, basic Probability/Statistics, Linear Algebra, Vector Calculus, and a good working knowledge of at least one programming language (Python, MATLAB, C/C++).

Grading: Assignments and the term project should include clear explanatory comments as well as a short report describing the approach, a detailed analysis, and a discussion/conclusion.

  • 10% Mid-Exam-1

  • 10% Mid-Exam-2

  • 20% End-Exam

  • 30% Programming Assignments

  • 30% Term Project

Recommended Books (optional)

Ian Goodfellow, Yoshua Bengio, and Aaron Courville, Deep Learning, MIT Press, 2016 [PDF]

Simon Prince, Computer Vision: Models, Learning, and Inference, Cambridge University Press, 2012 [PDF]

Richard Szeliski, Computer Vision: Algorithms and Applications, Springer, 2010 [PDF]

Forsyth and Ponce, Computer Vision: A Modern Approach, Prentice Hall, 2002 [PDF]

Mubarak Shah, Fundamentals of Computer Vision, 1997 [PDF]

Palmer, Vision Science, MIT Press, 1999

Duda, Hart and Stork, Pattern Classification (2nd Edition), Wiley, 2000

Koller and Friedman, Probabilistic Graphical Models: Principles and Techniques, MIT Press, 2009

Gilbert Strang, Linear Algebra and Its Applications (2nd Edition), Academic Press, 1980

Programming

Python will be the preferred programming environment for the assignments. The following book, which contains Python programming samples for computer vision tasks, is freely available: Python for Computer Vision

Lecture Notes