Computer Vision

Indian Institute of Information Technology, Sri City (IIITS)

Instructor: Dr. Shiv Ram Dubey

Current Offering:

Spring 2020

Overview: The use of visual information technology is growing rapidly. Most of the big IT companies, such as Google, Microsoft, Amazon, and Facebook, are working on visual data analysis, and many startups have emerged in the computer vision area in recent years. Computer vision is also highly relevant to robotics and industrial automation, and it can be applied effectively in smart manufacturing, medicine, biometrics, and other fields. The goal of computer vision is to compute the properties of the three-dimensional world from digital images. Problems in this field include reconstructing the 3D shape of an environment, determining how things are moving, and recognizing people, objects, and their activities, all through the analysis of images and videos. This course provides an introduction to computer vision, with topics including image formation, feature detection, motion estimation, image mosaics, 3D shape reconstruction, object/face detection and recognition, and deep learning. Applications of these techniques include building 3D maps, creating virtual characters, organizing photo and video databases, human-computer interaction, video surveillance, automatic vehicle navigation, robotics, virtual and augmented reality, medical imaging, and mobile computer vision.

Prerequisites: Data structures, basic probability and statistics, linear algebra, vector calculus, and a good working knowledge of a programming language (Python, MATLAB, or C/C++). No prior experience with computer vision is assumed, although previous knowledge of image processing or machine learning will be helpful.

Recommended Books (optional)

Richard Szeliski, Computer Vision: Algorithms and Applications, Springer, 2010 [PDF]

Ian Goodfellow, Yoshua Bengio, and Aaron Courville, Deep Learning, MIT Press, 2016 [PDF]

Simon Prince, Computer Vision: Models, Learning, and Inference, Cambridge University Press, 2012 [PDF]

Forsyth and Ponce, Computer Vision: A Modern Approach, Prentice Hall, 2002 [PDF]

Mubarak Shah, Fundamentals of Computer Vision, 1997 [PDF]

Palmer, Vision Science, MIT Press, 1999

Duda, Hart and Stork, Pattern Classification (2nd Edition), Wiley, 2000

Koller and Friedman, Probabilistic Graphical Models: Principles and Techniques, MIT Press, 2009

Gilbert Strang, Linear Algebra and Its Applications (2nd Edition), Academic Press, 1980

Programming

Python will be the preferred programming environment for the assignments. The following book, which contains Python programming samples for computer vision tasks, is freely available: Python for Computer Vision

Course Ethics

• All class work is to be done independently.

• It is best to try to solve problems on your own, since problem solving is an important component of the course, and exam problems are often based on the outcomes of the assignment problems.

• You are allowed to discuss class material, assignment problems, and general solution strategies with your classmates. However, when it comes to formulating or writing solutions, you must work alone.

• You may use free and publicly available sources, such as books, journal and conference publications, and web pages, as research material for your answers. (You will not lose marks for using external sources.)

• You may not use any paid service, and you must clearly and explicitly cite all outside sources and materials that you made use of.

• I consider the use of uncited external sources to be portraying someone else's work as your own, and as such it is a violation of the University's policies on academic dishonesty.

• Such instances will be dealt with harshly and will typically result in a failing course grade.