## Instructor: Shanmuganathan Raman

Indian Institute of Technology Gandhinagar

- Lecture Hall - Shed 5 (202)
- Lecture Hours - H Slot, 1205-1300 Hours on Monday, Tuesday and Friday
- Office - Shed 5 (216)
- Teaching Assistant - Rajendra Nagar
- Office Hours - Any time, anywhere you spot me
- Email - shanmuga@iitgn.ac.in

## Description

The world we live in has three dimensions (3D). The human visual system has evolved to perceive all these dimensions. However, the images we capture using conventional cameras are just 2D projections of the 3D world. In the 3D Computer Vision course, we shall explore various techniques for recovering the missing third dimension (depth information) from 2D images, primarily using variational methods and projective geometry concepts. The course contents would enable the student to reconstruct a real-world 3D scene from 2D images by various methods. The applications of this course range from cultural heritage to medical imaging, and from robot navigation to 3D modeling. The assignments and projects associated with the course, to be completed using OpenCV, MeshLab, and Frankencamera (NVIDIA Tegra 3 and Nokia N900) kits, would enable students to develop state-of-the-art 3D computer vision applications. This course is offered as an elective for BTech, MTech, and PhD students of IIT Gandhinagar. It is also prescribed for the minor degree in Computer Science.

## Course Contents

- Review of linear algebra, calculus of variations, signals and systems
- Camera and image formation – optics
- Feature detectors – edge and corner detection
- Feature descriptors – SIFT, SURF, feature matching
- Shape from X – reflectance map, BRDF, shape from shading, photometric stereo, depth from defocus, depth from focus, RGB-D images
- Single view geometry – finite projective cameras, camera parameters, point correspondences, estimation of camera matrix, direct linear transformation (DLT)
- Two view geometry – homography, epipolar geometry, estimation of fundamental matrix, image rectification, stereo correspondence, shape from stereo
- Three view geometry – trifocal tensors
- Motion – optical flow field, estimation of dense and accurate optical flow field
- Multi view geometry – structure from motion, triangulation, factorization, bundle adjustment
- Internet vision – mining community photo collections (Flickr, Facebook, etc.)

## Textbooks

- Horn, B. K. P. (1986). *Robot Vision*. The MIT Press.
- Hartley, R. and Zisserman, A. (2004). *Multiple View Geometry in Computer Vision*, 2nd Edition, Cambridge University Press.
- Szeliski, R. (2010). *Computer Vision: Algorithms and Applications*. Springer-Verlag New York Inc. Available online.
- Nixon, M. S., & Aguado, A. S. (2012). *Feature Extraction & Image Processing for Computer Vision*. Third Edition. Academic Press.
- Davies, E. R. (2012). *Computer and Machine Vision: Theory, Algorithms, Practicalities*. 4th Edition. Academic Press.
- Forsyth, D. A., & Ponce, J. (2015). *Computer Vision: A Modern Approach*. Second Edition. Prentice Hall of India.
- Klette, R. (2014). *Concise Computer Vision: An Introduction Into Theory and Algorithms*. Springer Publishing Company, Incorporated.
- Marr, D. (2010). *Vision: A Computational Investigation into the Human Representation and Processing of Visual Information*. The MIT Press.
- Sonka, M., Hlavac, V., & Boyle, R. (2014). *Image Processing, Analysis, and Machine Vision*. 4th Edition. Cengage Learning.
- Trucco, E. and Verri, A. (1998). *Introductory Techniques for 3D Computer Vision*, Prentice-Hall.
- Prince, S. J. (2012). *Computer Vision: Models, Learning, and Inference*. Cambridge University Press. Available online.
- Ikeuchi, K. (2014). *Computer Vision: A Reference Guide*. Springer Publishing Company, Incorporated.
- Fisher, R. B., Breckon, T. P., Dawson-Howe, K., Fitzgibbon, A., Robertson, C., Trucco, E., & Williams, C. K. (2013). *Dictionary of Computer Vision and Image Processing*. John Wiley & Sons.

The book by Marr provides a viewpoint based on visual neuroscience concepts. The last 5 books can be used as reference for certain topics. Apart from these books, some topics would be taught from selected research papers. Lecture notes you make in the classroom will provide pointers to look into topics in different books listed above. The topics taught in a lecture may have evolved from multiple books and research papers. Reading books would certainly aid lectures but can never replace the lectures.

## Suggested Readings

- Trefethen, L. N., and Bau III, D., *Numerical Linear Algebra*, SIAM, 1997.
- Watkins, D. S., *Fundamentals of Matrix Computations*, 3rd Edition, John Wiley & Sons, 2010.
- Courant, R., Robbins, H., and Stewart, I., *What is Mathematics?: An Elementary Approach to Ideas and Methods*, 2nd Edition, Oxford University Press, 1996.
- Gelfand, I. M., and Fomin, S. V., *Calculus of Variations*, Dover Publications, 2000.
- Lathi, B. P., *Signal Processing and Linear Systems*, Oxford University Press, 2000.

These suggested readings supplement the textbooks and reference books and help in understanding various mathematical concepts in depth.

## Grading

- Mid-semester exam - 30%
- End-semester exam - 40%
- Projects and programming assignments - 30%

## Pre-requisites

Exposure to a Signals and Systems course at the UG level is required. Programming experience in C/C++/Python is desirable for successful completion of the assignments.

## Lecture Schedule and Reading Material

## Discussions/Queries/Doubts

Use Piazza.