Introduction to Computer Vision

University Politehnica of Bucharest

Faculty of Automatic Control and Computer Science

Professor: Dr. Marius Leordeanu

Computer Vision is an increasingly important part of Artificial Intelligence. Its goal is to make computers "see" and "understand" the visual world in an intelligent and meaningful manner, much as humans do. The field's objectives include automatically reconstructing the 3D geometry of a scene from images and video sequences, recognizing objects and object categories, and identifying people and understanding their actions and their interactions with other people or objects.

Many methods and models from Computer Vision enable today's computers to interpret and understand images and video under various scenarios, as required by different applications. For example, today's computers can detect and recognize human faces with near-perfect accuracy. They can also recognize different human actions in real time with RGB-D cameras. The list of working computer vision applications is growing rapidly and the field is starting to mature, with a visible impact on industry and everyday life. Driverless cars that rely strongly on computer vision methods have been developed; computer games that recognize human actions are already available and are best sellers; programs that automatically detect and recognize human faces are part of virtually every photo camera and image processing package. Even commonly used online maps with 3D street views are made possible by computer vision methods that automatically align large-scale, 3D laser-scanned urban scenes. Computer vision connects theoretical and experimental results from many disciplines, such as Mathematics, Statistics, Computer Science, Machine Learning, Artificial Intelligence and Neuroscience (Cognitive/Brain Science).

During this class we will start exploring the world of computer vision and discuss both the theoretical and the practical foundations of the field. By solving homework assignments and participating in laboratory classes, students will acquire solid hands-on experience with current computer vision problems, applications and methods.

Prerequisites

Students should be proficient in computer programming, have undergraduate-level knowledge of Mathematics and Computer Science, and be able to quickly learn elements of Linear Algebra, Geometry, Probability and Statistics, Optimization and some Graph Theory. We will cover the necessary theoretical material during the lectures and the laboratory classes. Prior experience with image processing will be useful, but it is not mandatory.

Lectures:

I. Introduction to Computer Vision (2 hours)

II. Low-level processing and local features (4 hours)

  • Linear filters
  • Edges and image derivatives
  • Color and texture
  • Edge detection

III. Grouping and global mid-level reasoning (4 hours)

  • Fitting lines and curves
  • Robust fitting, RANSAC, Hough transform
  • Clustering
  • Segmentation
  • Finding contours
  • Boundary detection revisited

IV. Camera geometry and multiple views (4 hours)

  • Camera model, image formation
  • Planar homography
  • Image warping
  • Epipolar geometry
  • Stereo

V. Object recognition (6 hours)

  • Invariant local features
  • Object recognition with local features
  • Bags-of-words representations
  • Matching
  • Part-based models
  • Object category recognition and detection

VI. Video understanding (6 hours)

  • Motion descriptors
  • Object tracking
  • Background subtraction
  • Dense motion estimation and optical flow
  • Recognizing actions and activities
  • Object discovery

Laboratory:

During the laboratory classes we will discuss the homework assignments and implement related methods. Students will become familiar with MATLAB and learn the algorithms covered in lectures and homework, as well as the theoretical notions behind the course topics (linear algebra, graph theory, probability and statistics, optimization).

Laboratory Topics:

1. MATLAB Tutorial (2 hours)

2. Computational Tools for Computer Vision in MATLAB (2 hours)

3. Low-level processing: Edge detection, filtering and local features (2 hours)

4. Intro to Learning and Classification (2 hours)

5. Mid-level reasoning: grouping and segmentation (2 hours)

6. Intro to Linear Algebra Methods (2 hours)

7. Object recognition methods (6 hours)

8. Video processing methods (6 hours)

Evaluation

a) Grades will be determined as follows: 4 homework assignments (40%) + 1 midterm exam (20%) + class participation (10%) + final exam (30%)

b) To pass, students need at least 50% on the final exam and at least 50% on the rest of the components.
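As an illustration only, the arithmetic behind the grading scheme can be sketched in a few lines of Python. The function and variable names are hypothetical, scores are assumed to be percentages from 0 to 100, and "the rest" is assumed to mean the weighted average of homework, midterm and participation; the official calculation may differ.

```python
# Weights from the grading scheme, in percent of the final grade.
WEIGHTS = {"homework": 40, "midterm": 20, "participation": 10, "final_exam": 30}


def final_grade(scores):
    """Weighted overall score; each entry of `scores` is a percentage (0-100)."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


def rest_average(scores):
    """Weighted average of every component except the final exam.

    Assumption: this is what "the rest" means in the passing rule.
    """
    rest = [name for name in WEIGHTS if name != "final_exam"]
    total_weight = sum(WEIGHTS[name] for name in rest)
    return sum(WEIGHTS[name] * scores[name] for name in rest) / total_weight


def passes(scores):
    """At least 50% on the final exam and at least 50% on the rest."""
    return scores["final_exam"] >= 50 and rest_average(scores) >= 50


# Hypothetical student: 80% homework, 60% midterm, full participation, 55% final.
example = {"homework": 80, "midterm": 60, "participation": 100, "final_exam": 55}
print(final_grade(example), passes(example))
```

Under these assumptions, a student with a high overall score still fails if the final exam score is below 50%, because the two conditions are checked separately.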

Class and laboratory participation and discussion will be encouraged and graded (10% of the final grade). Materials and articles needed for study will be made available free of charge in class and through the course website.

Bibliography and textbooks

[1]. Computer Vision: Algorithms and Applications, Rick Szeliski (draft available online)

[2]. Computer Vision: A Modern Approach, David A. Forsyth and Jean Ponce (draft available online)

Other useful bibliography:

[3]. Visual Object Recognition, Kristen Grauman and Bastian Leibe

[4]. Computer Vision, Linda G. Shapiro and George C. Stockman

[5]. Introductory Techniques for 3D Computer Vision, E. Trucco and A. Verri

[6]. Pattern Classification, Richard O. Duda, Peter E. Hart, and David G. Stork

[7]. Pattern Recognition and Machine Learning, Christopher M. Bishop

Most research papers related to the topics discussed are available online.

Google Scholar is an excellent resource for finding research articles and papers: http://scholar.google.com/