Computer Vision [2022]

EC60002: Computer Vision

*Update: This course, EC60002 (and/or EC61032, EC60064), may in some cases be considered a prerequisite for carrying out the one-year MTech project in the Image Processing and Computer Vision (IPCV) laboratory of the E&ECE department* [along with the EC60501 and EC60502 PG core courses, of course]

Course Syllabus:
[Medium of communication: English]

This is an advanced PG and PhD level course with a state-of-the-art syllabus. It assumes that a registrant is well versed in the basic concepts of linear algebra, probability and random processes, optimization, and signal processing.

Preferred Course Prerequisites

EC61409: Neural Networks and Applications [offered in the autumn semesters]

If you have not taken the above course, you are advised to go through introductory video lectures on at least the following Deep Learning (DL) topics before part (2) of this course starts:
Multilayer Perceptron (MLP), Convolutional Neural Networks (CNN), Recurrent Neural Networks (RNN) and Generative Adversarial Networks (GAN).
For example, you may consider the relevant lectures of the CS231n course offered by Stanford University (do a web search!) or those from the Deep Learning course by Prof. P. K. Biswas of IIT Kharagpur's E&ECE dept, provided by NPTEL on YouTube. The relevant video lectures from EC61409 will also be shared with those who register for this course.

The EC60002 course is divided into 4 unequal parts:

(1) Color -
Color Photography, Color Matching and Reproduction, Color Coordinate Systems & Color Differences, Color Representation & Color Vision, Color Filter Array, Demosaicing & Deinterlacing, Color Balancing and Gamma, Color Constancy and Retinex

(2) Features -
Local Descriptors: Corner, SIFT, LBP, HOG
Edge Detection and Linking: LoG, Canny
Image Features: Steerable Filters, Shape & Texture, DL based Perceptual Features
Motion: Optical Flow, Block Matching, Parametric Motion, Global Motion, FlowNet
Depth: Depth from Structure, DL based Depth, Structure & Depth from Egomotion
Full-reference Quality: SSIM, FSIM

(3) Processing -
Generic Filters: LMMSE Filter, Order-statistic Filter, Bilateral Filter, Nonlinear Means, Non-local Means
Video Filters: Spatio-temporal Filtering, Blur Reduction
Deconvolution: Unsharp masking, LMMSE & Bayes based Deconvolution, CNN based Deconvolution
Super-resolution (SR): Splines, Single Image SR, CNN SR, GAN SR
Low-light Image Enhancement: Retinex based Contrast Enhancement, Illumination Enhancement, DL based Enhancement
Dehazing: Prior-based Single Image Dehazing, DL based Image and Video Dehazing

(4) Decision-making -
Saliency Computation: Image and Video Saliency, DL based Image and Video Saliency
Segmentation: Superpixels, Mean Shift and Mode Seeking Segmentation, CNN based Semantic Image and Video Segmentation
Object Detection and Recognition: Contrast based Salient Object Detection (SOD), DL based SOD, Video SOD, You Only Look Once (YOLO), Region based CNN (R-CNN) Variants
Retargeting and Inpainting: Seam Carving for Image and Video Retargeting, Image Inpainting, DL based Inpainting
Categorization and Captioning: Bag of words, EfficientNet on ImageNet, Image Captioning

Books & More:

- Research Papers @ IEEE TIP, IEEE TPAMI, CVPR, etc.
- Computer Vision: Algorithms and Applications by Richard Szeliski
- Fundamentals of Digital Image Processing by Anil K. Jain
- Digital Video Processing by A. Murat Tekalp
- Digital Image Processing by Rafael C. Gonzalez and Richard E. Woods
- Image Processing for Cinema by Marcelo Bertalmío
- The Essential Guide to Video Processing by Alan C. Bovik

Class Timings:

[Open Slot] Thursdays: 6.00pm - 7.30pm, Fridays: 6.00pm - 8.30pm

Online Lecture Management:

- Google Classroom [invitation based, exclusive to those who officially register]
