Winter 2018

Location: Cruess Hall 107
Time: Tues & Thurs 12:10pm-1:30pm
Units: 4

Instructor: Yong Jae Lee
Email: yongjaelee@ucdavis  (email subject should begin with "[ECS 289G]")
Office: Academic Surge 2075
Office hours: By appointment

TA: Chongruo Wu
Email: crwu@ucdavis  (email subject should begin with "[ECS 289G]")
Office hours: By appointment

  • (1/8) Please read this website and the detailed course requirements and grading criteria very carefully.
  • (2/5) Final project proposal due on 2/8.
  • (3/2) Final project presentations on 3/13 & 15.
  • (3/2) Final project report due on 3/20.

Course Overview

This graduate seminar course will survey papers in a broad range of topics in computer vision, including object recognition, activity recognition, and scene understanding.  The course goals will be to understand and analyze state-of-the-art techniques, and to identify interesting open questions and future directions.  It should be of relevance to students interested in computer vision and machine learning.


Prerequisites: a course in computer vision and a course in machine learning.  Programming will be required for the final project.  Please talk to me if you are unsure whether the course is a good match for your background.


The bulk of the course will consist of student paper presentations.  Students will be responsible for writing paper reviews each week, participating in discussions, presenting once in class, and completing a final project.


We will use Canvas for paper review and project proposal/report submissions and grading.  Our class page:


The final grade will be determined by:
  • Paper reviews (25%)
  • Class participation (25%)
  • Paper presentation (25%)
  • Final project (25%)
Important Dates
  • 2/8: Final project proposal due
  • 3/13 & 15: Final project presentations
  • 3/20: Final project report due

Detailed course requirements and grading are here.

 Date  Papers  Presenters
 1/9  Introduction   Yong Jae Lee [pdf]

 1/11  Research Overview  Yong Jae Lee [pdf]

 1/16  PyTorch Tutorial  Chongruo Wu [pdf]

 1/18  Image Classification

 ImageNet Classification with Deep Convolutional Neural Networks. A. Krizhevsky, I. Sutskever, and G. E. Hinton. NIPS 2012.
 Deep Residual Learning for Image Recognition. Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun. CVPR 2016.
Markham Anderson + Baotuan Nguyen [pdf]

Xuanqing Liu + Wei-Pang Jan [pdf]

 1/23  Supervised Pretraining of Convolutional Neural Networks

 DeCAF: A Deep Convolutional Activation Feature for Generic Visual Recognition. J. Donahue, Y. Jia, O. Vinyals, J. Hoffman, N. Zhang, E. Tzeng, and T. Darrell. ICML 2013.
 How transferable are features in deep neural networks? J. Yosinski, J. Clune, Y. Bengio, and H. Lipson. NIPS 2014. 
Prashant Gupta + Yash Bhartia [pdf]  

Haolin Yang + Kaiqi Zhang [pdf]

 1/25  Object Detection (Two stages)

 Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation. R. Girshick, J. Donahue, T. Darrell, J. Malik. CVPR 2014. 
 Fast R-CNN. R. Girshick. ICCV 2015.
Grachya Hovhannisyan + Aman Asrani [pdf]

Doug Sherman + Nick Joodi [pdf]

 1/30  Object Detection (Single stage)

 SSD: Single Shot MultiBox Detector. Wei Liu, Dragomir Anguelov, Dumitru Erhan, Christian Szegedy, Scott Reed, Cheng-Yang Fu, and Alexander C. Berg. ECCV 2016.
 You Only Look Once: Unified, Real-Time Object Detection. Joseph Redmon, Santosh Divvala, Ross Girshick, and Ali Farhadi. CVPR 2016.
 Hongjing Zhang + Chen Zhang [pdf]

Quan Zou + Liyang Zhong [pdf]

 2/1  CNN Visualization and Analysis

 Visualizing and Understanding Convolutional Networks. M. Zeiler and R. Fergus. ECCV 2014.
 Analyzing the Performance of Multilayer Neural Networks for Object Recognition. P. Agrawal, R. Girshick, J. Malik. ECCV 2014.
Robbie Sadre [pdf]

Yao Li + Suofei Wu [pdf]

 2/6  Google Cloud Tutorial / CNN Review Chongruo Wu [pdf]

 2/8  Fooling CNNs (final project proposal due)

 Deep Neural Networks are Easily Fooled: High Confidence Predictions for Unrecognizable Images. A. Nguyen, J. Yosinski, J. Clune. CVPR 2015.
Countering Adversarial Images using Input Transformations. Chuan Guo, Mayank Rana, Moustapha Cisse, Laurens van der Maaten. arXiv 2017.
Bradley Wang [pdf]

Zainul Abi Din + Hari Venugopalan [pdf]

 2/13  Segmentation

Mask R-CNN. Kaiming He, Georgia Gkioxari, Piotr Dollar, Ross Girshick. ICCV 2017.
 Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs. Liang-Chieh Chen, George Papandreou, Iasonas Kokkinos, Kevin Murphy, Alan L. Yuille. ICLR 2015.
Xiaokang Wang + Mengyao Shi [pdf]  

Carlos Feres + Mark Weber [pdf]

 2/15  Weakly-Supervised Learning

Hide-and-Seek: Forcing a Network to be Meticulous for Weakly-supervised Object and Action Localization. Krishna Kumar Singh and Yong Jae Lee. ICCV 2017.
 Weakly-supervised Visual Grounding of Phrases with Linguistic Structures. Fanyi Xiao, Leonid Sigal and Yong Jae Lee. CVPR 2017.
Yixian Wang + Ethan Hou [pdf]

Dian Yu + Weiming Wen [pdf]

 2/20  Self-Supervision

 Cross-Domain Self-supervised Multi-task Feature Learning using Synthetic Imagery. Zhongzheng Ren, Yong Jae Lee. arXiv 2017.
 Look, Listen and Learn. R. Arandjelović, A. Zisserman. ICCV 2017.
Zhongzheng (Jason) Ren [pdf]

Gregory Rehm + Shahbaz Rezaei [pdf]

 2/22  Reinforcement Learning

Active Object Localization with Deep Reinforcement Learning. J. Caicedo and S. Lazebnik. ICCV 2015.
 Action-Decision Networks for Visual Tracking with Deep Reinforcement Learning. Sangdoo Yun et al. CVPR 2017.
Chongruo Wu [pdf]

Sadegh Shamsabardeh + Ali Khodadadi [pdf]

 2/27  People 

 End-to-End Localization and Ranking for Relative Attributes. Krishna Kumar Singh and Yong Jae Lee. ECCV 2016.
 Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields. Zhe Cao, Tomas Simon, Shih-En Wei and Yaser Sheikh. CVPR 2017.
Krishna Kumar Singh [pdf]

Suraj Kesavan + Priscilla Jennifer Jean Pierre [pdf]

 3/1  Neural Network Art

 Image Style Transfer Using Convolutional Neural Networks. L. Gatys, A. Ecker, M. Bethge. CVPR 2016.
Universal Style Transfer via Feature Transforms. Yijun Li, Chen Fang, Jimei Yang, Zhaowen Wang, Xin Lu and Ming-Hsuan Yang. NIPS 2017.
Zach Harris + Felix Portillo [pdf]

Trevor Chan + Ibrahim Ahmed [pdf]

 3/6  Generative Adversarial Networks

Image-to-Image Translation with Conditional Adversarial Networks. Phillip Isola, Jun-Yan Zhu, Tinghui Zhou, and Alexei A. Efros. CVPR 2017.
StarGAN: Unified Generative Adversarial Networks for Multi-Domain Image-to-Image Translation. Yunjey Choi et al. arXiv 2017.
Mingyang Zhou + Terry Yang [pdf]

Tina Mashhour [pdf]

 3/8  No class

 3/13  Final Project Presentations

 3/15  Final Project Presentations  


This course has been inspired by the following courses:
