Fall 2016

Location: Haring Hall 2016
Time: Tues & Thurs 6:10pm-7:30pm
Units: 4

Instructor: Yong Jae Lee
Email: yjlee@cs  (email subject should begin with "[ECS 289G]")
Office: Academic Surge 2075
Office hours: By appointment



Announcements
  • Final project presentations on 11/29, 12/1.  See here for more details.
  • Final project report due 12/6, 11:59 pm.  Please email your report in pdf format to the instructor.  See here for more details.


Course Overview

This graduate seminar course will survey papers in a broad range of topics in computer vision, including object recognition, activity recognition, and scene understanding.  The course goals will be to understand and analyze state-of-the-art techniques, and to identify interesting open questions and future directions.  It should be of relevance to students interested in computer vision and machine learning.


Prerequisites

A course in computer vision and a course in machine learning.  Programming will be required for the final project.  Please talk to me if you are unsure if the course is a good match for your background.


Requirements

The bulk of the course will consist of student paper presentations.  Students will be responsible for writing paper reviews each week, participating in discussions, presenting once or twice in class (depending on enrollment), and completing a final project.


Grading

The final grade will be determined by:
  • Paper reviews (25%)
  • Class participation (25%)
  • Paper presentation (25%)
  • Final project (25%)
 
Important Dates
  • 10/25: Final project proposal due
  • 12/6: Final project report due



Detailed course requirements and grading are here.





Schedule
 
 Date  Papers  Presenters
 9/22  Introduction   Yong Jae Lee [pdf]

 9/27  Research Overview  Yong Jae Lee [pdf]

 9/29  CNN basics/Caffe Tutorial  Fanyi Xiao [pdf]
 Krishna Kumar Singh [pdf]
  

 10/4  Image Classification

 ImageNet classification with deep convolutional neural networks. A. Krizhevsky, I. Sutskever, and G. E. Hinton. NIPS 2012.
 Deep Residual Learning for Image Recognition. Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun. CVPR 2016.
 Yuanzhe Li [pdf]
 Xiaoyun Wang [pdf]

 10/6
 Supervised Pretraining of Convolutional Neural Networks

 DeCAF: A Deep Convolutional Activation Feature for Generic Visual Recognition. J. Donahue, Y. Jia, O. Vinyals, J. Hoffman, N. Zhang, E. Tzeng, and T. Darrell. ICML 2013.
 How transferable are features in deep neural networks? J. Yosinski, J. Clune, Y. Bengio, and H. Lipson. NIPS 2014. 
 Huan Zhang [pdf]
 Shubo Chen [pdf]

 10/11  Object Detection

 Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation. R. Girshick, J. Donahue, T. Darrell, J. Malik. CVPR 2014. 
 Track and Transfer: Watching Videos to Simulate Strong Human Supervision for Weakly-Supervised Object Detection. Krishna Kumar Singh, Fanyi Xiao, and Yong Jae Lee. CVPR 2016.
 Mohamed Alkaoud [pdf]
 Oleg Igouchkine [pdf]

 10/13
 CNN Visualization and Analysis (I)

 Visualizing and Understanding Convolutional Networks. M. Zeiler and R. Fergus. ECCV 2014.
 Analyzing the Performance of Multilayer Neural Networks for Object Recognition. P. Agrawal, R. Girshick, J. Malik. ECCV 2014.
 Jason Driver [pdf]
 Patrick Chen [pdf]

 10/18  CNN Visualization and Analysis (II)

 Object Detectors Emerge in Deep Scene CNNs. Bolei Zhou, Aditya Khosla, Agata Lapedriza, Aude Oliva, Antonio Torralba. ICLR 2015. 
 What makes ImageNet good for transfer learning? Minyoung Huh, Pulkit Agrawal, Alexei A. Efros. arXiv 2016.
 Collin Mccarthy [pdf]
 Ismail [pdf]
 

 10/20  Fooling CNNs

 Intriguing properties of neural networks. C. Szegedy, W. Zaremba, I. Sutskever, J. Bruna, D. Erhan, I. Goodfellow, R. Fergus. ICLR 2014.
 Deep Neural Networks are Easily Fooled: High Confidence Predictions for Unrecognizable Images. A. Nguyen, J. Yosinski, J. Clune. CVPR 2015.
 Juran Zhang [pdf]  
 Zhenghao Fei [pdf]

 10/25  Final project proposal due

 Segmentation


 Fully Convolutional Networks for Semantic Segmentation. J. Long, E. Shelhamer, and T. Darrell. CVPR 2015.
 Instance-sensitive Fully Convolutional Networks. Jifeng Dai, Kaiming He, Yi Li, Shaoqing Ren, and Jian Sun. ECCV 2016.
 Zilong Bai [pdf]

 10/27  Action Recognition

 Learning Spatiotemporal Features with 3D Convolutional Networks. D. Tran, L. Bourdev, R. Fergus, L. Torresani, and M. Paluri. ICCV 2015.
 Actions~Transformations. X. Wang, A. Farhadi, and A. Gupta. CVPR 2016.  
 Leonardo Ferrer [pdf]
 Chongruo Wu [pdf]

 11/1  Self-Supervision (I)

 
Unsupervised Visual Representation Learning by Context Prediction. C. Doersch, A. Gupta, A. Efros. ICCV 2015.
 Unsupervised Learning of Visual Representations by Solving Jigsaw Puzzles. M. Noroozi, P. Favaro. ECCV 2016.
 Yisha Sun [pdf]
 Chenshan Yuan [pdf]

 11/3  Self-Supervision (II)

 
 Ambient Sound Provides Supervision for Visual Learning. A. Owens, J. Wu, J. McDermott, W. Freeman, A. Torralba. ECCV 2016.
 Colorful Image Colorization. Richard Zhang, Phillip Isola, Alexei A. Efros. ECCV 2016.
 Baotuan Nguyen [pdf]
 Suraj Kesavan [pdf]

 11/8  Neural Network Art

 Texture synthesis and the controlled generation of natural stimuli using convolutional neural networks. L. Gatys, A. Ecker, M. Bethge. NIPS 2015.
 Image Style Transfer Using Convolutional Neural Networks. L. Gatys, A. Ecker, M. Bethge. CVPR 2016.
 Chang Gou [pdf]
 Wenjian Hu [pdf]

 11/10  Attributes

 End-to-End Localization and Ranking for Relative Attributes. K. Singh and Y. J. Lee. ECCV 2016.
 Walk and Learn: Facial Attribute Representation Learning from Egocentric Video and Contextual Data. J. Wang, Y. Cheng, and R. Feris. CVPR 2016.
 Minhao Cheng [pdf]
 Devika Joshi [pdf]

 11/15  No class  
 

 11/17  Recurrent Neural Networks

 
 Recurrent neural network based language model. T. Mikolov, M. Karafiat, L. Burget, J. Cernocky, S. Khudanpur. Interspeech 2010.
 
Visualizing and Understanding Recurrent Networks. A. Karpathy, J. Johnson, L. Fei-Fei. ICLR 2016 Workshop.
 Vincent Hellendoorn [pdf]
 Ismail [pdf]

 11/22  Active Perception

 The Curious Robot: Learning Visual Representations via Physical Interactions. L. Pinto, D. Gandhi, Y. Han, Y-L. Park, and A. Gupta. ECCV 2016. 
 Look-Ahead Before You Leap: End-to-End Active Recognition by Forecasting the Effect of Motion. D. Jayaraman and K. Grauman. ECCV 2016. 
 Chen Peng [pdf]
 Jason Ren [pdf]

 11/24  Thanksgiving (no class)  

 11/29  Final Project Presentations

 12/1  Final Project Presentations  





Acknowledgements

This course has been inspired by the following courses:
