Profs. Trevor Darrell and Alyosha Efros; trevor,firstname.lastname@example.org
Spring 2015, CCN 26876
LOCATION AND TIME: Newton room, Room 730 Sutardja Dai Hall, Friday 10am-12pm
This course covers computer vision and machine learning techniques for object and category recognition, as well as recognition of human activity from video streams. Emphasis will be placed on recent techniques based on layered perceptual representation learning, a.k.a. "deep" learning. Recognition of individual objects or activities (the coffee cup on your desk, a particular chair in your office, a video of you riding your bike) or generic categories (any cup, chair, or cycling event) is an essential capability for a variety of robotics and multimedia applications. The advent of standardized datasets and evaluation regimes has spurred considerable innovation in this arena, with performance on benchmark evaluations increasing dramatically in recent years. This course reviews methods from the recent literature (past 6-9 months) that have achieved success on such challenge problems, and may also consider the techniques needed for real-time interactive application on robots or mobile devices, e.g. domestic service robots or mobile phones that can retrieve information about objects in the environment based on visual observation. This class is based exclusively on readings from the recent literature, including those appearing at the CVPR, ICCV, ECCV, ICML, and NIPS conferences.
The course includes discussion of papers as well as tutorials covering implementation and experimentation using CAFFE. The class can be taken for variable units: 1 unit for presentation/participation only, 2 units for presentation and a basic project, or 3 units for presentation and a journal-quality publishable paper or major open-source CAFFE coding project. Graduate student auditors are discouraged, and will have the same presentation workload as those taking the course for credit. (Postdocs and others who do not normally register for CS graduate classes are welcome to audit on a space-available basis, and will also have the same presentation workload.) Students who choose to take the course for additional units will be encouraged to propose projects that extend CAFFE in a new direction, construct a novel application in CAFFE, and/or replicate published methods in the CAFFE framework. Each week we will cover several papers related to the weekly topic. Students will be assigned (possibly as a small team) to cover one or more papers during the term. Where appropriate, presenters will also be encouraged to obtain or implement demonstration code for the presented methods, and to port it to the CAFFE framework if possible. Grades will be assigned based on the quality of the in-class presentation, implementation of associated methods, participation in discussions, and project quality (if applicable).
Prerequisite: Active research effort on related topic. Permission of instructor.
We will share ideas, code, and presentations through the course repository on GitHub. Note that the repository is private to the course: send your GitHub username to Evan Shelhamer (email@example.com) with the subject "vis course github" for access.