Overview

Prof. Trevor Darrell, trevor@eecs.berkeley.edu

Spring 2012

LOCATION AND TIME CHANGE: Newton room, 7th floor, Sutardja Dai Hall, Friday 9:30-11:30am.

PLEASE SIGN UP FOR THE GOOGLE GROUP "UCB CS294-43 Reading Group" AT https://groups.google.com/group/ucbcs29443 FOR EMAIL UPDATES.

This course historically covers computer vision techniques for object and category recognition, as well as recognition of human activity from video streams.  Recognition of individual objects or activities (the coffee cup on your desk, a particular chair in your office, a video of you riding your bike) or generic categories (any cup, chair, or cycling event) is an essential capability for a variety of robotics and multimedia applications.  The advent of standardized datasets and evaluation regimes has spurred considerable innovation in this arena, with performance on benchmark evaluations increasing dramatically in recent years.  This course reviews methods that have achieved success on such datasets, and will also consider the techniques needed for real-time interactive applications on robots or mobile devices, e.g. domestic service robots or mobile phones that can retrieve information about objects in the environment based on visual observation.  This class is based exclusively on readings from the recent literature, including those appearing at the CVPR, ICCV, and NIPS conferences.

This year's course will have expanded subject matter, and will cover object and activity learning for perceptually grounded robotic language acquisition and execution.  We will focus on multimodal perception, including visual, acoustic, and haptic sensing.  We will be reading papers relevant to the Berkeley and UPenn BOLT-E project, and/or from the recent NIPS workshop on language and vision.  Specific topics will include:

  • Perceptually grounded language 
  • Contemporary vision and language models
  • Structured activity models 
  • Haptic and acoustic grounding frameworks
  • Grasping and interactive learning
  • Bayesian learning of concept models 
  • Models of spatial relations and navigation using language
  • Robotic activity learning by demonstration
  • Domain adaptation and learning from the web
  • Sub-category recognition and lexical hierarchies

Please click here for the current syllabus.

The format of the course this year will be primarily discussion-based, with each class beginning with a short overview of the topic by the instructor, followed by student-led presentations and structured critique of selected papers.  Class size will be limited to foster an environment conducive to discussion.

Students are expected to be involved in a related research project during the term, and to make a research presentation at the end of the class.  Units are variable, depending on the scope of the research project undertaken by the student.

Prerequisite: Active research effort on related topic. Permission of instructor.