Human-Robot Interaction:
Visual Imitation Learning
Course Overview
CS 933 in Spring 2023 focuses on Visual Imitation Learning (visual-IL), also known as learning from visual demonstration. The central idea of visual-IL is to make a robot learn a task only from video demonstrations. Here is an example of our own YuMi robot learning to make a cup of tea from video demonstrations by humans.
Learning a task policy from video demonstration, whether low-level (such as reaching, pulling, and pushing) or high-level (such as furniture assembly, cooking, and activities of daily living), is extremely challenging for several reasons, including the visual correspondence problem, the very high-dimensional state space, and the uncertainty inherent in vision data. Despite being the most convenient way to teach a robot, visual-IL received little study because of these challenges. However, the rise of deep neural networks and their ability to process high-dimensional video data has triggered intense recent interest in visual-IL. In this course we will study the state of the art in visual-IL.
Learning Expectations
This is a seminar-style course. We will read and analyze research papers on visual-IL. Depending on students' interests, we may identify a particular sub-problem in visual-IL and work together to develop a solution.
By the end of this course, you will
have a solid understanding of imitation learning as an AI research problem
know the state of the art in visual-IL
be able to implement a visual-IL algorithm on a physical robot
Since this is a seminar-style course, a general objective is to strengthen your ability to analyze scientific research systematically and to communicate scholarly ideas and comments clearly to a peer audience.
Textbook
There is no assigned textbook. All necessary materials will be provided in class.
Grading
This is a project-based course, and the final project carries the largest weight in grading.