Learning From Unlabeled Videos

CVPR 2020 Workshop

Seattle, WA

  • Feb 17, 2020: Site goes live! CMT submission will open soon.
  • Dec 4, 2019: Site under construction

Overview

Deep neural networks trained with a large number of labeled images have recently led to breakthroughs in computer vision. However, we have yet to see a similar level of breakthrough in the video domain. Why is this? Should we invest more in supervised learning, or do we need a different learning paradigm?

Unlike images, videos contain extra dimensions of information such as motion and sound. Recent approaches leverage such signals to tackle various challenging tasks in an unsupervised/self-supervised setting, e.g., learning to predict certain representations of the future time steps in a video (RGB frame, semantic segmentation map, optical flow, camera motion, and corresponding sound), learning spatio-temporal progression from image sequences, and learning audiovisual correspondences.

This workshop aims to promote comprehensive discussion around this emerging topic. We invite researchers to share their experiences and knowledge in learning from unlabeled videos, and to brainstorm brave new ideas that will potentially generate the next breakthrough in computer vision.

Invited Speakers

Alyosha Efros

UC Berkeley

Ivan Laptev

INRIA

Jitendra Malik

UC Berkeley / FAIR

Ming-Yu Liu

NVIDIA Research

Pierre Sermanet

Google Research

Organizers

Yale Song

Microsoft Research

Carl Vondrick

Columbia University

Katerina Fragkiadaki

CMU

Honglak Lee

University of Michigan / Google Research

Rahul Sukthankar

Google Research