Learning From Unlabeled Videos

CVPR 2020 Workshop

  • Dec 4, 2019: Site under construction


Deep neural networks trained with a large number of labeled images have recently led to breakthroughs in computer vision. However, we have yet to see a similar level of breakthrough in the video domain. Why is this? Should we invest more in supervised learning, or do we need a different learning paradigm?

Unlike images, videos contain extra dimensions of information such as motion and sound. Recent approaches leverage such signals to tackle various challenging tasks in an unsupervised/self-supervised setting, e.g., learning to predict certain representations of the future time steps in a video (RGB frame, semantic segmentation map, optical flow, camera motion, and corresponding sound), learning spatio-temporal progression from image sequences, and learning audio-visual correspondences.

This workshop aims to promote comprehensive discussion around this emerging topic. We invite researchers to share their experiences and knowledge in learning from unlabeled videos, and to brainstorm brave new ideas that will potentially generate the next breakthrough in computer vision.