Efficient Video Understanding: State-of-the-art, Challenges, and Opportunities

October 17, 2021, 12:00 - 16:00 EDT | Virtual, in conjunction with ICCV 2021

Recordings of all the talks, along with the slides, are available on the program page.

Widespread visual sensors and unprecedented connectivity have left us awash with video data -- ranging from consumer home videos to enterprise video data across many different industries, including media and entertainment, healthcare, and safety/security. Much progress has been made in computer vision to automatically understand this complex video content. However, despite impressive results on commonly used benchmark datasets, efficiency remains a great challenge due to the high computation, memory, energy, and labeled-data requirements of deep video understanding models. This poses an issue for deploying these models in many resource-limited applications, such as autonomous vehicles and mobile platforms, and in domains where fast inference is essential, such as video analysis for media and entertainment. Motivated by the need for efficiency, extensive recent studies in computer vision have focused on designing compact models for computational efficiency, learning from unlabeled videos for data or label efficiency, and neural architecture search for design efficiency. Moreover, important new research topics and problems have recently emerged, including (1) efficient multi-modal learning, (2) dynamic neural networks, (3) self-supervised learning for videos, (4) design of specialized hardware, (5) automatic design of architectures, and (6) energy-efficient computing for green AI. While these recent works and problems are opening up new paths forward, our understanding of the different aspects of efficiency in video understanding remains far from complete. The objective of this tutorial is to present the audience with a unifying perspective on efficient video understanding from both theoretical and application standpoints, to discuss the state of the art and its challenges, and to motivate future research and highlight opportunities that will spur disruptive progress in the emerging field of efficient computer vision for democratizing AI technology.


Contact: For questions or feedback, please email Rameswar Panda at rpanda@ibm.com.