Multi-Modal Learning from Videos

CVPR 2019, Long Beach, June 16, 2019

"Multisensory Integration (also known as multimodal integration) describes a process by which information from different sensory systems is combined to influence perception, decisions, and overt behavior."



Video data is growing explosively as a result of ubiquitous acquisition capabilities. Videos captured by smartphones, ground surveillance systems, and body-worn cameras can easily reach the scale of gigabytes per day. While this "big video data" is a great source for information discovery and extraction, the computational challenges are unparalleled. Intelligent algorithms for automatic video understanding, summarization, retrieval, etc. have emerged as a pressing need in this context. Progress on this topic will enable autonomous systems to act quickly and decisively based on the information in videos, which otherwise would not be possible.

Call for Papers

We call for papers and ideas on topics including, but not limited to, the following:

  • Multi-modal action classification and localization
  • Self-supervised representation learning from videos
  • Auditory scene perception
  • Understanding movies and books
  • Video generative models
  • Video captioning
  • Video reasoning
  • Video and wireless
  • Multi-modal visual navigation
  • Multi-modal robotic planning
  • Video summarization, especially for
    • emerging types of videos such as vlogs, hours-long egocentric videos, real-time video streams, etc.
    • benchmark datasets and novel evaluation methods
    • query-focused and interactive summarization
    • multimodal summaries (e.g., short clips, GIFs, text)


We encourage submissions of up to 6 pages (excluding references, in the CVPR 2019 camera-ready format) for previously unpublished work. We also encourage submissions of up to 2 pages (excluding references, in the CVPR 2019 camera-ready format) for published work that is related to this workshop. Reviewing will be single-blind. Accepted extended abstracts will be made publicly available through the CVF open access archive.

Submission link:

Important Dates

  • Submission deadline: Extended to April 16, 2019
  • Notification to authors: Extended to April 27, 2019
  • Camera ready deadline: Extended to May 10, 2019