Welcome to the 2015 Multimodal Machine Learning workshop homepage!

From early research on audio-visual speech recognition to more recent language and vision projects such as image and video captioning, multimodal machine learning is a vibrant multi-disciplinary research field that addresses some of the original goals of artificial intelligence (AI) by integrating and modeling multiple communicative modalities, including linguistic, acoustic, and visual messages. The field poses unique challenges for machine learning researchers, given the heterogeneity of the data and the contingency often found between modalities. This workshop will bring together researchers from natural language processing, multimedia, computer vision, speech processing, and machine learning to discuss the current challenges in multimodal machine learning and to identify the research infrastructure needed to enable stronger collaboration among multi-disciplinary researchers.


Louis-Philippe Morency

Aaron Courville

Tadas Baltrušaitis

KyungHyun Cho

Important dates: