
Multimodal Learning

Continued progress in digital data acquisition, storage, and communication technology generates huge amounts of multimedia data (text, images, audio, and video, among others), which are made widely available through internet-supported sharing platforms. This information is a valuable source of knowledge, but it can be properly exploited only with effective tools that integrate the complementary information carried by the different perspectives of such heterogeneous data.

The main goal is to investigate effective and efficient methods to combine complementary evidence and model the relationships between the multiple modalities (or views) of multimedia data, in order to obtain valuable insights about the data and improve performance in tasks such as content-based retrieval, exploration, and automatic annotation and classification.
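Two common ways of combining evidence from multiple modalities are early fusion (merging the per-modality feature representations) and late fusion (merging per-modality prediction scores). The sketch below illustrates both with toy NumPy vectors; the feature dimensions, the three-class setup, and the equal fusion weights are illustrative assumptions, not part of any specific method described here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy per-modality feature vectors for a single item
# (dimensions are hypothetical placeholders for real embeddings).
text_feat = rng.normal(size=128)   # e.g. a text embedding
image_feat = rng.normal(size=256)  # e.g. an image embedding

# Early fusion: concatenate modality features into one joint representation,
# which a single downstream model would then consume.
early = np.concatenate([text_feat, image_feat])
print(early.shape)  # (384,)

def softmax(z):
    """Turn raw scores into a probability distribution."""
    e = np.exp(z - z.max())
    return e / e.sum()

# Late fusion: each modality has its own classifier; only their class-score
# distributions are combined (here, an equal-weight average over 3 classes).
text_scores = softmax(rng.normal(size=3))   # scores from a text-only model
image_scores = softmax(rng.normal(size=3))  # scores from an image-only model
late = 0.5 * text_scores + 0.5 * image_scores

print(late.argmax())  # index of the predicted class after fusion
```

In practice the fusion weights can be learned rather than fixed, and richer approaches model cross-modal interactions directly instead of treating each modality independently.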