Trimodal Fusion Approach
The objective of this project was to explore the then relatively uncharted territory of modeling truly multimodal data, i.e., data with more than two modalities to fuse. Our use case was predicting the psychological distress levels of individuals from their verbal (text) and non-verbal (acoustic and visual) behavior. We designed a novel hierarchical classification technique: as shown in the figure, the first layer consists of classifiers on features of the individual modalities, while a second layer combines the posterior probabilities from the first layer using an Expectation-Maximization (EM) approach to generate the final predictions.
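To make the two-layer idea concrete, here is a minimal sketch of one common way such a second layer can work: treat the fused posterior as a weighted mixture of the unimodal posteriors and learn the per-modality weights by EM on a held-out set. All names (`em_fusion_weights`, `fuse`), the array layout, and this particular mixture formulation are illustrative assumptions, not the paper's exact method.

```python
import numpy as np

def em_fusion_weights(posteriors, labels, n_iter=100, tol=1e-6):
    """Learn per-modality mixture weights by EM (illustrative sketch).

    posteriors : array (M, N, C) -- posterior p_m(c | x_i) from each of
        M unimodal first-layer classifiers, for N samples and C classes
        (a hypothetical layout, not necessarily the paper's).
    labels : array (N,) of true class indices for a held-out set.
    """
    M, N, _ = posteriors.shape
    w = np.full(M, 1.0 / M)  # start from uniform modality weights
    # likelihood of the true label under each modality: shape (M, N)
    p_true = posteriors[:, np.arange(N), labels]
    prev_ll = -np.inf
    for _ in range(n_iter):
        # E-step: responsibility of modality m for sample i
        joint = w[:, None] * p_true               # (M, N)
        total = joint.sum(axis=0, keepdims=True)  # (1, N)
        resp = joint / np.clip(total, 1e-12, None)
        # M-step: new weight = average responsibility
        w = resp.mean(axis=1)
        ll = np.log(np.clip(total, 1e-12, None)).sum()
        if ll - prev_ll < tol:  # log-likelihood has converged
            break
        prev_ll = ll
    return w

def fuse(posteriors, w):
    """Combine unimodal posteriors with learned weights -> (N, C)."""
    return np.tensordot(w, posteriors, axes=1)
```

In this toy formulation, a modality whose classifier assigns consistently higher probability to the true labels receives a larger mixture weight, so the fused prediction `fuse(posteriors, w).argmax(axis=1)` leans on the more reliable modalities.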
The research was conducted in a team of two, and the results were published at an international peer-reviewed conference.
Publications:
M. Chatterjee*, S. Ghosh*, L. P. Morency, "A Multimodal Context-Based Approach for Distress Assessment", ACM Int'l Conf. on Multimodal Interaction, 2014 (ACM ICMI 2014).
Non-Peer Reviewed:
M. Chatterjee, "Probabilistic Multimodal Fusion Approaches for Recognition Tasks", Technical Report, 2015.
[* - indicates equal contribution]
Project Funding Support:
We are extremely grateful to the US Defense Advanced Research Projects Agency (DARPA) for funding this research.