Abstract

The Audio/Visual Emotion Challenge and Workshop (AVEC 2019), “State-of-Mind, Detecting Depression with AI, and Cross-cultural Affect”, was a satellite event of ACM MM 2019 (Nice, France, 21 October 2019) and the ninth competition aimed at comparing multimedia processing and machine learning methods for automatic audio, visual, and audio-visual health and emotion sensing, with all participants competing under strictly the same conditions.

The goal of the Challenge was to provide a common benchmark test set for multimodal information processing and to bring together the audio, visual, and audio-visual affect recognition communities in order to compare the relative merits of approaches to automatic health and emotion analysis under well-defined conditions. Another motivation was the need to advance health and emotion recognition systems so that they can deal with fully naturalistic behaviour in large volumes of unsegmented, non-prototypical, and non-preselected data, as this is exactly the type of data that both multimedia and human-machine/human-robot communication interfaces face in the real world.

We called for teams to participate in three Sub-Challenges:

State-of-Mind Sub-Challenge (SoMS)

The AVEC 2019 SoMS was a new task focusing on the continuous adaptation of human state-of-mind (SOM), which is pivotal for mental functioning and behaviour regulation. Human SOM constantly shifts due to internal and external stimuli, and habitual use of either adaptive or maladaptive SOM influences mental health. One key aspect of the human experience is our emotions, as they reflect our SOM. In the SoMS, self-reported mood (on a 10-point Likert scale) after the narration of personal stories (two positive and two negative) had to be predicted automatically from audio-visual recordings (USoM corpus). Performance was evaluated with the Concordance Correlation Coefficient (CCC).
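For readers unfamiliar with the CCC, the following is a minimal sketch of its standard definition, 2·cov(x, y) / (var(x) + var(y) + (mean(x) − mean(y))²), using population statistics. It is not the organisers' scoring script; the function name `concordance_cc` and the handling of the degenerate case are assumptions made for illustration.

```python
def concordance_cc(gold, pred):
    """Concordance Correlation Coefficient between two equal-length sequences.

    Illustrative sketch only; uses population (1/n) variance and covariance.
    """
    n = len(gold)
    if n == 0 or n != len(pred):
        raise ValueError("inputs must be non-empty and of equal length")

    mean_g = sum(gold) / n
    mean_p = sum(pred) / n

    var_g = sum((g - mean_g) ** 2 for g in gold) / n
    var_p = sum((p - mean_p) ** 2 for p in pred) / n
    cov = sum((g - mean_g) * (p - mean_p) for g, p in zip(gold, pred)) / n

    denom = var_g + var_p + (mean_g - mean_p) ** 2
    if denom == 0:
        # Both sequences are constant and identical: the CCC is undefined.
        raise ValueError("CCC is undefined for identical constant sequences")
    return 2 * cov / denom


if __name__ == "__main__":
    print(concordance_cc([1, 2, 3], [1, 2, 3]))  # perfect agreement
    print(concordance_cc([1, 2, 3], [2, 3, 4]))  # perfectly correlated, but shifted
```

Unlike Pearson's correlation, the CCC penalises a constant offset or a scale mismatch between predictions and labels, so the shifted example above scores below 1 even though the two sequences are perfectly correlated.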

Detecting Depression with AI Sub-Challenge (DDS)

The AVEC 2019 DDS was a major extension of the AVEC 2016 DSC, where the level of depression severity (PHQ-8 questionnaire) was assessed from audio-visual recordings of US Army veterans interacting with a virtual agent that conducted a clinical interview and was driven by a human in a Wizard-of-Oz setup (DAIC-WOZ corpus). The DAIC corpus contains new recordings in which the virtual agent is, this time, fully driven by artificial intelligence, i.e., without any human intervention. Those new recordings were used as the test partition for the DDS and helped to understand how the absence of a human operating the virtual agent affects automatic depression severity assessment. Performance was evaluated with the Concordance Correlation Coefficient (CCC).

Cross-cultural Emotion Sub-Challenge (CES)

The AVEC 2019 CES was a major extension of the AVEC 2018 CES, where dimensions of emotion were inferred from audio-visual recordings collected “in-the-wild”, i.e., with standard webcams at home or in the workplace, in a cross-cultural setting: from German culture to Hungarian culture (SEWA corpus). This dataset was extended to include data collected from new participants from Chinese culture, which were used as a test set for the AVEC 2019 CES to investigate how emotion knowledge of Western European cultures (German, Hungarian) can be transferred to the Chinese culture. Performance was evaluated with the Concordance Correlation Coefficient (CCC) on the Chinese culture, averaged over the emotional dimensions.
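The averaging step can be sketched as follows: compute one CCC per emotional dimension on the test data, then take the unweighted mean. This is an illustrative sketch under stated assumptions, not the official evaluation code; the helper names `ccc` and `mean_ccc`, the dimension labels in the usage example, and the choice of an unweighted mean are all assumptions.

```python
def ccc(gold, pred):
    """Concordance Correlation Coefficient (population statistics)."""
    n = len(gold)
    if n == 0 or n != len(pred):
        raise ValueError("inputs must be non-empty and of equal length")
    mg, mp = sum(gold) / n, sum(pred) / n
    vg = sum((g - mg) ** 2 for g in gold) / n
    vp = sum((p - mp) ** 2 for p in pred) / n
    cov = sum((g - mg) * (p - mp) for g, p in zip(gold, pred)) / n
    denom = vg + vp + (mg - mp) ** 2
    if denom == 0:
        raise ValueError("CCC is undefined for identical constant sequences")
    return 2 * cov / denom


def mean_ccc(per_dimension):
    """Unweighted mean of the CCC over all dimensions.

    per_dimension maps a dimension name to a (gold, pred) pair of sequences.
    """
    if not per_dimension:
        raise ValueError("at least one dimension is required")
    scores = {name: ccc(g, p) for name, (g, p) in per_dimension.items()}
    return sum(scores.values()) / len(scores), scores


if __name__ == "__main__":
    # Toy values; the dimension names are placeholders for illustration.
    toy = {
        "dimension_a": ([1, 2, 3], [1, 2, 3]),
        "dimension_b": ([1, 2, 3], [2, 3, 4]),
    }
    average, per_dim = mean_ccc(toy)
    print(per_dim, average)
```

Averaging gives each dimension equal weight, so a system that performs well on one dimension but poorly on another is not ranked above a more balanced one by that single strong result.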

Contributions

We encouraged contributions aiming at the highest performance w.r.t. the baselines provided by the organisers, and contributions aiming at new and interesting insights w.r.t. the topic of these challenges, especially:

Multimodal Affect Sensing

Audio-based Health/Emotion Recognition
Video-based Health/Emotion Recognition
Physiological-based Health/Emotion Recognition
Multimodal Representation Learning
Transfer Learning
Semi-supervised and Unsupervised Learning
Multi-view learning of Multiple Dimensions
Personalised Health/Emotion Recognition
Context in Health/Emotion Recognition
Multiple Rater Ambiguity and Asynchrony

Application

Multimedia Coding and Retrieval
Mobile and Online Applications

Prizes