The EVAC challenge relies on the recently introduced THERADIA WoZ dataset. The corpus consists of natural interactions in French involving healthy and Mild Cognitive Impairment (MCI) senior participants. The interactions revolve around Computerised Cognitive Training (CCT) exercises facilitated by a virtual assistant that is operated remotely by a human acting as a Wizard-of-Oz (WoZ). The participants' expressions were then fully transcribed and partially annotated with the dimensions of recent appraisal theory models, along with labels derived from the literature on achievement affects.
The interaction data were segmented according to variations in breath groups: breath groups that convey the same semantic information form a single proposition, while those that relate to different types of semantic information form separate propositions. The start and stop markers of each proposition were both positioned at a certain distance to allow sufficient observation space for annotation. All resulting audiovisual sequences were fully transcribed, and a fraction of those containing expressions of affect was selected for fine-grained annotation.
The dataset was annotated with both categorical affective labels and dimensions that are relevant in the context of AI-assisted healthcare. Among the 23 labels given to the annotators, which are derived from the literature on achievement affects, a core set of ten affective labels (five positive, five negative) was selected. These labels represent affects that were frequently annotated, with a high level of agreement between annotators, and that have been shown to be highly relevant in the context of AI-assisted CCT. The dimensions consist of the four most important cognitive appraisal criteria for affect recognition according to appraisal theory, namely novelty, intrinsic pleasantness, goal conduciveness, and coping, plus arousal. In the context of appraisal theory, arousal is a marker of the depth of the appraisal process, which is relevant for qualifying the intensity of the affective state. Dimensions were annotated both in continuous time and in a summarised form, while labels were annotated only in a summarised form, in terms of presence and intensity.
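To make the annotation scheme above more concrete, the following is a minimal sketch of how one annotated sequence could be represented in Python. It is not the dataset's actual file format: the class names, field names, the label name `example_label`, and the numeric values and scales are all hypothetical and chosen for illustration only. Only the five dimension names and the distinction between continuous and summarised annotations come from the description above.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# The five annotated dimensions named in the text (identifier spelling is an assumption).
DIMENSIONS = (
    "novelty",
    "intrinsic_pleasantness",
    "goal_conduciveness",
    "coping",
    "arousal",
)


@dataclass
class LabelAnnotation:
    """Summarised annotation of one affective label (hypothetical layout)."""
    present: bool
    intensity: float  # scale is an assumption, not documented here


@dataclass
class SequenceAnnotation:
    """Annotations attached to one audiovisual sequence (hypothetical layout)."""
    sequence_id: str
    # One summarised value per dimension.
    summary_dimensions: Dict[str, float]
    # One time series per dimension (continuous-time annotation).
    continuous_dimensions: Dict[str, List[float]]
    # Labels are annotated only in summarised form.
    labels: Dict[str, LabelAnnotation] = field(default_factory=dict)

    def has_all_dimensions(self) -> bool:
        """Check that both annotation forms cover exactly the five dimensions."""
        expected = set(DIMENSIONS)
        return (
            set(self.summary_dimensions) == expected
            and set(self.continuous_dimensions) == expected
        )


# Illustrative record with made-up values.
example = SequenceAnnotation(
    sequence_id="seq_0001",
    summary_dimensions={d: 0.0 for d in DIMENSIONS},
    continuous_dimensions={d: [0.0, 0.1, 0.2] for d in DIMENSIONS},
    labels={"example_label": LabelAnnotation(present=True, intensity=0.5)},
)
print(example.has_all_dimensions())
```

Keeping the continuous and summarised forms in separate fields mirrors the fact that dimensions carry both forms while labels carry only the summarised one.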
The dataset was divided into three partitions: training, validation, and test, consisting of 56%, 23%, and 21% of the participants, respectively. A total of 1,110 sequences with an overall duration of 2h44m were selected for training, 851 sequences with an overall duration of 1h51m for validation, and 774 sequences with an overall duration of 1h44m for testing. These partitions were carefully defined so that key statistics are preserved across the population groups, including the ratio of female to male participants, the ratio of healthy senior to MCI participants, the ratio of participants with affective induction to those without, and the distribution of education levels of the senior participants.
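As a quick sanity check on the figures above, the short Python sketch below derives the overall totals and each partition's share of sequences and of recorded time. All input numbers are taken from the paragraph above; the variable and function names are illustrative only.

```python
# Partition statistics as stated in the text:
# (number of sequences, duration in minutes).
partitions = {
    "training": (1110, 2 * 60 + 44),
    "validation": (851, 1 * 60 + 51),
    "test": (774, 1 * 60 + 44),
}

total_sequences = sum(n for n, _ in partitions.values())
total_minutes = sum(m for _, m in partitions.values())


def summarise(name):
    """Return sequence share (%), duration share (%), and mean sequence length (s)."""
    n, minutes = partitions[name]
    return {
        "sequence_share": round(100 * n / total_sequences, 1),
        "duration_share": round(100 * minutes / total_minutes, 1),
        "mean_seconds": round(60 * minutes / n, 1),
    }


for name in partitions:
    print(name, summarise(name))

hours, minutes = divmod(total_minutes, 60)
print(f"total: {total_sequences} sequences, {hours}h{minutes:02d}m")
```

The totals come to 2,735 sequences and 6h19m of annotated material. The shares of sequences per partition need not match the 56%, 23%, and 21% participant shares, presumably because the split is made at the participant level and participants contribute different numbers of sequences.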
To download the dataset, please refer to the instructions provided on the download page of the Theradia WoZ Corpus website.