REACT2025 Challenge Contest Rules
These are the official rules that govern how the third “Multiple Appropriate Facial Reaction Generation in Dyadic Interactions” challenge (REACT2025), held in conjunction with ACM Multimedia (ACM-MM) 2025 and henceforth simply referred to as the challenge, will operate.
1. Challenge Description
This is a skill-based contest and chance plays no part in the determination of the winner(s). Each participant team is encouraged to develop a Deep Learning framework that can generate multiple appropriate spatio-temporal facial reactions from each input speaker behaviour. Participants can choose to work on either Task 1, Task 2, or both tasks associated with this contest, as described below:
1.1 Task 1: Offline Multiple Appropriate Facial Reaction Generation
This task aims to develop a machine learning model that takes the entire speaker behaviour sequence as input and generates multiple appropriate and realistic/naturalistic spatio-temporal facial reactions, each consisting of AUs, facial expressions, and valence and arousal states representing the predicted facial reaction. In other words, multiple facial reactions are required to be generated for each input speaker behaviour.
Input: the multi-modal speaker behaviour clips.
Output: (i) multiple (by default ten) 25-channel time-series, each representing a predicted facial reaction, where each time-series consists of the occurrences (0 or 1) of 15 facial action units (i.e., AU1, AU2, AU4, AU6, AU7, AU9, AU10, AU12, AU14, AU15, AU17, AU23, AU24, AU25 and AU26), 2 facial affects, i.e., valence and arousal intensities (ranging from -1 to 1), and the probabilities (ranging from 0 to 1) of eight categorical facial expressions (i.e., Neutral, Happy, Sad, Surprise, Fear, Disgust, Anger and Contempt); and (ii) a visualisation of the generated facial reactions, i.e., 2D videos (frame-wise sequences) / 3D animations (3DMM-coefficient sequences).
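For concreteness, the sketch below shows one way a single Task 1 prediction could be laid out and range-checked as a NumPy array; the channel ordering, array shape and frame count are illustrative assumptions, not the official submission format.
======================================================
import numpy as np

# Assumed channel order (illustrative, not the official spec):
# 15 AU occurrences, then valence and arousal, then 8 expression probabilities.
AU_NAMES = ["AU1", "AU2", "AU4", "AU6", "AU7", "AU9", "AU10", "AU12",
            "AU14", "AU15", "AU17", "AU23", "AU24", "AU25", "AU26"]
EXPRESSIONS = ["Neutral", "Happy", "Sad", "Surprise",
               "Fear", "Disgust", "Anger", "Contempt"]

num_reactions = 10   # default number of generated reactions per speaker clip
num_frames = 750     # e.g. a 30-second clip at 25 fps (illustrative only)

# Hypothetical model output: (num_reactions, num_frames, 25)
predictions = np.zeros((num_reactions, num_frames, 25), dtype=np.float32)

# Range checks matching the task definition.
aus = predictions[..., :15]                # AU occurrences in {0, 1}
valence_arousal = predictions[..., 15:17]  # intensities in [-1, 1]
expr_probs = predictions[..., 17:]         # probabilities in [0, 1]

assert set(np.unique(aus)) <= {0.0, 1.0}
assert valence_arousal.min() >= -1 and valence_arousal.max() <= 1
assert expr_probs.min() >= 0 and expr_probs.max() <= 1
======================================================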
1.2 Task 2: Online Multiple Appropriate Facial Reaction Generation
This task aims to develop a machine learning model that generates each facial reaction frame online, i.e., using only the speaker behaviour expressed prior to and at the current time t.
Input: the speaker audio-visual behaviours expressed prior to and at the time t.
Output: (i) frame-level facial attributes (15-d AU occurrences, 2-d valence and arousal intensities, and the probabilities of 8-d categorical facial expressions) representing the predicted t-th facial reaction frame; (ii) once all frames of a speaker audio-visual clip have been fed to the model, all predicted facial reaction frames are combined into a 25-channel time-series representing the predicted facial reaction sequence, where ten facial reactions are required to be predicted for each speaker behaviour clip; and (iii) a visualisation of the generated facial reactions, i.e., 2D videos (frame-wise sequences) / 3D animations (3DMM-coefficient sequences).
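As an illustration of the online setting, the sketch below runs a hypothetical causal model frame by frame, feeding it only the speaker behaviour up to and including frame t; the model.step() interface and all names are assumptions, not part of the challenge API.
======================================================
import numpy as np

def generate_online(model, speaker_frames, speaker_audio, num_reactions=10):
    """Online (Task 2) inference: at time t, the model only sees speaker
    behaviour up to and including frame t (no look-ahead).
    `model` is a hypothetical object exposing a causal step() method."""
    num_frames = len(speaker_frames)
    # One 25-d facial-attribute vector per frame, per generated reaction.
    reactions = np.zeros((num_reactions, num_frames, 25), dtype=np.float32)
    for t in range(num_frames):
        past_video = speaker_frames[: t + 1]   # frames 0..t only
        past_audio = speaker_audio[: t + 1]
        for k in range(num_reactions):
            reactions[k, t] = model.step(past_video, past_audio, reaction_id=k)
    return reactions
======================================================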
1.3 Dataset Description
We divided the dataset into training, validation, and test sets following an approximate 60%/20%/20% split. Specifically, we split the data with a subject-independent strategy (i.e., the same subject never appears in both the training and test sets).
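The official splits are fixed and provided by the organisers; purely to illustrate the subject-independent principle, the sketch below groups hypothetical clips by subject ID with scikit-learn so that no subject appears in more than one partition.
======================================================
from sklearn.model_selection import GroupShuffleSplit

# Hypothetical clip identifiers and the subject ID of each clip.
clips = ["clip_000", "clip_001", "clip_002", "clip_003", "clip_004"]
subjects = ["S01", "S01", "S02", "S03", "S03"]

# Grouping by subject guarantees that no subject is shared across partitions.
splitter = GroupShuffleSplit(n_splits=1, test_size=0.4, random_state=0)
train_idx, holdout_idx = next(splitter.split(clips, groups=subjects))
print([clips[i] for i in train_idx], [clips[i] for i in holdout_idx])
======================================================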
1.3.1 Dataset Directory Structure: (training and validation sets are provided at this stage)
Explanation:
(1) the video-raw folder contains raw videos (at a resolution of 1920 × 1080)
(2) the video-face-crop folder contains face-cropped videos (at a resolution of 384 × 384)
(3) the facial-attributes folder contains sequences of frame-level 25-dimensional facial attributes (15 AU occurrences, valence and arousal intensities, and the probabilities of eight categorical facial expressions)
(4) the coefficients folder contains sequences of 58-dimensional 3DMM coefficients (52-d expression, 3-d rotation, and 3-d translation) extracted from the corresponding videos
(5) the audio folder contains wav files extracted from the raw video files (a loading sketch is given after the directory tree below)
======================================================
./data
├── train
│   ├── video-raw (.mp4)
│   │   ├── speaker
│   │   │   ├── session0
│   │   │   │   ├── Camera-2024-06-21-103121-103102.mp4
│   │   │   │   └── …
│   │   │   ├── session22
│   │   │   │   ├── Camera-2024-07-17-104338-104241.mp4
│   │   │   │   └── …
│   │   │   └── …
│   │   └── listener
│   │       ├── session0
│   │       │   ├── Camera-2024-06-21-103121-103102.mp4
│   │       │   └── …
│   │       ├── session22
│   │       │   ├── Camera-2024-07-17-104338-104241.mp4
│   │       │   └── …
│   │       └── …
│   ├── video-face-crop (.mp4)
│   ├── facial-attributes (.npy)
│   │   ├── speaker
│   │   │   ├── session0
│   │   │   │   ├── Camera-2024-06-21-103121-103102.npy
│   │   │   │   └── …
│   │   │   ├── session22
│   │   │   │   ├── Camera-2024-07-17-104338-104241.npy
│   │   │   │   └── …
│   │   │   └── …
│   │   └── listener
│   │       ├── session0
│   │       │   ├── Camera-2024-06-21-103121-103102.npy
│   │       │   └── …
│   │       ├── session22
│   │       │   ├── Camera-2024-07-17-104338-104241.npy
│   │       │   └── …
│   │       └── …
│   ├── coefficients (.npy)
│   └── audio (.wav)
│       ├── speaker
│       │   ├── session0
│       │   │   ├── Camera-2024-06-21-103121-103102.wav
│       │   │   └── …
│       │   └── …
│       └── listener
│           ├── session0
│           │   ├── Camera-2024-06-21-103121-103102.wav
│           │   └── …
│           └── …
├── val
└── test
======================================================
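Assuming the layout above (and that the coefficients folder mirrors the same speaker/listener/session structure), the sketch below loads one listener's facial-attribute and 3DMM-coefficient sequences and the matching speaker audio; the file name is the illustrative one from the tree, and the wav reader is only one possible choice.
======================================================
import numpy as np
import soundfile as sf  # assumption: any wav reader (e.g. scipy.io.wavfile) would do

root = "./data/train"
clip = "Camera-2024-06-21-103121-103102"

# 25-d frame-level facial attributes: 15 AUs, valence/arousal, 8 expression probabilities.
attrs = np.load(f"{root}/facial-attributes/listener/session0/{clip}.npy")
print(attrs.shape)   # expected (num_frames, 25)

# 58-d 3DMM coefficients: 52-d expression, 3-d rotation, 3-d translation.
coeffs = np.load(f"{root}/coefficients/listener/session0/{clip}.npy")
print(coeffs.shape)  # expected (num_frames, 58)

# Speaker audio from the same session/clip.
audio, sample_rate = sf.read(f"{root}/audio/speaker/session0/{clip}.wav")
======================================================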
1.3.2 Appropriate real facial reactions (Ground-Truths):
During data recording, the semantic contexts were carefully controlled through 23 distinct sessions (session0, session1, …, session22), each of which is guided by a few pre-defined sentences posed by the speaker. This provides a consistent session-specific context across dyadic interactions between different speakers and listeners. More specifically, for the speaker behaviour expressed in a specific session, we define all facial reactions expressed by different listeners in the same session as appropriate facial reactions (i.e., ground truths) for responding to it.
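In code, this session-based labelling amounts to grouping listener reactions by session; the sketch below is one illustrative way to build such a mapping over the directory layout in Section 1.3.1 (paths and helper names are assumptions).
======================================================
from collections import defaultdict
from pathlib import Path

def build_appropriateness_map(root="./data/train"):
    """Map each speaker facial-attribute file to the listener reactions recorded
    in the same session, which the challenge treats as its appropriate
    ground-truth facial reactions."""
    listener_by_session = defaultdict(list)
    for f in Path(root, "facial-attributes", "listener").glob("session*/*.npy"):
        listener_by_session[f.parent.name].append(f)

    appropriate = {}
    for f in Path(root, "facial-attributes", "speaker").glob("session*/*.npy"):
        appropriate[f] = listener_by_session[f.parent.name]
    return appropriate
======================================================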
1.4 Evaluation Metrics
Methods for both tasks will be evaluated based on the metrics described below. We follow [1] to evaluate four aspects of the facial reactions generated by participant models:
(i) their Appropriateness, measured by two metrics: the Dynamic Time Warping (DTW) distance and the Concordance Correlation Coefficient (CCC) between each generated facial reaction and its most similar appropriate real facial reaction, named FRDist and FRCorr, respectively;
(ii) their inter-condition and inter-frame Diversities, measured by the FRVar, FRDiv and FRDvs metrics defined in [1];
(iii) their Realism, measured by the Fréchet Inception Distance (FID) between the distribution of the generated facial reactions and the distribution of the corresponding appropriate real facial reactions (named FRRea);
(iv) their Synchrony with the corresponding speaker behaviour, measured by the Time Lagged Cross Correlation (TLCC) (named FRSyn).
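These metrics are formally defined in [1] and computed by the official evaluation code; purely as a reference point, the sketch below shows a Concordance Correlation Coefficient in the spirit of FRCorr, where the per-channel averaging and best-match aggregation are assumptions rather than the official implementation.
======================================================
import numpy as np

def ccc(x, y):
    """Concordance Correlation Coefficient between two 1-D sequences."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return 2 * cov / (vx + vy + (mx - my) ** 2)

def frcorr_like(pred, real_candidates):
    """Channel-mean CCC between one generated reaction (shape (T, 25)) and its
    most similar appropriate real reaction (illustrative aggregation only)."""
    scores = []
    for real in real_candidates:  # each real reaction: shape (T, 25)
        per_channel = [ccc(pred[:, c], real[:, c]) for c in range(pred.shape[1])]
        scores.append(float(np.mean(per_channel)))
    return max(scores)            # "most similar" appropriate real reaction
======================================================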
Participants are required to submit their developed models and weights. Specifically, during the development stage, participants need to submit their results; during the test stage, they need to submit their models and weights. The ranking of the submitted models competing in the challenge relies on two metrics, the appropriate facial reaction distance (FRDist) and the facial reaction diverseness (FRDiv), for both sub-challenges.
2. Tentative Contest Schedule
The registered participants will be notified by email of any change in the following tentative schedule. Please check the REACT2025 challenge website for updated information:
Launching Challenge website and call for participation poster: March 10, 2025
Registration open: March 10, 2025
Training and validation sets released: March 31, 2025
Baseline paper and code released: May 22, 2025
Model submission opening: May 26, 2025
Final result and model submission deadline: June 26, 2025
Paper submission deadline: June 30, 2025
Paper acceptance notification: July 24, 2025
Camera-ready paper submission deadline: August 26, 2025
Challenge workshop: October 2025 (TBD)
3. Eligibility
You are eligible to enter this contest if you meet the following requirements:
You are an individual or a team of people who wish to contribute to the tasks of the challenge and agree to follow its rules;
You are employed by a non-profit organisation or academic research institution;
You are not involved in any part of the administration and execution of this contest;
You are not an immediate family (parent, sibling, spouse, or child) or household member of a person involved in any part of the administration and execution of this contest.
This contest is void wherever prohibited by law. If you choose to submit an entry but are not qualified to enter the contest, your entry is voluntary and is governed by the remainder of these contest rules; the organisers of the challenge reserve the right to evaluate it for scientific purposes. If you are not qualified to submit a contest entry and still choose to submit one, under no circumstances will such an entry qualify for sponsored prizes, if any.
4. Entry
To be eligible for judging, an entry must meet the following content/technical components:
During the period of the challenge, participants are required to submit their results via email (development stage) and their code and trained models via email (test stage). At a later stage, defined in the competition schedule, they are required to share their code with complete instructions to enable reproducibility of the results. Participants are required to publicly release their code to be eligible as winners.
To participate, participants are required to fill in the registration form on the REACT 2025 official website.
The Multi-modal Multiple Appropriate Reaction in Social Dyads (MARS) dataset is the first multi-modal dataset specifically collected for MAFRG tasks. It comprises 137 human-human dyadic interaction audio-visual-EEG clips recorded from 23 speakers and 137 listeners. Participants will receive the EULAs and License after filling in the registration form on the REACT 2025 official website, along with instructions on how to submit the filled-in EULAs and License. As described in the EULAs and License, the data are available only for non-commercial research and educational purposes, within the scope of the challenge. Participants may only use the REACT 2025 data for the purpose of participating in this challenge. The copyright of the REACT 2025 data remains the property of its owners. By downloading and making use of the REACT 2025 data, you accept full responsibility for using the data and accept the rules specified in the EULAs and License of the underlying datasets. You shall defend and indemnify the challenge organisers and affiliated organisations against any and all claims arising from your use of the REACT 2025 data. You agree not to transfer, redistribute, or broadcast the REACT 2025 data or portions thereof in any way, and to comply with the EU/UK General Data Protection Regulations (GDPR). Users may use portions or the totality of the REACT 2025 data, provided they acknowledge such usage in their publications by citing the corresponding baseline papers provided on this page. By signing an EULA to access the dataset, you agree to strictly respect the conditions set therein.
Entries will be submitted online via email (code, weights, and results during the test stage). Participants will get quick feedback on the validation data released for practice during the development phase, and on the test results throughout the testing period. Keep in mind that performance on the test data will be re-examined during a code-verification step once the challenge is over. Additionally, the number of submissions per participant during the test stage is limited to three. Participants are not permitted to open more than one account to submit more than one entry. Any suspicious submissions that do not adhere to these rules may be excluded by the organisers. Only entries that pass code verification will be included in the final list of winning techniques.
For Task 2, participants are only allowed to use information from the past. Compliance will be checked during the code-verification stage, and solutions that break this rule will be disqualified.
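For transformer-style generators, one common way to guarantee that frame t never uses future frames is a causal (lower-triangular) attention mask; the sketch below is only an illustration of the kind of constraint the code-verification step looks for, not a required implementation.
======================================================
import numpy as np

def causal_mask(num_frames):
    """Boolean mask where mask[t, s] is True iff frame s may be attended to
    when generating frame t, i.e. only s <= t (no access to future frames)."""
    return np.tril(np.ones((num_frames, num_frames), dtype=bool))

mask = causal_mask(4)
# mask[1] == [ True,  True, False, False] -> frame 1 sees frames 0 and 1 only
======================================================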
We are not asserting any ownership rights over your entry other than what is stated below.
In exchange for the chance to participate in the competition and potential prize payouts, you're granting us an irrevocable, worldwide right and licence to:
Use, review, evaluate, test, and otherwise assess results provided or produced by your code and other materials provided by you in connection with this competition and any upcoming research or contests sponsored by us;
Accept to sign any paperwork that may be necessary for us and our designees to exercise the rights you grant above;
Use your entry and all of its content in connection with the marketing of this contest in all media (now known or subsequently developed);
If you do not want to grant us these rights to your entry, please do not enter this contest.
The competition winners will be chosen based on the test results and the code-verification score. We will nominate judges who are experts in causality, statistics, machine learning, computer vision, or related disciplines, as well as experts in challenge organisation. All judges will be prohibited from participating in the competition. A list of the judges will be provided on request. The judges will evaluate all qualifying submissions and choose up to three winners for each track based on the metrics defined in the Evaluation section, and will check that the winners followed the requirements.
We will contact the participants via email for any communications. Participants who have registered will receive notification via the email address they supplied upon registration if there are any changes to the data, schedule, participation instructions, or rules.
We reserve the right to cancel, modify, or suspend this contest if an unforeseeable or unexpected event (such as, but not limited to: cheating; a virus, bug, or catastrophic event corrupting data or the submission platform; someone discovering a flaw in the data or modalities of the challenge) affects the fairness and/or integrity of the contest. This is known as a "force majeure" event. This right is reserved regardless of whether the mistake was made by a human or a machine.
The personal data required to fill in the registration form will be stored and processed in accordance with the EU GDPR for the purpose of participating in the challenge; it is meant for internal use only and will not be shared with third parties. We will use this information to verify participants' eligibility and to contact them throughout the challenge period and subsequent workshop. The organisers will retain the provided information for as long as needed to proceed with the challenge and subsequent workshop.
Note that the participant data needed to request formal access to the underlying datasets is considered a different set of personal data from the personal data described above, and as such it follows different rules and a different lawful basis for data processing. The right to information regarding such data is described in the respective EULAs and/or Licence.
DISCLAIMER
ALL INFORMATION, SOFTWARE, DOCUMENTATION, AND DATA ARE PROVIDED “AS-IS”. THE ORGANIZERS DISCLAIM ANY EXPRESSED OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. IN NO EVENT SHALL CHALEARN AND/OR OTHER ORGANIZERS BE LIABLE FOR ANY SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF SOFTWARE, DOCUMENTS, MATERIALS, PUBLICATIONS, OR INFORMATION MADE AVAILABLE FOR THE CHALLENGE.