The Micro-Action Analysis Grand Challenge focuses on computer vision and machine learning methods for the automatic analysis of human behavior based on whole-body micro-actions, which are closely bound up with psychological, mental, and emotional states. Micro-actions are spontaneous body movements that reveal a person's true feelings and latent intentions, yet recognizing, distinguishing, and understanding them is challenging because they are subtle and, compared to normal actions, last only milliseconds. To address these challenges, we successfully organized the 1st Micro-Action Analysis Grand Challenge (MAC 2024@ACM MM 2024) and the 2nd Micro-Action Analysis Grand Challenge (MAC 2025@ACM MM 2025), which together attracted over 100 teams worldwide and demonstrated the growing interest in this emerging field. Building on this momentum, we are excited to announce the 3rd Micro-Action Analysis Grand Challenge (MAC 2026).
This challenge aims to foster innovative research in this emerging domain and provide benchmark evaluations to stimulate new approaches for utilizing whole-body micro-actions in human behavior understanding. Ultimately, our goal is to promote technological advancements in deep psychological assessment and emotional state analysis, and to inspire interdisciplinary collaboration within the research community.
MAC 2026 focuses on the recognition, detection, and understanding of micro-actions. The challenge aims to develop and benchmark models capable of human micro-action recognition (MAR), multi-label micro-action detection (MMAD), and fine-grained micro-action understanding (FMAU), in preparation for exploring the relationship between micro-actions and human emotions.
Micro-Action Recognition (MAR) aims to recognize and distinguish subtle body actions that typically occur in a brief instant. The MAR task is similar to conventional action recognition, as it involves using video instances as input and requires precise and efficient algorithms. However, it is uniquely complex due to the presence of low-amplitude fluctuations in gestures and postures.
Since human micro-actions co-occur, i.e., the same micro-action may repeat over time and different micro-actions may occur simultaneously, Multi-label Micro-Action Detection (MMAD) is necessary for a deeper understanding of human bodily behavior. MMAD is the task of identifying and localizing all micro-actions in a given untrimmed, densely annotated video, determining their start and end times as well as their categories. The task takes an entire video as input and requires a model that accurately captures both long-term and short-term action relationships to detect and localize multiple micro-actions. Designing a model for MMAD is especially challenging because of the brief duration and small magnitude of micro-actions.
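To make the detection setting concrete, the sketch below shows how predicted temporal segments might be matched to ground truth via temporal intersection-over-union (tIoU), a standard criterion in temporal action detection. The `(start, end, label)` segment format and the greedy matching scheme are illustrative assumptions, not the official MMAD evaluation protocol.

```python
# Hypothetical sketch: matching predicted micro-action segments to ground truth
# with temporal IoU. Segment format (start, end, label) is an assumption,
# not the official MMAD annotation format.

def temporal_iou(seg_a, seg_b):
    """Temporal intersection-over-union of two (start, end) intervals."""
    start_a, end_a = seg_a
    start_b, end_b = seg_b
    inter = max(0.0, min(end_a, end_b) - max(start_a, start_b))
    union = (end_a - start_a) + (end_b - start_b) - inter
    return inter / union if union > 0 else 0.0

def match_detections(preds, gts, iou_thresh=0.5):
    """Greedily match each prediction to an unused ground-truth segment
    of the same label whose tIoU meets the threshold.

    preds, gts: lists of (start, end, label). Returns the true-positive count.
    """
    used = set()
    tp = 0
    for p_start, p_end, p_label in preds:
        best, best_iou = None, iou_thresh
        for i, (g_start, g_end, g_label) in enumerate(gts):
            if i in used or g_label != p_label:
                continue
            iou = temporal_iou((p_start, p_end), (g_start, g_end))
            if iou >= best_iou:
                best, best_iou = i, iou
        if best is not None:
            used.add(best)
            tp += 1
    return tp
```

In practice, challenge leaderboards typically average such matching over several tIoU thresholds (e.g., 0.2 to 0.7) to produce a mean average precision score.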
Inspired by the rapid advancement of multimodal large language models (MLLMs), which demonstrate strong capabilities in visual understanding and reasoning, Fine-grained Micro-Action Understanding (FMAU) aims to evaluate whether MLLMs can perceive, compare, and reason about subtle micro-actions. Formulated as a video question answering task, it takes a micro-action video and a query as input, requiring answers in multiple formats (e.g., multiple-choice, Yes/No, or open-ended). The track is further organized into three tiers—perceptual recognition, relational comprehension, and interpretive reasoning—and is the most challenging of the three, as it requires deeper reasoning over subtle motion cues and complex dependencies.
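To illustrate the three FMAU tiers and answer formats described above, the sketch below shows one hypothetical query per tier. All field names, file names, and question wordings are illustrative assumptions, not the official annotation schema.

```python
# Hypothetical FMAU-style queries, one per tier. Field names and content
# are illustrative assumptions, not the official challenge schema.

queries = [
    {   # Tier 1, perceptual recognition: multiple-choice
        "video": "clip_0001.mp4",
        "question": "Which micro-action occurs in this clip?",
        "options": ["scratching head", "shaking legs", "touching nose"],
        "answer_type": "multiple-choice",
    },
    {   # Tier 2, relational comprehension: Yes/No
        "video": "clip_0002.mp4",
        "question": "Does the head movement occur before the hand movement?",
        "answer_type": "yes/no",
    },
    {   # Tier 3, interpretive reasoning: open-ended
        "video": "clip_0003.mp4",
        "question": "What might the repeated leg shaking suggest "
                    "about the person's emotional state?",
        "answer_type": "open-ended",
    },
]
```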
Mar. 7, 2026: Our Grand Challenge proposal has been accepted at ACM MM 2026!
Contact information is listed as follows:
• For general questions, please get in touch with guodan@hfut.edu.cn
• For questions regarding the challenge, please contact both kunli.hfut@gmail.com and guodan@hfut.edu.cn