This ICMI 2025 workshop seeks to establish an international research platform to investigate the impact of linguistic and cultural differences on nonverbal behavior and their effects on communication dynamics. Moving beyond merely identifying nonverbal behavior patterns in specific cultural contexts, the workshop aims to uncover the mechanisms behind adaptation, change, and misunderstanding in intercultural interactions. The first year will focus on data-related challenges, such as collecting and annotating high-quality data across different regions. While advances in sensor technology, machine learning, and Large Language Models (LLMs) have been applied to linguistic diversity, their use in nonverbal communication remains underexplored. Given the known cultural variations in gestures, facial expressions, and turn-taking, integrating insights from humanities research with multimodal analysis is crucial. As LLMs continue to shape human-machine interactions globally, understanding and incorporating cultural differences in nonverbal behavior is an urgent and significant research challenge.
The workshop program has been released 👍 (10th September 2025)
We have extended the paper deadline until July 12th (9th July 2025)
The second invited talk has been confirmed: Yukiko I. Nakano from Seikei University 🎉 (4th July 2025)
Submission instructions have been updated ✍️ (23rd May 2025)
The first invited talk has been confirmed: Abhinav Dhall from Monash University 🎉 (14th May 2025)
The workshop has been accepted for ICMI 2025! See you in Canberra🙋 (10th March 2025)
Topics of interest include (but are not limited to) the following:
Cross-cultural differences in nonverbal communication
Data collection methodologies for intercultural interaction
Annotation schema for multimodal data
Cultural variations in gestures, facial expressions, back-channeling, and turn-taking
Integration of humanities research insights into multimodal analysis
Applications of Large Language Models (LLMs) to nonverbal communication
Multimodal machine learning and signal processing methods for detecting cultural differences
Cross-linguistic differences and universals in turn-taking behavior
Technological innovations supporting intercultural human-machine interaction
Interdisciplinary research bridging humanities, social sciences, and multimodal computing
Standardized protocols for intercultural multimodal data collection
Sharing and harmonizing multimodal datasets and analysis techniques across international research groups
Paper Submission July 12th, 2025 (extended from July 7th)
Paper Notification August 7th, 2025
Camera Ready August 24th, 2025
Workshop Day October 13th, 2025
We invite both long papers (up to 8 pages) and short papers (up to 4 pages), all formatted in the double-column ACM conference style used by ICMI 2025. Pages consisting solely of references do not count toward the page limit for either paper type. All submissions must be prepared for double-blind review and submitted as PDF files via OpenReview.
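As a rough starting point for manuscript preparation, a minimal LaTeX skeleton is sketched below. It assumes the standard ACM acmart class with its built-in double-blind options (review, anonymous); these class options are an assumption on our part, so please confirm the exact template against the official ICMI 2025 author guidelines linked below.

```latex
% Minimal double-blind submission skeleton (a sketch, not the official
% template): assumes the standard ACM acmart class. Verify the class
% options against the ICMI 2025 author guidelines.
\documentclass[sigconf,review,anonymous]{acmart}

\begin{document}

\title{Your Paper Title}

% The `anonymous' option suppresses author names in the PDF;
% placeholders keep the source compilable.
\author{Anonymous Author(s)}
\affiliation{%
  \institution{Anonymous Institution}
  \country{Country}}

% acmart expects the abstract before \maketitle.
\begin{abstract}
A short summary of the paper.
\end{abstract}

\maketitle

\section{Introduction}
Body text. Long papers may use up to 8 pages and short papers up to 4,
excluding pages that contain only references.

\bibliographystyle{ACM-Reference-Format}
\bibliography{references}

\end{document}
```

The review option also numbers the lines of the compiled PDF, which makes it easier for reviewers to reference specific passages.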
Submission Guidelines for Authors (ICMI 2025):
https://icmi.acm.org/2025/guidelines/
Submission Site:
https://openreview.net/group?id=ACM.org/ICMI/2025/Workshop/CCMI
You will need an OpenReview account, and we strongly recommend creating one as soon as possible if you haven't already.
Please note the following:
New profiles created without an institutional email will go through a moderation process that can take up to two weeks.
New profiles created with an institutional email will be activated automatically.
Cross-cultural studies on human-human and human-agent interaction
Cultural diversity in communication has been studied across various disciplines, including communication science, linguistics, and social science. However, conducting cross-cultural studies of multimodal interaction requires more than simply collecting diverse communication data; the data collection itself must be carefully designed. In this talk, I will introduce methods for collecting cross-culturally comparable data and present findings that reveal cultural differences in human communication. In the context of human-agent interaction, understanding cultural differences is important for designing culturally adaptive and socially acceptable agents (e.g., virtual agents, robots). I will also showcase how people from different cultural backgrounds perceive agent behaviors. Finally, I will discuss how theories and definitions of culture can serve as a foundation for research on cross-cultural multimodal interaction.
Multimodal Deepfake Detection Across Cultures and Languages
The growing accessibility of Generative AI-based image and video manipulation tools has made the creation of deepfakes easier. This poses significant challenges for content verification and can spread misinformation. In this talk, we explore multimodal approaches, inspired by user behavior, for detecting and localizing manipulations in time. A key focus of our work is on the multilingual and multicultural aspects of deepfake detection. Our research draws on user studies, including those focusing on multicultural deepfakes, which provide insights into how different audiences perceive and interact with manipulated media. We also discuss findings from the ACM Multimedia 2025 One Million Deepfakes Detection benchmark. These insights suggest directions for future work on deepfake analysis in globally diverse contexts.
October 13th
13:15-13:20 Opening and Welcome
13:20-13:55 Session 1 (Chair: Koji Inoue)
13:20-13:40 Benchmarking Visual Generative Models through Cultural Lens: A Case Study with Singapore-Centric Multi-Cultural Context (Ali Koksal, Loke Mei Hwan, Hui Li Tan, Nancy F. Chen)
13:40-13:55 Culture-Aware Multimodal Personality Prediction using Audio, Pose, and Cultural Embeddings
13:55-14:30 Invited Talk 1: Multimodal Deepfake Detection Across Cultures and Languages (Abhinav Dhall)
14:30-15:00 Break
15:00-16:10 Session 2 (Chair: Shogo Okada)
15:00-15:20 Multimodal grounding in HRI using two types of nods in Japanese and Finnish (Taiga Mori, Kristiina Jokinen, Leo Huovinen, Biju Thankachan)
15:20-15:40 Analyzing Multimodal Multifunctional Interactions in Multiparty Conversations via Functional Spectrum Factorization (Momoka Tajima, Issa Tamura, Kazuhiro Otsuka)
15:40-15:55 MultiGen: Child-Friendly Multilingual Speech Generator with LLMs (Xiaoxue Gao, Huayun Zhang, Nancy F. Chen)
15:55-16:10 Contextualized Visual Storytelling for Conversational Chatbot in Education (Hui Li Tan, Gu Ying, Liyuan Li, Mei Chee Leong, Nancy F. Chen)
16:20-16:55 Invited Talk 2: Cross-cultural studies on human-human and human-agent interaction (Yukiko I. Nakano)
16:55-17:25 Panel Discussion (Yukiko Nakano, Abhinav Dhall, Liu Zhengyuan, Shogo Okada)
17:25-17:30 Closing
Note:
Long paper: 15 min. for presentation + 5 min. for QA
Short paper: 10 min. for presentation + 5 min. for QA
The main contact address of the workshop is: ccmi-organizer@googlegroups.com
Mikey Elmers (Kyoto University, Japan)
Ryo Ishii (NTT, Japan)
Kristiina Jokinen (AIST, Japan)
Dimosthenis Kontogiorgos (Massachusetts Institute of Technology, USA)
Jauwairia Nasir (University of Augsburg, Germany)
Hiroki Tanaka (International Christian University, Japan)
Hung-Hsuan Huang (The University of Fukuchiyama, Japan)
Sixia Li (Japan Advanced Institute of Science and Technology, Japan)
Maha Elgarf (New York University, Abu Dhabi, UAE)
Shun Katada (Wakayama University, Japan)
Wenqing Wei (Japan Advanced Institute of Science and Technology, Japan)