This ICMI 2025 workshop seeks to establish an international research platform to investigate the impact of linguistic and cultural differences on nonverbal behavior and their effects on communication dynamics. Moving beyond merely identifying nonverbal behavior patterns in specific cultural contexts, the workshop aims to uncover the mechanisms behind adaptation, change, and misunderstanding in intercultural interactions. The first year will focus on data-related challenges, such as collecting and annotating high-quality data across different regions. While advances in sensor technology, machine learning, and Large Language Models (LLMs) have been applied to linguistic diversity, their use in nonverbal communication remains underexplored. Given the known cultural variations in gestures, facial expressions, and turn-taking, integrating insights from humanities research with multimodal analysis is crucial. As LLMs continue to shape human-machine interactions globally, understanding and incorporating cultural differences in nonverbal behavior is an urgent and significant research challenge.
The workshop program has been released 👍 (10th September 2025)
We have extended the paper deadline until July 12th (9th July 2025)
The second invited talk has been confirmed: Yukiko I. Nakano from Seikei University 🎉 (4th July 2025)
Submission instructions have been updated ✍️ (23rd May 2025)
The first invited talk has been confirmed: Abhinav Dhall from Monash University 🎉 (14th May 2025)
The workshop has been accepted for ICMI 2025! See you in Canberra🙋 (10th March 2025)
Topics of interest include (but are not limited to) the following:
Cross-cultural differences in nonverbal communication
Data collection methodologies for intercultural interaction
Annotation schema for multimodal data
Cultural variations in gestures, facial expressions, back-channeling, and turn-taking
Integration of humanities research insights into multimodal analysis
Applications of Large Language Models (LLMs) to nonverbal communication
Multimodal machine learning and signal processing methods for detecting cultural differences
Cross-linguistic differences and universals in turn-taking behavior
Technological innovations supporting intercultural human-machine interaction
Interdisciplinary research bridging humanities, social sciences, and multimodal computing
Standardized protocols for intercultural multimodal data collection
Sharing and harmonizing multimodal datasets and analysis techniques across international research groups
Paper Submission July 12th, 2025 (extended from July 7th)
Paper Notification August 7th, 2025
Camera Ready August 24th, 2025
Workshop Day October 13th, 2025
We invite both long papers (up to 8 pages) and short papers (up to 4 pages), all formatted in the double-column ACM conference style used by ICMI 2025. Pages consisting solely of references do not count toward the page limit for either paper type. All submissions must be prepared for double-blind review and submitted as PDF files via OpenReview.
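As a rough starting point for manuscript preparation, a minimal LaTeX skeleton is sketched below. It assumes the standard ACM acmart class with its built-in double-blind options (review, anonymous); these class options are an assumption on our part, so please confirm the exact template against the official ICMI 2025 author guidelines linked below.

```latex
% Minimal double-blind submission skeleton (a sketch, not the official
% template): assumes the standard ACM acmart class. Verify the class
% options against the ICMI 2025 author guidelines.
\documentclass[sigconf,review,anonymous]{acmart}

\begin{document}

\title{Your Paper Title}

% The `anonymous' option suppresses author names in the PDF;
% placeholders keep the source compilable.
\author{Anonymous Author(s)}
\affiliation{%
  \institution{Anonymous Institution}
  \country{Country}}

% acmart expects the abstract before \maketitle.
\begin{abstract}
A short summary of the paper.
\end{abstract}

\maketitle

\section{Introduction}
Body text. Long papers may use up to 8 pages and short papers up to 4,
excluding pages that contain only references.

\bibliographystyle{ACM-Reference-Format}
\bibliography{references}

\end{document}
```

The review option also numbers the lines of the compiled PDF, which makes it easier for reviewers to reference specific passages.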
Submission Guidelines for Authors (ICMI 2025):
https://icmi.acm.org/2025/guidelines/
Submission Site:
https://openreview.net/group?id=ACM.org/ICMI/2025/Workshop/CCMI
You will need an OpenReview account, and we strongly recommend creating one as soon as possible if you haven't already.
Please note the following:
New profiles created without an institutional email will go through a moderation process that can take up to two weeks.
New profiles created with an institutional email will be activated automatically.
Cross-cultural studies on human-human and human-agent interaction
Cultural diversity in communication has been studied across various disciplines, including communication science, linguistics, and social science. However, conducting cross-cultural studies of multimodal interaction requires more than simply collecting diverse communication data; the data collection itself must be carefully designed. In this talk, I will introduce methods for collecting cross-culturally comparable data and present findings that reveal cultural differences in human communication. In the context of human-agent interaction, understanding cultural differences is important for designing culturally adaptive and socially acceptable agents (e.g., virtual agents, robots). I will also showcase how people from different cultural backgrounds perceive agent behaviors. Finally, I will discuss how theories and definitions of culture can serve as a foundation for research on cross-cultural multimodal interaction.
Multimodal Deepfake Detection Across Cultures and Languages
The growing accessibility of Generative AI-based image and video manipulation tools has made the creation of deepfakes easier. This poses significant challenges for content verification and can spread misinformation. In this talk, we explore multimodal approaches, inspired by user behavior, for detecting and localizing manipulations in time. A key focus of our work is on the multilingual and multicultural aspects of deepfake detection. Our research draws on user studies, including those focusing on multicultural deepfakes, which provide insights into how different audiences perceive and interact with manipulated media. We also discuss findings from the ACM Multimedia 2025 One Million Deepfakes Detection benchmark. These insights suggest directions for future work on deepfake analysis in globally diverse contexts.
October 13th
13:15-13:20 Opening and Welcome
13:20-13:55 Session 1 (Chair: Koji Inoue)
13:20-13:40 Benchmarking Visual Generative Models through Cultural Lens: A Case Study with Singapore-Centric Multi-Cultural Context (Ali Koksal, Loke Mei Hwan, Hui Li Tan, Nancy F. Chen)
13:40-13:55 Culture-Aware Multimodal Personality Prediction using Audio, Pose, and Cultural Embeddings
13:55-14:30 Invited Talk 1: Multimodal Deepfake Detection Across Cultures and Languages (Abhinav Dhall)
14:30-15:00 Break
15:00-16:10 Session 2 (Chair: Shogo Okada)
15:00-15:20 Multimodal grounding in HRI using two types of nods in Japanese and Finnish (Taiga Mori, Kristiina Jokinen, Leo Huovinen, Biju Thankachan)
15:20-15:40 Analyzing Multimodal Multifunctional Interactions in Multiparty Conversations via Functional Spectrum Factorization (Momoka Tajima, Issa Tamura, Kazuhiro Otsuka)
15:40-15:55 MultiGen: Child-Friendly Multilingual Speech Generator with LLMs (Xiaoxue Gao, Huayun Zhang, Nancy F. Chen)
15:55-16:10 Contextualized Visual Storytelling for Conversational Chatbot in Education (Hui Li Tan, Gu Ying, Liyuan Li, Mei Chee Leong, Nancy F. Chen)
16:20-16:55 Invited Talk 2: Cross-cultural studies on human-human and human-agent interaction (Yukiko I. Nakano)
16:55-17:25 Panel Discussion (Yukiko Nakano, Abhinav Dhall, Liu Zhengyuan, Shogo Okada)
17:25-17:30 Closing
Note:
Long paper: 15 min. for presentation + 5 min. for QA
Short paper: 10 min. for presentation + 5 min. for QA
The main contact address of the workshop is: ccmi-organizer@googlegroups.com
Mikey Elmers (Kyoto University, Japan)
Ryo Ishii (NTT, Japan)
Kristiina Jokinen (AIST, Japan)
Dimosthenis Kontogiorgos (Massachusetts Institute of Technology, USA)
Jauwairia Nasir (University of Augsburg, Germany)
Hiroki Tanaka (International Christian University, Japan)
Hung-Hsuan Huang (The University of Fukuchiyama, Japan)
Sixia Li (Japan Advanced Institute of Science and Technology, Japan)
Maha Elgarf (New York University, Abu Dhabi, UAE)
Shun Katada (Wakayama University, Japan)
Wenqing Wei (Japan Advanced Institute of Science and Technology, Japan)