1st Workshop on
Crossmodal Social Animation

The XS-Anim workshop is organized in conjunction with

ICCV 2021 - International Conference on Computer Vision, Montreal, Canada, October 11th - October 17th, 2021

The workshop will be held virtually

**New**

Find the video recording of the workshop here: video

Overview

Computer vision has seen an extraordinary level of innovation in the past decade, with the advent of deep neural architectures. This progress was particularly impressive in the domain of generative modeling, such as automatic generation of images. The next frontier for generative computer vision models is to also model human motion, and its close relationship with language and speech. Generating videos is already a big technical challenge, but when it comes to animating human motion, the level of complexity and minutiae is particularly high given our years of experience interacting with other people. This animation generation challenge becomes even more interesting given its multimodal nature: human motion is a communicative channel linked to other modalities such as language and speech. It is a timely opportunity to study this topic of crossmodal social animation. The study of crossmodal social animation will require both data-driven empirical modeling and the integration of social and communication theories. It also requires a multidisciplinary approach, inviting researchers from computer vision, computer graphics, social robotics, virtual reality and human social communication. This workshop is a unique occasion to study crossmodal factors involved in naturalistic and engaging body motion generation.

We are in the middle of a revolution in artificial intelligence in which new technologies are becoming more interactive (e.g., virtual assistants such as Alexa, Google Assistant, Siri, and Cortana). The next generation of these interactive technologies is likely to present some embodiment, such as a virtual character or a social robot. It is therefore essential to better understand the link between human body motion and other communicative channels such as language and speech. This research has the potential to enable more realistic and engaging social interactions, which are central to sharing knowledge and ideas and are an important part of successful collaboration and teamwork. Furthermore, it encourages the study of effective communication by intelligent tutoring systems in classroom settings, of empathy in clinical psychology, and of tools to aid animation generation. It is also a key building block for forging new relationships through self-expression as well as understanding others' emotions and thoughts.

Invited Speakers

Yaser Sheikh

Facebook Reality Labs, Carnegie Mellon University, USA

Maja Matarić

University of Southern California, USA

Stacy Marsella

Northeastern University, USA

Hae Won Park

MIT Media Lab, MIT, USA

Richard Bowden

University of Surrey, UK

Workshop Schedule

Date: October 16, 2021

12:00 pm - 12:15 pm Introduction and Opening Remarks (Chaitanya Ahuja)

Invited Speakers - Session 1

12:15 pm - 01:05 pm Yaser Sheikh (video)

Telepresence with codec avatars

01:05 pm - 01:55 pm Richard Bowden (video)

Towards Computational Sign Language Translation

Spotlight Talks (video)

02:00 pm - 02:15 pm Shyam Krishna, Vijay Vignesh P, Dinesh Babu J

SignPose: Sign Language Animation Through 3D Pose Lifting

02:15 pm - 02:30 pm Xiaopeng Lu, Zhen Fan, Yansen Wang, Jean Oh, Carolyn Rosé

Localize, Group, and Select: Boosting Text-VQA by Scene Text Modeling

02:30 pm - 02:45 pm Jonathan Windle, Sarah Taylor, David Greenwood, Iain Matthews

Motion Symmetry in Conversation

02:45 pm - 03:00 pm Jón Helgason, Johann Skulason, Anna Islind, Steinunn Sigurðardóttir, Hannes Vilhjálmsson

Integrating Video with Artificial Gesture

Invited Speakers - Session 2

03:00 pm - 03:50 pm Maja Matarić (video)

Multimodal Human-Robot Interaction: Understanding, Engaging, and Supporting Each User

03:50 pm - 04:40 pm Hae Won Park (video)

Long-term Relational Agents - Design and Impact

04:40 pm - 05:30 pm Stacy Marsella (video)

Gestures: Some thoughts on form, function and aesthetics

*All times in EST

Call for Papers

Topics for submission include (but are not limited to):

(1) Generative animation models

Generative modeling of human body motion
Facial motion and facial expression generation and representation
Body (including hands, arms, head, eye-gaze) shape and motion representation
Multi-party social interactions and generative models

(2) Vision-Language-Speech Grounding

Natural language grounding with human body motion
Grounding of speech and acoustic signals
Co-speech gesture grounding modeling
Multimodal and multi-party grounding

(3) Gesture and Animation Styles

Style content disentanglement of grounded body motion
Style transfer for grounded body motion

(4) Privacy and ethical issues

Detecting biases in generated animations
Detecting fake animations
Ramifications of socially adept virtual agents/robots in societies

(5) Data Efficiency and Resources

Few-shot generative modeling and domain transfer of animation models
Semi-supervised or self-supervised generative animation modeling
Body motion corpora, including diverse speakers, styles and topics
Semi-automatic corpora annotation tools

(6) Application domains

Embodied agents, including robots and virtual humans
Social Interaction in Virtual and Augmented Reality
Sign-Language generation
Locomotion modeling and animation
Rhythmic body motion animation (e.g., dance)

Submission Guidelines

The format for paper submission is the same as the ICCV 2021 submission format. Papers that violate anonymity or do not use the ICCV submission template will be rejected without review. Papers will be selected based on relevance, significance and novelty of results, technical merit, and clarity of presentation. In submitting a manuscript to this workshop, the authors acknowledge that no paper substantially similar in content has been submitted to another workshop or conference during the review period.

Main Track*

8 pages (excluding references)

*Accepted papers will appear in the proceedings of ICCV 2021 workshops

Late Breaking Results

4 pages (excluding references)

Submission Portal

Submission Format