IMUSIC: IMU-based Facial Expression Capture

Youjia Wang1,2    Yiwen Wu1,2  Ruiqian Li1 Hengan Zhou1,2  Hongyang Lin1,3

Yingwenqi Jiang1  Yingsheng Zhu1  Guanpeng Long1,4  Jingya Wang1  Lan Xu1  Jingyi Yu1

1ShanghaiTech University       2LumiAni Technology        3Deemos Technology        4ElanTech Co., Ltd

Abstract

For facial motion capture and analysis, the dominant solutions are generally based on visual cues, which cannot protect privacy and are vulnerable to occlusions. Inertial measurement units (IMUs) offer a potential remedy, yet have mainly been adopted for full-body motion capture. In this paper, we propose IMUSIC to fill this gap: a novel path for facial expression capture using purely IMU signals, significantly distinct from previous visual solutions. The key design in our IMUSIC is a trilogy. We first design micro-IMUs suited to facial capture, accompanied by an anatomy-driven IMU placement scheme. Then, we contribute a novel IMU-ARKit dataset, which provides rich paired IMU/visual signals for diverse facial expressions and performances. Such unique multi-modality brings huge potential for future directions such as IMU-based facial behavior analysis. Moreover, utilizing IMU-ARKit, we introduce a strong baseline approach to accurately predict facial blendshape parameters from purely IMU signals. Specifically, we tailor a Transformer diffusion model with a two-stage training strategy for this novel tracking task. The IMUSIC framework enables accurate facial capture in scenarios where visual methods falter, while simultaneously safeguarding user privacy. We conduct extensive experiments on both the IMU configuration and the technical components to validate the effectiveness of our IMUSIC approach. Notably, IMUSIC enables various novel applications, e.g., privacy-preserving facial capture, hybrid capture against occlusions, and detecting minute facial movements that are often invisible to visual cues. We will release our dataset and implementations to enrich the possibilities of facial capture and analysis in our community.


Overview

We first introduce the hardware design and the data acquisition pipeline, then describe the data calibration process and the methodology for recovering facial motion from IMU signals. Finally, we demonstrate the effectiveness of IMUSIC through various applications, underlining its precision and portability.
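To give a concrete sense of how a diffusion model can recover blendshape parameters from IMU signals, the sketch below implements a generic DDPM-style reverse sampling loop conditioned on IMU features. All names and numbers here (`dummy_denoiser`, the dimensions, the noise schedule) are illustrative assumptions, not the paper's actual Transformer architecture or two-stage training recipe.

```python
import numpy as np

# Illustrative dimensions (assumptions, not from the paper):
# IMU_DIM channels of IMU readings per frame, BS_DIM ARKit-style blendshapes.
IMU_DIM, BS_DIM, T_STEPS = 66, 52, 50

# Standard linear DDPM noise schedule (an assumption for this sketch).
betas = np.linspace(1e-4, 0.02, T_STEPS)
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)

def dummy_denoiser(x_t, t, imu_feat):
    """Placeholder for the learned Transformer denoiser.

    A trained network would predict the noise in x_t conditioned on the
    IMU feature sequence; here it returns zeros purely for illustration.
    """
    return np.zeros_like(x_t)

def ddpm_sample(imu_feat, rng):
    """Reverse diffusion: start from Gaussian noise and iteratively
    denoise into a blendshape vector conditioned on IMU features."""
    x = rng.standard_normal(BS_DIM)
    for t in reversed(range(T_STEPS)):
        eps = dummy_denoiser(x, t, imu_feat)
        coef = betas[t] / np.sqrt(1.0 - alpha_bars[t])
        x = (x - coef * eps) / np.sqrt(alphas[t])
        if t > 0:  # add noise at every step except the last
            x += np.sqrt(betas[t]) * rng.standard_normal(BS_DIM)
    # Blendshape weights are conventionally clipped to [0, 1].
    return np.clip(x, 0.0, 1.0)

rng = np.random.default_rng(0)
imu_feat = rng.standard_normal(IMU_DIM)  # one frame of IMU readings
weights = ddpm_sample(imu_feat, rng)
print(weights.shape)  # (52,)
```

In practice the denoiser would attend over a temporal window of IMU readings rather than a single frame, which is what makes a Transformer backbone a natural fit for this tracking task.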

Results

Citation

@misc{wang2024imusic,
      title={IMUSIC: IMU-based Facial Expression Capture},
      author={Youjia Wang and Yiwen Wu and Ruiqian Li and Hengan Zhou and Hongyang Lin and Yingwenqi Jiang and Yingsheng Zhu and Guanpeng Long and Jingya Wang and Lan Xu and Jingyi Yu},
      year={2024},
      eprint={2402.03944},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}