Skeleton2Mesh: Kinematics Prior Injected Unsupervised Human Mesh Recovery

ABSTRACT

In this paper, we decouple unsupervised human mesh recovery into the well-studied problem of unsupervised 3D pose estimation and the problem of human mesh recovery from estimated 3D skeletons, and we focus on the latter task. The challenges of the latter task are twofold: (1) pose failure, which covers pose mismatching (the gap between 3D pose estimation and human mesh recovery) and pose ambiguity (the lack of kinematics constraints on endpoint joints); and (2) shape ambiguity (the lack of shape constraints on body configuration). To address these three issues, we propose Skeleton2Mesh, a novel lightweight framework that recovers human mesh from a single image. Skeleton2Mesh contains three modules corresponding to these issues: the Differentiable Inverse Kinematics (DIK), Pose Refinement (PR), and Shape Refinement (SR) modules. DIK is designed to transfer 3D rotations from estimated 3D skeletons, relying on a minimal set of kinematics prior knowledge (i.e., skeletal joint connectivity information). The PR and SR modules then tackle pose ambiguity and shape ambiguity, respectively. All three modules are incorporated into Skeleton2Mesh seamlessly in an end-to-end manner. Furthermore, we utilize an adaptive joint regressor to alleviate the effects of differing skeletal topologies across datasets. Results on the Human3.6M dataset for human mesh recovery demonstrate that our method improves upon previous unsupervised methods by 32.6$\%$ under the same setting. Qualitative results on in-the-wild datasets show that the recovered 3D meshes are natural and realistic.

INSIGHTS


Ill-posed problem: the skeleton-to-mesh task suffers from pose failure and shape ambiguity. The mesh on the left and the mesh on the right clearly correspond to the identical skeleton.

Pose failure indicates that the hand orientation cannot be determined from the skeleton, because the corresponding hand joint is missing.
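The ambiguity described here can be illustrated with a small numeric sketch. It is not taken from the paper: the joint coordinates, the `palm_normal` vector, and the helper `rotate_about_axis` are hypothetical names chosen for illustration. The sketch twists a forearm about its own bone axis and shows that the wrist joint stays where it is while the (unobserved) palm direction changes, so joint positions alone cannot fix the hand orientation.

```python
import math

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def rotate_about_axis(v, axis, angle):
    """Rotate vector v about `axis` by `angle` radians (Rodrigues' formula)."""
    n = math.sqrt(dot(axis, axis))
    k = [x / n for x in axis]
    c, s = math.cos(angle), math.sin(angle)
    kxv, kdv = cross(k, v), dot(k, v)
    return [v[i] * c + kxv[i] * s + k[i] * kdv * (1.0 - c) for i in range(3)]

# Hypothetical toy data: the skeleton provides elbow and wrist positions only.
elbow = [0.0, 0.0, 0.0]
wrist = [0.25, 0.0, 0.0]
palm_normal = [0.0, 0.0, 1.0]  # assumed hand orientation, NOT observed in the skeleton

bone = [w - e for w, e in zip(wrist, elbow)]
twist = math.pi / 2  # any twist angle about the bone axis behaves the same way

# The wrist position is unchanged by a twist about the bone axis ...
twisted_wrist = [e + r for e, r in zip(elbow, rotate_about_axis(bone, bone, twist))]
# ... but the hand orientation is not.
twisted_palm = rotate_about_axis(palm_normal, bone, twist)

print("wrist before/after:", wrist, twisted_wrist)
print("palm before/after: ", palm_normal, twisted_palm)
```

Every twist angle yields the same wrist position, which is why an extra constraint on endpoint joints is needed to resolve the orientation.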

Shape ambiguity indicates that humans with different body configurations (e.g., fat or thin) may correspond to the same skeleton.


Left: a multi-rigid-body system whose units are rigid bodies, either connect units or leaf units. Intuitively, a connect unit has both a parent node and a child node, while a leaf unit has only a parent node. Right: the process of the DIK module, taking the right elbow as an example.
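A minimal sketch of the two ideas in this caption follows. It is an illustration under stated assumptions, not the authors' DIK module: the three-joint chain `PARENTS`, the rest-pose direction, and the functions `classify_units` and `align_rotation` are hypothetical. The sketch labels joints as connect or leaf units from joint connectivity alone, then computes the shortest rotation that aligns a rest-pose bone direction with the bone direction observed in the estimated skeleton. The twist about the bone axis is left undetermined by this alignment.

```python
import math

# Hypothetical toy chain: parent index per joint (-1 marks the root).
# 0: shoulder, 1: elbow, 2: wrist
PARENTS = [-1, 0, 1]

def classify_units(parents):
    """Label each non-root joint as 'connect' (has a child) or 'leaf' (no child)."""
    has_child = [False] * len(parents)
    for p in parents:
        if p >= 0:
            has_child[p] = True
    return {j: ("connect" if has_child[j] else "leaf")
            for j, p in enumerate(parents) if p >= 0}

def normalize(v):
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def apply(R, v):
    return [dot(row, v) for row in R]

def align_rotation(rest_dir, target_dir):
    """Shortest-arc 3x3 rotation taking rest_dir onto target_dir."""
    a, b = normalize(rest_dir), normalize(target_dir)
    c = dot(a, b)
    eye = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
    if c < -1.0 + 1e-9:
        # Antiparallel case: rotate by pi about any axis perpendicular to a.
        helper = [1.0, 0.0, 0.0] if abs(a[0]) < 0.9 else [0.0, 1.0, 0.0]
        u = normalize(cross(a, helper))
        return [[2.0 * u[i] * u[j] - eye[i][j] for j in range(3)] for i in range(3)]
    v = cross(a, b)
    K = [[0.0, -v[2], v[1]], [v[2], 0.0, -v[0]], [-v[1], v[0], 0.0]]
    K2 = [[sum(K[i][k] * K[k][j] for k in range(3)) for j in range(3)] for i in range(3)]
    # Rodrigues' formula in the form R = I + K + K^2 / (1 + cos(theta)).
    return [[eye[i][j] + K[i][j] + K2[i][j] / (1.0 + c) for j in range(3)]
            for i in range(3)]

units = classify_units(PARENTS)

# Assumed rest-pose forearm direction and estimated elbow/wrist positions.
rest_forearm = [0.0, 1.0, 0.0]
elbow, wrist = [0.3, 0.0, 0.0], [0.55, 0.0, 0.0]
observed = [w - e for w, e in zip(wrist, elbow)]

R_elbow = align_rotation(rest_forearm, observed)
print(units)
print(apply(R_elbow, rest_forearm))
```

In this toy chain the elbow is a connect unit because the elbow-to-wrist bone fixes its direction, while the wrist is a leaf unit with no child bone to constrain it.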

Skeleton to SMPL model with the mean shape. The input of the demo is a 3D skeleton, and the output is the corresponding SMPL mesh with the mean shape.

METHOD

Detailed architecture of the Skeleton2Mesh framework.

EXPERIMENTAL RESULTS

Human3.6M

MPI-INF-3DHP

Surreal

Generalization Applications

1. Robot Arm

Skeleton to robot arm via the DIK module. The input of this demo is the right hand of the 3D skeleton, and the output is the corresponding robot arm (Cyton) with seven degrees of freedom.

2. Animal

Skeleton to animals (e.g., SMAL) via the DIK module.

3. Cartoon character

Skeleton to cartoon characters via the DIK module.