GFPose: Learning 3D Human Pose Prior with Gradient Fields

Hai Ci (1), Mingdong Wu (1, 2), Wentao Zhu (2), Xiaoxuan Ma (2), Hao Dong (2), Fangwei Zhong (1, 2), Yizhou Wang (2)

(1) Beijing Institute for General Artificial Intelligence    (2) Peking University

GFPose is a score-based 3D human pose prior model that can be readily applied to a variety of tasks, e.g., 3D human pose estimation, pose denoising, and pose generation. Our key idea is to estimate the gradient field (a.k.a. the score) of perturbed human poses. Scores encode "what a reasonable pose looks like." We can leverage them to adjust poses so that they become more plausible and better satisfy a given task specification.
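To make the idea of "following the score" concrete, here is a minimal toy sketch rather than the GFPose implementation. It assumes the pose prior is an isotropic Gaussian around a hypothetical mean pose, so the score has a closed form; in GFPose the score is instead estimated by a learned network. The names `gaussian_score` and `refine_pose`, the 17-joint layout, and all hyperparameters are illustrative assumptions.

```python
import numpy as np


def gaussian_score(x, mean, std):
    """Analytic score (gradient of the log-density) of an isotropic Gaussian.

    Toy stand-in for a learned score network (assumption, not GFPose itself).
    """
    return (mean - x) / std**2


def refine_pose(pose, score_fn, step_size=0.1, n_steps=50):
    """Move a pose along the score field with plain gradient ascent steps."""
    x = pose.copy()
    for _ in range(n_steps):
        # Each step nudges the pose toward higher density under the prior.
        x = x + step_size * score_fn(x)
    return x


# Hypothetical setup: 17 joints with 3D coordinates, mean pose at the origin.
mean_pose = np.zeros((17, 3))
rng = np.random.default_rng(42)
noisy_pose = mean_pose + 0.5 * rng.standard_normal((17, 3))

refined_pose = refine_pose(
    noisy_pose, lambda x: gaussian_score(x, mean_pose, std=1.0)
)

error_before = np.linalg.norm(noisy_pose - mean_pose)
error_after = np.linalg.norm(refined_pose - mean_pose)
print(f"distance to mean pose: {error_before:.3f} -> {error_after:.3f}")
```

With a real pose prior the score would come from a trained model and typically depend on a noise level, and a sampler would usually add noise at each step; the loop above only illustrates how a gradient field can pull an implausible pose toward plausible ones.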

Multi-hypothesis 3D Human Pose Estimation

[Video: web_hpe.mp4]

Qualitative comparison with cVAE

We compare the hypotheses sampled from GFPose and cVAE [1]. Predictions and ground truths (GT) are colored yellow and white, respectively.

Both methods work well on the easier case (third row). However, GFPose performs better on the harder ones (first two rows): it produces more diverse hypotheses while remaining faithful to the ground truth.


[1] Saurabh Sharma, et al. Monocular 3D human pose estimation by generation and ordinal ranking. In ICCV, 2019.

Reconstruction from Occluded 2D Observation

[Video: web_completion2D.mp4]

Reconstruction from Partial 3D Observation

[Video: web_completion3D.mp4]

Pose Denoising

[Video: web_denoising.mp4]

Pose Generation

[Video: web_generation.mp4]

Citation

@article{ci2022gfpose,
  title = {GFPose: Learning 3D Human Pose Prior with Gradient Fields},
  author = {Ci, Hai and Wu, Mingdong and Zhu, Wentao and Ma, Xiaoxuan and Dong, Hao and Zhong, Fangwei and Wang, Yizhou},
  journal = {arXiv preprint arXiv:2212.08641},
  year = {2022}
}