Email: gaozhongpai@gmail.com
Location: Cambridge, MA 02140, USA
I am currently an Expert Research Scientist at UII America. I completed my postdoctoral research with Prof. Xiaokang Yang in 2021 and received my PhD under the supervision of Prof. Guangtao Zhai in 2018, both at Shanghai Jiao Tong University. During my PhD, I was a joint PhD student at Schepens Eye Research Institute, Harvard Medical School, under the supervision of Prof. Eli Peli.
My research focuses on 3D computer vision and AR/VR, with particular interests in 3D reconstruction, NeRF/3DGS, and video LLMs. I collaborate with several professors, including Prof. Junchi Yan, Prof. Juyong Zhang, and Prof. Menghan Hu.
Previously, I worked on visually induced motion sickness (VIMS) in stereoscopic 3D and head-mounted displays, as well as psychovisual modulation technology with applications such as dual-view displays and invisible QR codes.
Openings
We have several openings (internship & full-time). More details and application here.
News
02-26-2025: Our paper "Seq2Time: Sequential Knowledge Transfer for Video LLM Temporal Grounding" is accepted by CVPR 2025 @ Nashville
01-22-2025: Three of our papers are accepted by ICLR 2025 @ Singapore
10-28-2024: Our paper "Automated Patient Positioning with Learned 3D Hand Gestures" is accepted by WACV 2025 @ Tucson
09-25-2024: Our paper "DDGS-CT: Direction-Disentangled Gaussian Splatting for Realistic Volume Rendering" is accepted by NeurIPS 2024 @ Vancouver
07-01-2024: Our paper "Divide and Fuse: Body Part Mesh Recovery from Partially Visible Human Images" is accepted by ECCV 2024 @ Milan
06-28-2024: Our paper "Hidden Barcode in Sub-Images with Invisible Locating Marker" is accepted by TOMM
05-13-2024: Our paper "Few-Shot 3D Volumetric Segmentation with Multi-Surrogate Fusion" is early accepted by MICCAI 2024 @ Morocco
04-16-2024: Our paper "Towards Universal Training-Free Coverless Image Steganography with Diffusion Models" is accepted by IJCAI 2024 @ Jeju
02-26-2024: Our paper "DaReNeRF: Direction-aware Representation for Dynamic Scenes" is accepted by CVPR 2024 @ Seattle
01-16-2024: Our paper "PBADet: A One-Stage Anchor-Free Approach for Part-Body Association" is accepted by ICLR 2024 @ Vienna
12-09-2023: Our two papers are accepted by AAAI 2024 @ Vancouver
08-10-2023: Our paper "Synergetic Assessment of Quality and Aesthetic: Approach and Comprehensive Benchmark Dataset" is accepted by TCSVT
06-22-2023: Our demo "Real-time 3D Hand Gestures for Medical Image Visualization" is presented on CVPR 2023 @ Vancouver
11-14-2022: Our paper "RIVIE: Robust Inherent Video Information Embedding" is accepted by TMM
09-11-2022: Our paper "Learning Continuous Mesh Representation with Spherical Implicit Surface" is accepted by FG 2023 @ Waikoloa, Hawaii
03-01-2022: Our paper "Learning Invisible Markers for Hidden Codes in Offline-to-online Photography" is accepted by CVPR 2022 @ New Orleans
02-09-2022: Our paper "Robust mesh representation learning via efficient local structure-aware anisotropic convolution" is accepted by TNNLS
Honor & Awards
2020 Best Paper Award on CVPR Workshop (Dynavis)
2019 China Initiative Postdocs Fellowship ("博士后创新人才支持计划")
2019 Shanghai Super Postdocs Fellowship ("超级博士后激励计划")
2019 National Youth Fund by NSFC
DEMO
6DGS: Enhanced Direction-Aware Gaussian Splatting for Volumetric Rendering (ICLR 2025)
6DGS significantly outperforms 3DGS and N-DG, achieving up to a 15.73 dB improvement in PSNR while using 66.5% fewer Gaussian points than 3DGS.
Project: https://gaozhongpai.github.io/6dgs/
Real-time 3D Hand Gestures for Medical Image Visualization (CVPR 2023 Demo)
This innovative approach can transform medical imaging, enhancing diagnostic accuracy, treatment planning, and medical education.
Real-time 3D hand reconstruction from RGB camera on mobile phones in AR environments.
Support for left/right and multiple hands. Here, only the 3D landmarks are visualized.
Dynamic Facial Avatar with Implicit Neural Fields
High fidelity, including eye movements
Semi-supervised 3D face representation learning from unconstrained photo collections (CVPRW 2020, Best paper award, PDF)
The identity shapes are consistent across frames under different lighting, pose, and expression conditions.
PUBLICATIONS
Zhongpai Gao, "Learning Continuous Mesh Representation with Spherical Implicit Surface", FG 2023, Code, PDF, IEEE
For meshes with fixed topology, we learn a spherical implicit surface (SIS), which takes a spherical coordinate, together with either the local vertex features around that coordinate or the global feature of the 3D shape, as input and predicts the 3D position at the given coordinate as output. Since the spherical coordinates are continuous, SIS can represent a mesh at arbitrary resolution.
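The idea can be illustrated with a minimal NumPy sketch: a tiny decoder with random weights stands in for the learned SIS network (the function name `sis_decode` and the 8-dimensional shape code are hypothetical, chosen only for illustration). The point is that the decoder is queried at continuous spherical coordinates, so the sampling resolution is a free choice at inference time.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical tiny MLP standing in for the learned SIS decoder.
W1 = rng.standard_normal((2 + 8, 32)) * 0.1   # input: spherical coord + global shape code
W2 = rng.standard_normal((32, 3)) * 0.1       # output: 3D vertex position

def sis_decode(theta, phi, shape_code):
    """Map a continuous spherical coordinate (theta, phi) and a global
    shape code to a 3D surface point."""
    x = np.concatenate([[theta, phi], shape_code])
    h = np.tanh(x @ W1)
    return h @ W2

shape_code = rng.standard_normal(8)  # global feature of one 3D shape

# Because coordinates are continuous, the surface can be sampled at any
# resolution without retraining: try n = 16, 64, 256, ...
n = 16
thetas = np.linspace(0, np.pi, n)
phis = np.linspace(0, 2 * np.pi, n)
verts = np.array([sis_decode(t, p, shape_code) for t in thetas for p in phis])
print(verts.shape)  # (256, 3)
```

Increasing `n` yields a denser mesh from the same fixed-size network, which is what makes the representation resolution-independent.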
Jun Jia*, Zhongpai Gao*, Dandan Zhu, Xiongkuo Min, Guangtao Zhai, and Xiaokang Yang, "Learning Invisible Markers for Hidden Codes in Offline-to-online Photography", in IEEE Computer Vision and Pattern Recognition, CVPR 2022, PDF
This paper proposes a novel invisible information hiding architecture for display/print-camera scenarios, consisting of hiding, locating, correcting, and recovery, where invisible markers are learned to make hidden codes truly invisible.
Yunhao Li, Wei Shen, Zhongpai Gao, Yucheng Zhu, Guangtao Zhai, Guodong Guo, "Looking Here or There? Gaze Following in 360-Degree Images", Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV 2021), PDF
We propose a 3D sight-line-guided dual-pathway framework that detects the gaze target within a local region (here) and from a distant region (there) in parallel. Specifically, the local region is obtained as a 2D cone-shaped field along the 2D projection of the sight line starting at the human subject's head position, and the distant region is obtained by searching along the sight line in 3D sphere space.
Zhongpai Gao, Junchi Yan, Guangtao Zhai, and Xiaokang Yang. "Learning Spectral Dictionary for Local Representation of Mesh", Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence (IJCAI 2021), PDF
We learn a spectral dictionary (i.e., bases) for the weighting matrices such that the parameter size is independent of the resolution of 3D shapes. The coefficients of the weighting-matrix bases for each vertex are learned from the spectral features of the template's vertex and its neighbors in a weight-sharing manner.
Zhongpai Gao, Junchi Yan, Guangtao Zhai, Juyong Zhang, Yiyan Yang, and Xiaokang Yang. "Learning local neighboring structure for robust 3D shape representation", Proceedings of the AAAI Conference on Artificial Intelligence (AAAI 2021), PDF
We propose a local structure-aware anisotropic convolutional operation (LSA-Conv) that learns adaptive weighting matrices for each node according to the local neighboring structure and performs shared anisotropic filters. In fact, the learnable weighting matrix is similar to the attention matrix in the random synthesizer.
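A minimal NumPy sketch of the LSA-Conv idea, under the assumption (for illustration only) that the per-node weighting matrices `A` have already been predicted from the local neighboring structure: each node's gathered neighbor features are re-weighted by its own matrix, and a single shared anisotropic filter `W` is then applied everywhere.

```python
import numpy as np

rng = np.random.default_rng(1)

def lsa_conv(x, neighbors, A, W):
    """x: (N, C) node features; neighbors: (N, K) neighbor indices;
    A: (N, K, K) per-node learnable weighting matrices (structure-adaptive);
    W: (K*C, C_out) shared anisotropic filter applied to all nodes."""
    N, K = neighbors.shape
    gathered = x[neighbors]              # (N, K, C) neighbor features
    aligned = A @ gathered               # (N, K, C) re-weighted by local structure
    return aligned.reshape(N, -1) @ W    # (N, C_out) shared filter output

# Toy sizes; in practice these come from the mesh and the trained model.
N, K, C, C_out = 10, 4, 8, 16
x = rng.standard_normal((N, C))
neighbors = rng.integers(0, N, size=(N, K))
A = rng.standard_normal((N, K, K))
W = rng.standard_normal((K * C, C_out))
out = lsa_conv(x, neighbors, A, W)
print(out.shape)  # (10, 16)
```

The adaptivity lives entirely in `A`: the filter `W` is shared across nodes, which is what keeps the operation efficient while still being anisotropic per node.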
Zhongpai Gao, Juyong Zhang, Yudong Guo, Chao Ma, Guangtao Zhai, Xiaokang Yang. "Semi-supervised 3d face representation learning from unconstrained photo collections", Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW 2020, Best paper award), PDF
We train our model in a semi-supervised manner with an adversarial loss to exploit large amounts of unconstrained facial images. A novel center loss is introduced to ensure that facial images from the same subject have the same identity shape and albedo. Besides, our proposed model disentangles identity, expression, pose, and lighting representations, which improves the overall reconstruction performance and facilitates facial editing applications, e.g., expression transfer.
Jun Jia, Zhongpai Gao, Kang Chen, Menghan Hu, Xiongkuo Min, Guangtao Zhai, Xiaokang Yang. "RIHOOP: Robust Invisible Hyperlinks in Offline and Online Photographs", IEEE Transactions on Cybernetics (IEEE TCyb 2020), PDF
Our approach is an end-to-end neural network with an encoder to hide messages and a decoder to extract messages. To maintain the hidden message resilient to cameras, we build a distortion network between the encoder and the decoder to augment the encoded images. The distortion network uses differentiable 3-D rendering operations, which can simulate the distortion introduced by camera imaging in both printing and display scenarios.
PRE-PRINT
Zhongpai Gao, Junchi Yan, Guangtao Zhai, and Xiaokang Yang. "Permutation Matters: Anisotropic Convolutional Layer for Learning on Point Clouds", arXiv preprint arXiv:2005.13135
We propose a permutable anisotropic convolutional operation (PAI-Conv) that calculates soft-permutation matrices for each point using dot-product attention against a set of evenly distributed kernel points on a sphere's surface, and then performs shared anisotropic filters. The dot product with kernel points is analogous to the dot product with keys in the Transformer.
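The mechanism can be sketched in a few lines of NumPy (the function name `pai_conv` and all sizes are illustrative, and one plausible reading of the attention direction is assumed): relative neighbor positions act as queries, the fixed sphere kernel points act as keys, and the resulting soft-permutation matrix aligns each point's unordered neighbors to the canonical kernel order before a shared filter is applied.

```python
import numpy as np

rng = np.random.default_rng(2)

def softmax(z, axis=-1):
    e = np.exp(z - z.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def pai_conv(pts, neighbors, kernel_pts, W):
    """Soft-permute each point's K neighbors to align with M fixed kernel
    points on the unit sphere (the 'keys'), then apply a shared filter W."""
    rel = pts[neighbors] - pts[:, None, :]           # (N, K, 3) relative positions (queries)
    P = softmax(rel @ kernel_pts.T)                  # (N, K, M) dot-product attention
    aligned = P.transpose(0, 2, 1) @ pts[neighbors]  # (N, M, 3) neighbors in kernel order
    return aligned.reshape(len(pts), -1) @ W         # (N, C_out) shared anisotropic filter

# Toy sizes; in practice neighbors come from kNN on the point cloud.
N, K, M, C_out = 20, 6, 8, 16
pts = rng.standard_normal((N, 3))
neighbors = rng.integers(0, N, size=(N, K))
kernel_pts = rng.standard_normal((M, 3))
kernel_pts /= np.linalg.norm(kernel_pts, axis=1, keepdims=True)  # on the unit sphere
W = rng.standard_normal((M * 3, C_out))
out = pai_conv(pts, neighbors, kernel_pts, W)
print(out.shape)  # (20, 16)
```

Because the kernel points are fixed and the filter is shared, the soft permutation is what gives the convolution a consistent "orientation" over an unordered neighborhood, mirroring how keys give Transformer attention a fixed reference frame.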