Point Cloud Upsampling for Accurate Surface Reconstruction via Attention-Guided Generation
Reconstructing accurate surfaces from sparse point clouds remains a fundamental challenge in 3D vision. In this paper, we propose a novel upsampling framework, called Surf-PU, that focuses on surface-aware point generation. Our method comprises two key components. First, we introduce an adaptive query generation module that analyzes local attention information to determine where new points should be placed; these query points steer generation toward regions of geometric significance. Second, we incorporate the Hyper Chamfer Distance (HCD) loss, which accounts for the distribution of point-wise distances and thus better captures complex surface structures. Unlike existing methods, our approach combines attention-based guidance with loss-driven precision, leading to more accurate surface reconstruction. Extensive experiments on the PU-GAN and PU1K datasets demonstrate that our method consistently outperforms state-of-the-art techniques, especially on the point-to-surface (P2F) metric. Moreover, the proposed framework maintains robust performance across arbitrary upsampling ratios and under random noise, confirming its reliability and generalizability.
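For concreteness, the sketch below shows one plausible reading of an HCD-style loss, assuming a hyperbolic re-weighting arcosh(1 + α·d²) of nearest-neighbor distances between the two clouds; the function name and the hyperparameter α are illustrative assumptions, not the paper's exact definition.

```python
import torch

def hyper_chamfer_distance(pred, gt, alpha=1.0):
    """Hedged sketch of a Hyper Chamfer Distance (HCD)-style loss.

    Assumes the hyperbolic form arcosh(1 + alpha * d^2) applied to
    nearest-neighbor distances; `alpha` is a hypothetical sharpness
    hyperparameter, not taken from the paper.
    pred: (B, N, 3) upsampled points, gt: (B, M, 3) ground-truth points.
    """
    # Pairwise squared Euclidean distances between the clouds: (B, N, M).
    d2 = torch.cdist(pred, gt).pow(2)
    # Nearest-neighbor squared distance in each direction.
    d_pg = d2.min(dim=2).values  # pred -> gt, shape (B, N)
    d_gp = d2.min(dim=1).values  # gt -> pred, shape (B, M)
    # arcosh(1 + alpha*d) grows slowly for large d, so the loss reflects
    # the overall distribution of point-wise distances rather than being
    # dominated by a few outliers.
    return (torch.acosh(1.0 + alpha * d_pg).mean()
            + torch.acosh(1.0 + alpha * d_gp).mean())
```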
Overall architecture of the proposed point cloud upsampling framework. The network takes an input point cloud together with its per-point features and extracts local attention scores using a vector attention mechanism. These scores guide the Adaptive KP-Queries Generation Module to place kernel point queries in geometrically important regions. The kernel point query features are then combined with learnable weights and passed to the RepKPoints Extraction Module, which encodes kernel point representations. Finally, the Multi-head Cross Attention and Displacement Vectors Regression Modules generate the upsampled point cloud. A rough code illustration of this data flow follows below.
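The following PyTorch skeleton wires together simplified stand-ins (plain linear layers) for each module named in the caption; all class names, layer choices, dimensions, and the upsampling ratio R are assumptions for illustration, not the authors' implementation.

```python
import torch
import torch.nn as nn

class SurfPUSketch(nn.Module):
    """Hedged, self-contained sketch of the captioned data flow."""
    def __init__(self, dim=64, ratio=4):
        super().__init__()
        self.ratio = ratio
        self.score = nn.Linear(dim, 1)        # stand-in: vector attention scores
        self.query_gen = nn.Linear(dim, dim)  # stand-in: Adaptive KP-Queries Generation
        self.repkp = nn.Linear(dim, dim)      # stand-in: RepKPoints Extraction
        self.cross = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
        self.disp = nn.Linear(dim, 3)         # Displacement Vectors Regression

    def forward(self, xyz, feat):
        # xyz: (B, N, 3) input points, feat: (B, N, dim) per-point features.
        attn = torch.softmax(self.score(feat), dim=1)  # local attention scores
        queries = self.query_gen(attn * feat)          # kernel point queries
        kp_repr = self.repkp(queries)                  # kernel point representations
        # Multi-head cross attention between kernel point queries and features.
        fused, _ = self.cross(kp_repr, feat, feat)
        # Duplicate each point R times and regress a displacement per copy.
        up_feat = fused.repeat_interleave(self.ratio, dim=1)
        up_xyz = xyz.repeat_interleave(self.ratio, dim=1) + self.disp(up_feat)
        return up_xyz                                  # (B, R*N, 3) upsampled cloud
```

For example, `SurfPUSketch()(torch.rand(1, 256, 3), torch.rand(1, 256, 64))` would return a (1, 1024, 3) cloud under these assumed shapes.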
Qualitative comparison of 4× upsampled point clouds on the PU-GAN dataset. Red boxes indicate zoomed-in regions.
Surface mesh reconstruction results from 16× upsampled point clouds on the PU-GAN dataset. Red boxes indicate zoomed-in point cloud regions.
Mesh reconstruction results on unseen shapes from the ShapeNetCore dataset using 16× upsampled point clouds.