Gaussian Splatting
Gaussian splatting is a technique for representing and rendering 3D scenes using collections of 3D Gaussians (ellipsoidal blobs) positioned and scaled in space. Each Gaussian models a small patch of the scene, allowing for efficient and photorealistic view synthesis from input data. This method enables smooth, continuous reconstructions, even with sparse or noisy measurements. The main objectives of this project are to understand how Gaussian splatting works, implement scene reconstruction with Gaussians, and visualize the results.
3D Gaussian Rasterization
Implemented a PyTorch pipeline to project 3D Gaussians to 2D, evaluate their contributions, filter and depth-sort them, compute alpha/transmittance for visibility, and blend to produce color, depth, and silhouette images. Core functions are in model.py and rendering is managed in render.py.
Project 3D Gaussians to 2D
Transformed 3D Gaussian parameters (means and covariances) into their 2D equivalents on the camera's image plane. Following the original paper's implementation, the means are projected with the camera projection matrices, and the 2D covariance matrices are obtained from a local affine approximation of the projection. The resulting 2D means and covariances are the inputs to rasterization.
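The projection step can be illustrated with a minimal NumPy sketch for a single Gaussian. It assumes a pinhole camera with a world-to-camera rotation R, translation t, and intrinsics fx, fy, cx, cy. The function name project_gaussian and its signature are hypothetical and are not taken from model.py, which works on batched PyTorch tensors.

```python
import numpy as np

def project_gaussian(mean_world, cov_world, R, t, fx, fy, cx, cy):
    """Project one 3D Gaussian to a 2D mean and 2x2 covariance (pinhole camera)."""
    # World -> camera coordinates.
    x, y, z = R @ mean_world + t
    # Perspective projection of the mean to pixel coordinates.
    mean_2d = np.array([fx * x / z + cx, fy * y / z + cy])
    # Jacobian of the perspective projection, evaluated at the camera-space mean.
    J = np.array([
        [fx / z, 0.0, -fx * x / z**2],
        [0.0, fy / z, -fy * y / z**2],
    ])
    # First-order (affine) approximation: Sigma_2d = J R Sigma R^T J^T.
    cov_2d = J @ R @ cov_world @ R.T @ J.T
    return mean_2d, cov_2d

# Example: an isotropic Gaussian 4 units in front of an identity camera.
mean_2d, cov_2d = project_gaussian(
    mean_world=np.array([0.0, 0.0, 4.0]),
    cov_world=np.eye(3) * 0.04,
    R=np.eye(3), t=np.zeros(3),
    fx=100.0, fy=100.0, cx=64.0, cy=64.0,
)
```

In this example the Gaussian lies on the optical axis, so its 2D mean lands on the principal point and its covariance stays isotropic, scaled by (fx / z)^2.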
Evaluate 2D Gaussians
Computed the exponent term (power) of the 2D Gaussian function at pixel coordinates. The power term determines how quickly each splat's contribution falls off away from its center, and therefore its per-pixel opacity when the splats are blended into the 2D image.
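For reference, the exponent of a 2D Gaussian with mean mu and covariance Sigma at pixel x is -0.5 (x - mu)^T Sigma^{-1} (x - mu). The NumPy sketch below evaluates it for a batch of pixels and one Gaussian. The name evaluate_gaussian_2d is illustrative only, and the project's version presumably handles many Gaussians at once.

```python
import numpy as np

def evaluate_gaussian_2d(pixels, mean_2d, cov_2d):
    """Return power = -0.5 * d^T Sigma^{-1} d per pixel, with d = pixel - mean."""
    d = pixels - mean_2d                 # (P, 2) offsets from the Gaussian centre
    cov_inv = np.linalg.inv(cov_2d)      # (2, 2) inverse covariance
    # Quadratic form evaluated independently for every pixel.
    return -0.5 * np.einsum("pi,ij,pj->p", d, cov_inv, d)

pixels = np.array([[64.0, 64.0], [69.0, 64.0]])
power = evaluate_gaussian_2d(pixels, np.array([64.0, 64.0]), np.eye(2) * 25.0)
```

The power is zero at the centre and becomes more negative with distance, so exp(power) lies in (0, 1].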
Filter and Sort Gaussians
Discarded Gaussians behind the camera and sorted the remaining Gaussians by depth to maintain the correct rendering order during splatting. This is implemented in the Scene class of model.py.
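A minimal sketch of the filter-and-sort step, assuming camera-space means with the camera looking down the +z axis. The helper filter_and_sort is hypothetical and is not the Scene class's actual method.

```python
import numpy as np

def filter_and_sort(means_cam, near=0.0):
    """Return indices of Gaussians in front of the camera, nearest first."""
    z = means_cam[:, 2]
    keep = np.nonzero(z > near)[0]              # drop Gaussians at or behind the camera
    order = np.argsort(z[keep], kind="stable")  # ascending depth, i.e. front to back
    return keep[order]

means_cam = np.array([[0.0, 0.0, 5.0], [0.0, 0.0, -1.0], [0.0, 0.0, 2.0]])
idx = filter_and_sort(means_cam)
```

Returning indices rather than sorted copies lets the same ordering be applied to every per-Gaussian attribute (colours, opacities, covariances).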
Compute Alphas and Transmittance
Computed alpha values indicative of Gaussian opacity at each pixel and transmittance values to account for occlusion among Gaussians using ordered alpha blending. Functions compute_alphas and compute_transmittance in model.py perform these computations efficiently.
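The two quantities can be sketched in NumPy as follows. The function names match those mentioned in the text, but the signatures and tensor layout (Gaussians along axis 0, pixels along axis 1) are assumptions, and the 0.99 clip is a stability convention borrowed from the reference implementation rather than something confirmed for model.py.

```python
import numpy as np

def compute_alphas(opacities, power):
    """alpha[i, p] = opacity[i] * exp(power[i, p]), clipped below 1 for stability."""
    alphas = opacities[:, None] * np.exp(power)
    return np.clip(alphas, 0.0, 0.99)

def compute_transmittance(alphas):
    """T[i, p] = prod_{j < i} (1 - alpha[j, p]); Gaussians sorted near to far."""
    cumulative = np.cumprod(1.0 - alphas, axis=0)
    # Shift by one so the front-most Gaussian sees full transmittance.
    return np.concatenate([np.ones_like(alphas[:1]), cumulative[:-1]], axis=0)

opacities = np.array([0.5, 0.5])
power = np.zeros((2, 1))   # both Gaussians centred on the single pixel
alphas = compute_alphas(opacities, power)
transmittance = compute_transmittance(alphas)
```

Here the second Gaussian receives half the light, because the first one has already absorbed half of it.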
Perform Splatting
Blended all Gaussians’ color contributions at each pixel using the computed alpha and transmittance values to yield the RGB image, depth map, and silhouette mask. The function splat implements this blending, and render.py renders the results, shown below as a GIF:
Render of Chair using Gaussian Splatting
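The blending described in this section amounts to weighting each Gaussian by alpha times transmittance and summing over Gaussians. The sketch below is a NumPy illustration under assumed shapes, with pixels flattened to one axis. It reuses the name splat from the text, but the signature is hypothetical.

```python
import numpy as np

def splat(colours, depths, alphas, transmittance):
    """Front-to-back compositing of N sorted Gaussians over P pixels."""
    weights = alphas * transmittance                   # (N, P) per-pixel contribution
    image = np.einsum("np,nc->pc", weights, colours)   # (P, 3) blended colour
    depth = np.einsum("np,n->p", weights, depths)      # (P,) blended depth
    silhouette = weights.sum(axis=0)                   # (P,) accumulated opacity
    return image, depth, silhouette

colours = np.array([[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])  # red in front of blue
depths = np.array([2.0, 4.0])
alphas = np.array([[0.5], [0.5]])
transmittance = np.array([[1.0], [0.5]])
image, depth, silhouette = splat(colours, depths, alphas, transmittance)
```

The silhouette is the sum of the weights, so it stays below 1 wherever the Gaussians do not fully cover a pixel.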
Training 3D Gaussian Representations
Optimized Gaussian parameters to represent a custom scene (a blue tractor) from multi-view data. The implementation is primarily in train.py, where Gaussian parameters are made trainable by enabling gradient computation. The optimizer assigns different learning rates to means, colors, and opacities to keep training stable. The training loop renders each view using the Gaussian rasterizer described above, computes a reconstruction loss against the ground-truth images, and backpropagates gradients to update the Gaussian parameters. The trained Gaussians produce renders with high PSNR/SSIM against the ground truth, shown as a GIF:
Final Rendered GIF
Training Process GIF
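The text does not state which reconstruction loss train.py uses. As an illustration only, the sketch below assumes a mean absolute (L1) error and shows how PSNR is computed from the mean squared error for images scaled to [0, 1]. Both helper names are hypothetical, and NumPy is used here for brevity instead of PyTorch.

```python
import numpy as np

def l1_loss(pred, target):
    """Mean absolute error between a rendered image and the ground truth."""
    return np.abs(pred - target).mean()

def psnr(pred, target, max_val=1.0):
    """Peak signal-to-noise ratio in dB; higher means a closer match."""
    mse = np.mean((pred - target) ** 2)
    if mse == 0.0:
        return float("inf")   # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)

target = np.zeros((4, 4, 3))
pred = np.full((4, 4, 3), 0.1)   # uniform error of 0.1 per channel
loss = l1_loss(pred, target)
quality = psnr(pred, target)
```

A uniform error of 0.1 gives an MSE of 0.01, which corresponds to 20 dB.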
Extensions
The basic pipeline is extended with two enhancements that improve rendering quality and handle more complex scenes, implemented mainly in model.py and supporting files.
Rendering Using Spherical Harmonics
View-dependent appearance effects, such as reflections, are modelled using spherical harmonic components. The original model.py and data_utils.py are modified to use spherical harmonic coefficients alongside Gaussian colors. The function colours_from_spherical_harmonics computes the final color from the viewing directions and spherical harmonic coefficients, resulting in more realistic renders, as shown in the GIF:
Rendered GIF using Spherical Harmonics
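To make the spherical harmonic evaluation concrete, the sketch below handles only degrees 0 and 1. It uses the basis constants and sign convention of the reference 3D Gaussian splatting code. The helper name sh_to_colour_degree1 and the coefficient layout (N, 4, 3) are assumptions, and the project's colours_from_spherical_harmonics may support higher degrees.

```python
import numpy as np

SH_C0 = 0.28209479177387814   # degree-0 basis constant
SH_C1 = 0.4886025119029199    # magnitude of the degree-1 basis constants

def sh_to_colour_degree1(sh, directions):
    """sh: (N, 4, 3) coefficients; directions: (N, 3) unit view directions."""
    x, y, z = directions[:, 0:1], directions[:, 1:2], directions[:, 2:3]
    colour = SH_C0 * sh[:, 0]                     # view-independent (DC) term
    colour = (colour
              - SH_C1 * y * sh[:, 1]
              + SH_C1 * z * sh[:, 2]
              - SH_C1 * x * sh[:, 3])             # linear view-dependent terms
    # Offset by 0.5 and clamp to the displayable range [0, 1].
    return np.clip(colour + 0.5, 0.0, 1.0)

sh = np.zeros((1, 4, 3))
sh[0, 0] = [1.0, 0.0, 0.0]                        # only a red DC coefficient
colours = sh_to_colour_degree1(sh, np.array([[0.0, 0.0, 1.0]]))
```

With only the DC coefficient set, the colour is the same from every direction. Non-zero degree-1 coefficients make it vary linearly with the view direction.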
Training on a Harder Scene
Training is extended to complex scenes with randomly initialized Gaussians to emulate noisy data conditions. Code in train_harder_scene.py adapts training procedures with techniques like anisotropic Gaussians, learning rate scheduling, and adaptive density control to improve convergence and detail capture compared to the simpler isotropic Gaussian case.
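An anisotropic Gaussian is commonly parameterised by per-axis scales and a rotation quaternion, with covariance Sigma = R S S^T R^T. This form stays symmetric positive semi-definite during optimisation. The sketch below assumes that parameterisation and a (w, x, y, z) quaternion order. Neither is confirmed for train_harder_scene.py, and the function names are hypothetical.

```python
import numpy as np

def quaternion_to_rotation(q):
    """Convert a (w, x, y, z) quaternion to a 3x3 rotation matrix."""
    w, x, y, z = q / np.linalg.norm(q)   # normalise so the result is a rotation
    return np.array([
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z),     2 * (x * z + w * y)],
        [2 * (x * y + w * z),     1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y),     2 * (y * z + w * x),     1 - 2 * (x * x + y * y)],
    ])

def covariance_from_scale_rotation(scales, quaternion):
    """Build Sigma = R S S^T R^T from per-axis scales and a rotation."""
    M = quaternion_to_rotation(quaternion) @ np.diag(scales)
    return M @ M.T

cov = covariance_from_scale_rotation(
    np.array([1.0, 2.0, 3.0]),
    np.array([1.0, 0.0, 0.0, 0.0]),   # identity rotation
)
```

The isotropic case is recovered by using a single shared scale, in which case the rotation has no effect.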
Conclusion
In this project, a 3D Gaussian rasterizer was built to convert 3D scene representations into 2D images, Gaussian parameters were trained from multi-view data for accurate scene reconstruction, and extensions were added to improve rendering quality and handle complex scenes. This work demonstrates an efficient and effective approach for photorealistic 3D rendering and reconstruction.