We are excited to have the following keynote talks at our workshop.
Abstract: Dr. Kerbl will present his ongoing research on 3D Gaussian Splatting and its extensions. In just over two years since its introduction, 3D Gaussian Splatting has become an important representation for 3D radiance fields. Its fast training and inference speeds enable quicker iteration on solutions, facilitating research and downstream applications. To function on next-generation portable platforms (mobile phones, head-mounted displays), it is important to minimize model complexity and maximize processing efficiency. The next frontier, reconstruction of radiance fields in real time, would enable a range of attractive use cases in domains as varied as entertainment, modern medicine, and crisis management. In this talk, Dr. Kerbl will outline recent advancements in high-performance inference and training, and the trajectory towards feasible real-time reconstruction. He will present his work and open challenges in shrinking the fundamental 3D Gaussian representation, increasing 3DGS robustness, scaling training and rendering to support scenes at city scale, and accelerating optimization to yield high-quality reconstructions within minutes.
Speaker bio: Dr. Bernhard Kerbl is Co-Principal Investigator at TU Wien for the project on Instant Visualization and Interaction for Large Point Clouds (IVILPC). Previously, he was a Visiting Researcher in the Human Sensing Lab at the Robotics Institute, Carnegie Mellon University. His research focuses on real-time graphics, parallel processing, point-based rendering, image-based rendering, radiance fields, and novel-view synthesis. Dr. Kerbl obtained his PhD at Graz University of Technology in 2018. In 2019, he briefly joined Epic Games to work on Unreal Engine 5’s Nanite, followed by a postdoc phase at TU Wien and INRIA in George Drettakis’ GraphDeco group. While at INRIA, Dr. Kerbl was first author of the seminal paper on 3D Gaussian Splatting, which has been cited over 5000 times since its publication in August 2023. Dr. Kerbl has lectured on GPU programming, real-time rendering, physically-based rendering, game physics, and scientific working at various Austrian universities.
Abstract: Egocentric perception is a cornerstone for human-centric contextual AI and robotics, yet directly applying current 3D/4D Gaussian Splatting (GS) methods to head-mounted inputs often yields unstable quality and inadequate speed. In this talk, I will introduce Project Aria, the leading platform for egocentric perception, and outline the core challenges: rapid viewpoint changes, motion blur, rolling-shutter artifacts, and strict compute and latency constraints. I will then present practical “recipes” from several recent research works we performed at Meta Reality Labs that enable photorealistic, real-time 3D/4D GS reconstruction of fully dynamic scenes using the egocentric perception stack, exemplified by Aria glasses. Finally, I will discuss the open challenges that remain and outline potential applications of egocentric 3D/4D GS reconstructions for advancing agentic AI.
Speaker bio: Zhao Dong is a Senior Research Scientist Manager at Meta Reality Labs Research, where he leads a team advancing various aspects of Spatial AI. His work focuses on areas such as 3D digital twin creation of objects and scenes, inverse rendering/3D reconstruction, and generative models for 3D. Zhao's efforts are geared towards building the next-generation human-centric computing platform for AR/VR/Metaverse.
Abstract: In this talk, we investigate three different ways to use data, deep learning, and GenAI to improve Gaussian Splatting results. First, in FlowR, we show how to train Flow-Matching models to learn the transform from “Renders of Noisy 3DGS Reconstructions” to “Clean Images”, which is highly effective at cleaning up noisy GS reconstructions. Second, in BulletGen, we show how to use pre-trained generative models to improve monocular dynamic Gaussian Splatting results for dynamic scenes. This approach uses generative AI on ‘bullet-frames’ to fill in holes and reduce floaters across all timesteps. Finally, in MapAnything, we present a method that produces dense 3D point clouds directly from raw images (no COLMAP needed), while also taking advantage of any additional inputs that are available (e.g. camera intrinsics, sparse depth, etc.). This approach sets a new standard for a variety of 3D tasks, from monocular depth estimation to multi-view stereo and structure from motion, and can also easily be extended to improve Gaussian Splatting results.
Speaker bio: Jonathon Luiten is a Research Scientist at Meta Reality Labs, leading Meta's efforts on Dynamic 3D Reconstruction for creating 3D video content and enabling the future of immersive entertainment. Jonathon's research touches on Gaussian Splatting, Efficient Rendering, 3D Tracking, Virtual Reality, and SLAM. He obtained his PhD from RWTH Aachen University under the supervision of Bastian Leibe, while also visiting Carnegie Mellon University to work with Deva Ramanan and the University of Oxford to work with Philip Torr. Before shifting his research focus to dynamic 3D scenes, he built a large body of work on 2D tracking methods, benchmarks, and evaluation metrics.