BASED: Bundle-Adjusting Surgical Endoscopic Dynamic Video Reconstruction using Neural Radiance Fields
Shreya Saha, Sainan Liu, Shan Lin, Jingpei Lu, and Michael Yip
University of California San Diego / Intel Labs
Abstract
Reconstruction of deformable scenes from endoscopic videos is important for many applications such as intraoperative navigation, surgical visual perception, and robotic surgery. It is a foundational requirement for realizing autonomous robotic interventions in minimally invasive surgery. However, previous approaches in this domain have been limited by their modular nature and are confined to specific camera and scene settings. Our work adopts the Neural Radiance Fields (NeRF) approach to learn 3D implicit representations of scenes that are both dynamic and deformable over time, and does so with unknown camera poses. We demonstrate this approach on endoscopic surgical scenes from robotic surgery. This work removes the constraint of known camera poses and overcomes the drawbacks of the state-of-the-art unstructured dynamic scene reconstruction technique, which relies on the static part of the scene for accurate reconstruction. Through several experimental datasets, we demonstrate the versatility of our proposed model in adapting to diverse camera and scene settings, and show its promise for both current and future robotic surgical systems.
BASED Architecture
The overall method combines three components: a learnable pose layer that parameterizes the camera poses, a deformation module that learns the deformation of a 3D point at a given time step with respect to its canonical position, and a rendering module that takes the canonical coordinates of a 3D point at each time step, along with the camera viewing directions, and outputs the volumetric density and color. Training uses three losses: (1) a photometric loss, (2) a reference depth loss, and (3) a novel dynamic multi-view correspondence loss.
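As a rough illustration of how these components fit together, the sketch below defines minimal PyTorch modules for the pose layer, the deformation module, and the rendering module, plus a weighted combination of the three losses. The module names, network sizes, and loss weights are hypothetical placeholders, and positional encoding and volumetric ray marching are omitted for brevity; this is not the released implementation.

import torch
import torch.nn as nn


class PoseLayer(nn.Module):
    """Learnable per-frame camera poses, refined jointly with the scene model."""

    def __init__(self, num_frames):
        super().__init__()
        # 6-DoF parameterization per frame: axis-angle rotation + translation.
        self.pose_params = nn.Parameter(torch.zeros(num_frames, 6))

    def forward(self, frame_idx):
        # Returned parameters would be converted to an SE(3) transform downstream.
        return self.pose_params[frame_idx]


class DeformationModule(nn.Module):
    """Maps an observed 3D point at time t to its canonical-space position."""

    def __init__(self, hidden_dim=128):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(3 + 1, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, 3),
        )

    def forward(self, points, t):
        # points: (N, 3) sampled 3D points; t: scalar time step tensor.
        # Predict a per-point offset toward the canonical configuration.
        t = t.expand(points.shape[0], 1)
        offset = self.mlp(torch.cat([points, t], dim=-1))
        return points + offset


class RenderingModule(nn.Module):
    """Radiance field queried in canonical coordinates with the view direction."""

    def __init__(self, hidden_dim=256):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(3 + 3, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, 4),  # RGB + volumetric density
        )

    def forward(self, canonical_points, view_dirs):
        out = self.mlp(torch.cat([canonical_points, view_dirs], dim=-1))
        rgb = torch.sigmoid(out[..., :3])
        sigma = torch.relu(out[..., 3:])
        return rgb, sigma


# Illustrative composition of the training objective (weights are placeholders).
def total_loss(l_photometric, l_ref_depth, l_dyn_corr, w_depth=0.1, w_corr=0.1):
    return l_photometric + w_depth * l_ref_depth + w_corr * l_dyn_corr

In practice, the three modules are optimized jointly: rays are cast using the current pose estimates, points along each ray are warped to the canonical space by the deformation module, and the rendered colors and depths feed the losses above.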
BASED Results
For more quantitative results, please refer to the paper. Additional results will be added to the supplementary section soon. The codebase will soon be made public.