The DREAMS (Distributed Robotic Exploration And Mapping Systems) laboratory focuses on the development of robotic systems and digital twins for mapping research. The main goal is to develop autonomy pipelines that allow robotic systems to explore an environment, collect useful visual data, and reconstruct accurate 3D models of objects or terrain without requiring fully manual image capture.
This work combines robotics simulation, computer vision, photogrammetry, 3D reconstruction, and next-best-view planning. I use software-in-the-loop (SITL) simulation environments to test autonomous mission workflows before transferring them to real robotic systems.
The core stack includes ROS2, PX4 SITL, Gazebo, Python, Blender, COLMAP, Metashape, CloudCompare, and 3D Gaussian Splatting.
The simulation environment is built as a controlled virtual testing ground for autonomous vehicles and active 3D reconstruction. Gazebo is used as the physics-based simulation environment, while PX4 provides the autonomy and vehicle-control layer. ROS2 connects the different components of the system, including mission scripts, camera control, scan-pattern execution, and data collection.
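As a rough illustration of how ROS2 ties these pieces together, here is a minimal rclpy sketch of a mission node that steps through waypoints and requests image captures. The topic names, message choices, and timing are assumptions for illustration, not the project's actual interfaces.

```python
# Minimal ROS2 mission-step sketch (topic names are hypothetical).
import rclpy
from rclpy.node import Node
from geometry_msgs.msg import PoseStamped
from std_msgs.msg import Empty


class MissionNode(Node):
    def __init__(self, waypoints):
        super().__init__('mission_node')
        self.waypoints = list(waypoints)
        # Hypothetical topics: a setpoint consumed by the PX4 bridge and a
        # shutter trigger consumed by a camera node.
        self.setpoint_pub = self.create_publisher(PoseStamped, '/mission/setpoint', 10)
        self.trigger_pub = self.create_publisher(Empty, '/camera/trigger', 10)
        self.timer = self.create_timer(2.0, self.step)

    def step(self):
        if not self.waypoints:
            self.get_logger().info('Mission complete.')
            self.timer.cancel()
            return
        x, y, z = self.waypoints.pop(0)
        msg = PoseStamped()
        msg.header.frame_id = 'map'
        msg.pose.position.x, msg.pose.position.y, msg.pose.position.z = x, y, z
        self.setpoint_pub.publish(msg)     # command the next viewpoint
        self.trigger_pub.publish(Empty())  # request an image at this pose
        # (a real mission would wait for arrival before triggering the camera)


def main():
    rclpy.init()
    rclpy.spin(MissionNode([(0.0, 0.0, 5.0), (5.0, 0.0, 5.0)]))


if __name__ == '__main__':
    main()
```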
The terrain is generated from digital elevation data and edited in Blender to create a usable simulation world. Custom assets, such as rocks or terrain features, are added to provide enough visual texture for photogrammetry and 3D reconstruction. These environments allow me to test how an autonomous robot captures images, follows scan patterns, and gathers data for reconstruction.
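The terrain step can be sketched with Blender's Python API: subdivide a grid and displace it with a DEM-derived heightmap. The path and values below are placeholders, and in practice most of the editing is done interactively in Blender.

```python
# Blender (bpy) sketch: build terrain by displacing a grid with a DEM heightmap.
# Run inside Blender; the heightmap path and values are placeholders.
import bpy

# Dense grid to receive the displacement.
bpy.ops.mesh.primitive_grid_add(x_subdivisions=256, y_subdivisions=256, size=100.0)
terrain = bpy.context.active_object

# Load the DEM exported as a grayscale heightmap image (hypothetical path).
img = bpy.data.images.load('/path/to/dem_heightmap.png')
tex = bpy.data.textures.new('dem_tex', type='IMAGE')
tex.image = img

# Displace modifier pushes vertices along their normals by pixel intensity.
mod = terrain.modifiers.new('dem_displace', type='DISPLACE')
mod.texture = tex
mod.strength = 15.0  # vertical exaggeration; tune to the DEM's value range
bpy.ops.object.modifier_apply(modifier=mod.name)
```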
Ubuntu - OS
Gazebo - Physics and virtual world
ROS2 - Middleware and custom nodes via Python
PX4 - Vehicle autonomy and flight controller
Blender - Terrain model generation and mesh editing
Metashape / COLMAP - Photogrammetry and structure-from-motion
3D Gaussian Splatting (3DGS) - Novel view synthesis
CloudCompare - Point cloud comparison and geometric error analysis
The workflow begins by creating a virtual environment containing a terrain and target object. A robotic vehicle then captures images from different viewpoints using predefined or adaptive scan patterns. These images are processed through photogrammetry and Structure-from-Motion pipelines to generate sparse and dense reconstructions.
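A predefined scan pattern can be as simple as a ring of poses around the target. The sketch below generates such an orbit with NumPy; the radius, height, and view count are arbitrary placeholders.

```python
# Sketch: generate an orbital scan pattern of camera viewpoints around a target.
import numpy as np

def orbit_viewpoints(center, radius=8.0, height=4.0, n_views=12):
    """Return (position, yaw) pairs on a ring, each yawed toward the target."""
    views = []
    for theta in np.linspace(0.0, 2.0 * np.pi, n_views, endpoint=False):
        pos = np.array([center[0] + radius * np.cos(theta),
                        center[1] + radius * np.sin(theta),
                        center[2] + height])
        yaw = np.arctan2(center[1] - pos[1], center[0] - pos[0])  # face the target
        views.append((pos, yaw))
    return views

for pos, yaw in orbit_viewpoints(center=(0.0, 0.0, 0.0)):
    print(pos.round(2), round(float(yaw), 2))
```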
After reconstruction, the generated model is compared against a known ground truth. This makes it possible to evaluate the quality of the reconstruction using geometric metrics instead of relying only on visual appearance.
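The project uses CloudCompare for this comparison; as a rough Python equivalent, assuming Open3D, placeholder file paths, and clouds already registered in the same frame, the mean cloud-to-cloud error can be computed like this:

```python
# Sketch: mean cloud-to-cloud error against ground truth (Open3D, placeholder
# paths). Assumes both clouds are already aligned in the same frame.
import numpy as np
import open3d as o3d

recon = o3d.io.read_point_cloud('recon.ply')      # reconstructed dense cloud
gt = o3d.io.read_point_cloud('ground_truth.ply')  # known ground-truth cloud

# Nearest-neighbor distance from each reconstructed point to the ground truth.
dists = np.asarray(recon.compute_point_cloud_distance(gt))
print(f'mean C2C error: {dists.mean():.4f} (model units)')
```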
The general workflow is:
Build or import a simulation environment.
Add a target object with enough geometric and visual detail.
Run autonomous image-capture missions in Gazebo/PX4.
Process captured images using COLMAP, Metashape, or 3D Gaussian Splatting (a COLMAP sketch follows this list).
Compare the reconstructed output against ground truth.
Use the results to improve the mission strategy, scan pattern, or viewpoint planner.
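Step 4 with COLMAP reduces to a feature-extraction, matching, and mapping sequence. Below is a minimal sketch that drives the COLMAP command-line tools from Python with default options and placeholder paths; the dense reconstruction and 3DGS stages are omitted.

```python
# Sketch: run COLMAP's sparse SfM pipeline from Python (placeholder paths).
import os
import subprocess

db, images, sparse = 'work/database.db', 'work/images', 'work/sparse'
os.makedirs(sparse, exist_ok=True)

# Detect and describe features in every captured image.
subprocess.run(['colmap', 'feature_extractor',
                '--database_path', db, '--image_path', images], check=True)

# Match features exhaustively across all image pairs (fine for small sets).
subprocess.run(['colmap', 'exhaustive_matcher', '--database_path', db], check=True)

# Incremental mapping: recover camera poses and the sparse point cloud.
subprocess.run(['colmap', 'mapper',
                '--database_path', db, '--image_path', images,
                '--output_path', sparse], check=True)
```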
Status: Submitted / under review
Authors: Carlos Torre, Kanav Prashar, Bharath Vedantha Desikan, Rodney Staggers Jr., Jnaneshwar Das
Lab: DREAMS Laboratory, Arizona State University
Lab Principal Investigator: Jnaneshwar Das
This paper presents a next-best-view planning pipeline for active 3D reconstruction using an RGB UAV in simulation. Instead of relying on a dense mesh, voxel map, or learned scene representation during flight, the system plans directly from an incrementally updated Structure-from-Motion sparse point cloud.
The mission begins with a structured seed image capture around the target object. After this initial reconstruction, the system enters an adaptive phase where candidate viewpoints are generated, filtered, scored, and selected based on the current state of the sparse model. The goal is to choose the next camera view that best improves the final 3D reconstruction while respecting practical constraints such as viewpoint diversity, travel distance, and redundant observations.
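The adaptive phase can be summarized as a generate-filter-score-select loop. The toy sketch below illustrates that loop's shape with a stand-in diversity/travel score; it is a self-contained paraphrase, not the submitted implementation, and every value in it is a placeholder.

```python
# Toy, self-contained sketch of the generate-filter-score-select NBV loop.
import numpy as np

rng = np.random.default_rng(0)
visited = [np.array([8.0, 0.0, 4.0])]  # last pose from the seed phase

def generate_candidates(n=64, r=8.0):
    """Sample candidate viewpoints on a hemisphere around the target."""
    theta = rng.uniform(0, 2 * np.pi, n)
    phi = rng.uniform(0.1, np.pi / 2, n)
    return np.stack([r * np.cos(theta) * np.sin(phi),
                     r * np.sin(theta) * np.sin(phi),
                     r * np.cos(phi)], axis=1)

def score(view):
    """Stand-in score: reward viewpoint diversity, penalize travel distance."""
    novelty = min(np.linalg.norm(view - v) for v in visited)
    travel = np.linalg.norm(view - visited[-1])
    return novelty - 0.2 * travel

for _ in range(10):  # image budget for the toy loop
    cands = generate_candidates()
    # Filter: drop candidates redundant with already-visited views.
    cands = [c for c in cands
             if min(np.linalg.norm(c - v) for v in visited) > 1.0]
    best = max(cands, key=score)  # greedy next-best-view selection
    visited.append(best)          # "fly" there and capture (simulated)
print(f'selected {len(visited) - 1} adaptive viewpoints')
```

In the submitted system the score is computed from the incrementally updated sparse SfM model rather than from pose geometry alone.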
System pipeline. The adaptive loop (Phase 2) iterates until the image budget or KNN convergence threshold is reached.
Stratified seed ring (Phase 1) and candidate viewpoint pool (Phase 2) in ENU frame.
The submitted work compares multiple next-best-view scoring strategies (a sketch of the co-visibility idea follows this list):
Co-visibility: prioritizes viewpoints that observe the largest portion of the existing sparse model.
Baseline-aware repair: prioritizes weakly reconstructed regions that need better triangulation support.
Hybrid heuristic: switches between scoring strategies using sparse-model statistics.
Hybrid oracle: uses ground-truth supervision to estimate the best possible switching behavior.
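To make the co-visibility idea concrete, here is a simplified sketch that counts the fraction of sparse points inside a candidate camera's viewing cone. The geometry is deliberately reduced (a circular field-of-view cone, no occlusion handling, placeholder values); it is not the paper's scoring code.

```python
# Simplified co-visibility-style score: fraction of sparse points inside a
# candidate camera's viewing cone (ignores occlusion; values are placeholders).
import numpy as np

def covisibility_score(cam_pos, target, points, half_fov_deg=35.0, max_range=20.0):
    """Fraction of sparse points visible from a camera aimed at `target`."""
    forward = target - cam_pos
    forward /= np.linalg.norm(forward)
    rays = points - cam_pos                 # camera-to-point vectors
    depths = rays @ forward                 # distance along the optical axis
    norms = np.linalg.norm(rays, axis=1)
    cos_angle = np.divide(depths, norms, out=np.zeros_like(norms), where=norms > 0)
    visible = (depths > 0) & (norms < max_range) \
              & (cos_angle > np.cos(np.radians(half_fov_deg)))
    return visible.mean()

rng = np.random.default_rng(1)
sparse = rng.normal(scale=1.5, size=(1000, 3))  # stand-in sparse model
cam = np.array([6.0, 0.0, 3.0])
print(f'co-visibility: {covisibility_score(cam, np.zeros(3), sparse):.2f}')
```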
The hybrid-oracle strategy achieved the strongest overall reconstruction fidelity, with a 59.8 mm mean cloud-to-cloud error, 80% completeness at an 80 mm threshold, and an 89% F-score at the same threshold on the simulated lunar rock reconstruction task. These results suggest that sparse Structure-from-Motion can serve as a lightweight planning state for image-budgeted digital twin construction in autonomous robotic systems.
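For reference, completeness, precision, and F-score at a threshold are conventionally computed from bidirectional nearest-neighbor distances. The sketch below uses Open3D with placeholder paths and assumes pre-aligned clouds; it shows the standard definitions, not necessarily the exact evaluation code behind the reported numbers.

```python
# Sketch: threshold-based completeness, precision, and F-score between a
# reconstruction and ground truth (Open3D, placeholder paths, aligned clouds).
import numpy as np
import open3d as o3d

recon = o3d.io.read_point_cloud('hybrid_oracle_dense.ply')  # placeholder name
gt = o3d.io.read_point_cloud('ground_truth.ply')
tau = 0.080  # 80 mm threshold, in meters

d_recon_to_gt = np.asarray(recon.compute_point_cloud_distance(gt))
d_gt_to_recon = np.asarray(gt.compute_point_cloud_distance(recon))

precision = (d_recon_to_gt < tau).mean()     # accuracy of reconstructed points
completeness = (d_gt_to_recon < tau).mean()  # coverage of the ground truth
f_score = 2 * precision * completeness / (precision + completeness)
print(f'completeness={completeness:.2%}, F-score={f_score:.2%} @ {tau*1000:.0f} mm')
```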
Overlay of Ground Truth (white) and Hybrid-Oracle dense cloud (blue)