Visuomotor policies for manipulation have demonstrated remarkable potential in modeling complex robotic behaviors. Yet minor alterations to the robot's initial configuration, as well as unseen obstacles, easily lead to out-of-distribution observations and catastrophic execution failures, unless a considerable amount of effort is spent on data collection. In this work, we introduce an effective data augmentation framework that produces both visually realistic image sequences and corresponding physically feasible action trajectories from real-world egocentric task demonstrations collected with a portable data collection device equipped with a single fisheye camera. We introduce a novel Gaussian-splatting formulation, adapted for fisheye cameras, to reconstruct the 3D scene and edit it with unseen objects. We use trajectory optimization to generate smooth, collision-free, view-rendering-friendly action trajectories and render visual observations from novel views. Comprehensive experiments in simulation and the real world show that our augmentation framework improves the success rate on a variety of manipulation tasks, both in the original scene and in augmented scenes with obstacles that require active collision avoidance.
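To make the trajectory-optimization step concrete, below is a minimal, hypothetical sketch of gradient-based trajectory optimization: a straight-line initialization between fixed endpoints is deformed to minimize a smoothness (squared second-difference) cost plus a hinge penalty for entering an inflated spherical obstacle. All names, parameters, and the toy scene are illustrative assumptions; the actual objective in this work additionally accounts for view-rendering-friendliness and real robot geometry.

```python
import numpy as np

def optimize_trajectory(start, goal, obstacle, safe_radius,
                        n_way=30, iters=3000, lr=0.02, w_obs=20.0):
    """Toy trajectory optimizer (illustrative, not the paper's exact method):
    gradient descent on a smoothness cost plus a spherical-obstacle hinge
    penalty, with the start and goal waypoints held fixed."""
    traj = np.linspace(start, goal, n_way)  # straight-line initialization
    for _ in range(iters):
        grad = np.zeros_like(traj)
        # Smoothness: gradient of the summed squared second finite difference.
        acc = traj[:-2] - 2.0 * traj[1:-1] + traj[2:]
        grad[:-2]  += 2.0 * acc
        grad[1:-1] -= 4.0 * acc
        grad[2:]   += 2.0 * acc
        # Obstacle hinge penalty: push waypoints out of the safety sphere.
        diff = traj - obstacle
        dist = np.maximum(np.linalg.norm(diff, axis=1, keepdims=True), 1e-9)
        pen = np.maximum(safe_radius - dist, 0.0)  # penetration depth
        grad -= w_obs * pen * diff / dist
        grad[0] = grad[-1] = 0.0  # keep endpoints fixed
        traj = traj - lr * grad
    return traj

# Hypothetical example: a straight line from the origin to (1, 0, 0) passes
# through a sphere near the midpoint; optimizing against an inflated safety
# radius (0.25) keeps clearance from the true obstacle radius (0.15).
start, goal = np.zeros(3), np.array([1.0, 0.0, 0.0])
obstacle = np.array([0.5, 0.05, 0.0])
traj = optimize_trajectory(start, goal, obstacle, safe_radius=0.25)
```

Optimizing against an inflated radius is a common safety-margin trick: the penalty equilibrium leaves waypoints slightly inside the inflated sphere, but still well clear of the true obstacle surface.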
Completed the task in 7/20 trials.
Completed the task in 18/20 trials.
Completed the task in 1/20 trials.
Completed the task in 20/20 trials.
How does our obstacle-augmented policy handle more challenging obstacle placements, such as this wine bottle 🍾?
Completed the task in 10/10 trials.
Completed the task in 0/10 trials.
Completed the task in 0/10 trials.