TL;DR: Making deep learning based perception safer to integrate into robotic systems through probabilistic reasoning and shared autonomy.
This paper presents SPIRIT -- a shared autonomy interface for teleoperating aerial manipulators in industrial applications. A novel feature of SPIRIT is its probabilistic software stacks, allowing users to interact with a remote robot through an explicit representation of uncertainty in deep learning (DL). Specifically, from perception modules that pervasively rely on DL nowadays, we obtain the uncertainty estimates to decide the level of autonomy, i.e., the users can transition from semi-autonomous manipulation to haptic teleoperation if the robot's perception is highly uncertain. This way, we integrate uninterpretable DL methods more safely while leveraging their state-of-the-art performance in the real world. Empirically, we provide a comparative analysis to motivate our design choices and conduct a user study to evaluate robustness gains in achieving aerial manipulation tasks. As a result, we demonstrate how probabilistic approaches are viable options for robotic manipulation problems that involve learning in the real world.
Approach
Generally speaking, the goal of our work is to advance the aerial manipulation capabilities of our robots. To this end, we ground our applied work in concrete industrial use cases and address two problems. First, we extend the mobility of inspection crawler robots. Because these robots have magnetic wheels, gaps and bumps in the pipelines create hurdles for their mobility. By performing pick-and-place tasks with an aerial manipulator, the mobility of crawler robots can be extended. The second use case is the closing of industrial flange valves, which, to the best of our knowledge, has never been demonstrated with an aerial robot.
Typically, a robot perceives its environment and then generates actions using the information from its perception system. Nowadays, perception systems rely more and more on deep learning, whose models are uninterpretable black boxes. This raises the question of how to integrate such systems into physical robots in a robust and reliable manner.
Autonomous driving is divided into levels from 0 to 5 based on how much the car can drive itself. At Level 0, there is no automation: the driver does everything. Level 5 is full self-driving: the car can go anywhere, anytime, with no human input at all. The levels in between correspond to different degrees of assistive functionality being activated.
SPIRIT brings a similar idea to robotic manipulation: the level of autonomy varies with the uncertainty estimates of the deep-learning-based perception module. If the robot's perception is confident, meaning the estimated quantities are more likely to be correct, we use the assisted functions for performance. Otherwise, we switch to pure teleoperation for robustness.
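The switching idea above can be sketched in a few lines. This is only an illustration under stated assumptions, not SPIRIT's actual decision logic: the names `Mode`, `PoseEstimate`, and `suggest_mode`, the use of the largest predictive standard deviation as the confidence measure, and the threshold value are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum

import numpy as np


class Mode(Enum):
    ASSISTED = "semi-autonomous manipulation"
    TELEOPERATION = "haptic teleoperation"


@dataclass
class PoseEstimate:
    mean: np.ndarray        # 6-vector pose estimate (hypothetical layout)
    covariance: np.ndarray  # 6x6 predictive covariance of that estimate


def suggest_mode(estimate: PoseEstimate, max_std: float = 0.05) -> Mode:
    """Suggest an autonomy level from the perception uncertainty.

    Assumption: the estimate counts as confident when the largest
    per-dimension standard deviation stays at or below `max_std`.
    """
    std = np.sqrt(np.diag(estimate.covariance))
    if float(std.max()) <= max_std:
        return Mode.ASSISTED
    return Mode.TELEOPERATION


# Example: a tight covariance suggests assistance, a wide one teleoperation.
confident = PoseEstimate(np.zeros(6), 1e-4 * np.eye(6))
uncertain = PoseEstimate(np.zeros(6), 4e-2 * np.eye(6))
print(suggest_mode(confident).value)
print(suggest_mode(uncertain).value)
```

A single scalar threshold is the simplest possible rule. A real interface might treat translation and rotation separately, or show the uncertainty to the operator and leave the decision to them.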
One of the challenges is developing a deep-learning-based perception system that can estimate its own uncertainty well. We use a depth-based approach to perception, in which manipulation targets are estimated by matching two point clouds. One idea is to partition one of the point clouds so that the matching process becomes simpler. An example is provided below.
Without partitioning. Imagine finding the transformation between the two point clouds. The more they differ, the harder the problem becomes.
With partitioning. Compared with the problem without partitioning, finding the transformation between these two point clouds might be easier.
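To make the partitioning idea concrete, here is a minimal sketch. The text does not say how the partition is computed, so this block assumes one simple possibility: each point is assigned to the nearest of a few seed locations. The function name `partition_by_nearest_seed` and the seed positions are hypothetical.

```python
import numpy as np


def partition_by_nearest_seed(points: np.ndarray, seeds: np.ndarray) -> list:
    """Split an (N, 3) point cloud into one sub-cloud per seed.

    Each point goes to the seed it is closest to (Euclidean distance).
    """
    # Pairwise distances between points and seeds, shape (N, K).
    distances = np.linalg.norm(points[:, None, :] - seeds[None, :, :], axis=2)
    labels = distances.argmin(axis=1)
    return [points[labels == k] for k in range(len(seeds))]


# Synthetic scene: two well-separated blobs standing in for two objects.
rng = np.random.default_rng(0)
blob_a = rng.normal(loc=[0.0, 0.0, 0.0], scale=0.05, size=(100, 3))
blob_b = rng.normal(loc=[1.0, 0.0, 0.0], scale=0.05, size=(80, 3))
scene = np.vstack([blob_a, blob_b])

parts = partition_by_nearest_seed(scene, np.array([[0.0, 0.0, 0.0],
                                                   [1.0, 0.0, 0.0]]))
print([len(p) for p in parts])
```

Each sub-cloud can then be matched against the target on its own, which is a smaller problem than matching against the full scene.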
For the registration itself, we use a deep learning model. The model takes two point clouds as input. The size of each point cloud depends on the depth sensor, and the point clouds can be dense. After a processing step, we feed the data into a 6D representation learning module. The learned representations are then passed to a multilayer perceptron, which outputs an element of the Lie algebra. The Lie algebra is a particular parameterization of translation and orientation, so the output dimension is six. An advantage of the Lie algebra is that we can use Gaussian distributions to model uncertainty: while parameterizations such as Euler angles and rotation matrices are constrained, the Lie algebra is a Euclidean vector space. Given this construction, we use Gaussian processes for uncertainty quantification. The uncertainty quantification methods are based on related work (reference 1 in the reading list below).
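The role of the Lie algebra can be illustrated with a short sketch: a six-dimensional vector is unconstrained, so a Gaussian can be placed on it directly, and every sample maps to a valid rigid transformation through the exponential map. This is not the model described above. The helper names (`hat`, `se3_exp`, `sample_poses`) are hypothetical, and ordering the vector as translation part first, rotation part second is an assumed convention.

```python
import numpy as np


def hat(w: np.ndarray) -> np.ndarray:
    """Skew-symmetric matrix of a 3-vector."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])


def se3_exp(xi: np.ndarray) -> np.ndarray:
    """Exponential map from a 6-vector in se(3) to a 4x4 transform in SE(3).

    Assumed ordering: xi[:3] is the translation part, xi[3:] the rotation part.
    """
    rho, phi = xi[:3], xi[3:]
    theta = np.linalg.norm(phi)
    K = hat(phi)
    if theta < 1e-8:
        # First-order approximation for very small rotations.
        R = np.eye(3) + K
        V = np.eye(3) + 0.5 * K
    else:
        a = np.sin(theta) / theta
        b = (1.0 - np.cos(theta)) / theta**2
        c = (theta - np.sin(theta)) / theta**3
        R = np.eye(3) + a * K + b * (K @ K)   # Rodrigues' formula
        V = np.eye(3) + b * K + c * (K @ K)   # couples rotation and translation
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = V @ rho
    return T


def sample_poses(mean: np.ndarray, cov: np.ndarray, n: int, seed: int = 0) -> list:
    """Draw n poses by sampling a Gaussian in se(3) and mapping to SE(3)."""
    rng = np.random.default_rng(seed)
    return [se3_exp(xi) for xi in rng.multivariate_normal(mean, cov, size=n)]


# Every sample is a valid rigid transform: the rotation block stays orthonormal.
poses = sample_poses(np.zeros(6), 0.01 * np.eye(6), n=5)
print(all(np.allclose(P[:3, :3].T @ P[:3, :3], np.eye(3)) for P in poses))
```

Sampling Euler angles or the nine entries of a rotation matrix from a Gaussian would not give this guarantee without extra projection or wrapping steps.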
Colored example
Depth example
Instances example
Training data is generated with a pipeline based on BlenderProc2, and the trained model is applied with zero-shot transfer. Some examples from this pipeline are shown above.
Real World Experiments
SPIRIT was originally developed for an industrial exhibition. Clips from the laboratory as well as videos from the live demonstrations at the exhibition are displayed below. Videos are blurred wherever necessary to preserve anonymity.
Closing of an industrial valve is demonstrated in the laboratory.
Closing of an industrial valve is demonstrated at an exhibition.
Crawler robot is moved from one pipe to another in the laboratory.
Crawler robot is moved from one pipe to another at the exhibition.
Dissemination in industry
To adhere to the double-blind review policy, this content is hidden. It will be made available once the paper is accepted.
Supplementary Materials
Some more readings
[1] "Trust Your Robots! Predictive Uncertainty Estimation of Neural Networks with Sparse Gaussian Processes." Conference on Robot Learning. PMLR, 2022.
[2] "BlenderProc2: A Procedural Pipeline for Photorealistic Rendering." Journal of Open Source Software 8.82 (2023): 4901.