Stable Dishware Pushing via Convolutional Neural Networks
Presented in IROS 2023 (Poster) [Link to Abstract] [Link to Poster] [Link to Demo]
Abstract
Pushing objects is a valuable technique for manipulating large or unwieldy objects when gripping them is not feasible. This is particularly applicable in tasks like dish clearance, where pushing wider dishes instead of grasping them is a practical choice for relocation. However, the challenge lies in performing analytical push planning without knowledge of important physical characteristics such as the friction coefficient and the center of friction. To overcome this challenge, we propose a supervised learning approach for stable planar pushing of dishware with unknown physical properties. The approach uses a convolutional neural network (CNN) to evaluate the probability of a successful push based on the depth image of the object and a planar pushing direction. The model's output, combined with the Hybrid A* algorithm, enables the planning of a push path that ensures stable relocation to a desired location. Training data is sampled from various domains, including dish object meshes, friction coefficients, and dish poses, to ensure robust performance. Experimental results show that the trained model and path planning algorithm robustly handle uncertainties in object properties, with an average relocation success rate of 82%.
Learning to Evaluate the Push
When pushing, the robot must keep the object stably in contact with the end-effector. However, pushing involves intricate physical dynamics that depend on the geometry of the object, its pressure distribution, and the friction properties of its surfaces, so accurately predicting the object's subsequent motion is challenging. To address this, we introduce a learning-based method for stably pushing dishes. We apply a convolutional neural network (CNN) to predict stable push directions for a given push contact.
The network takes a depth image of the slider and a push velocity as input. The depth image is translated so that the push contact position lies at a quarter of the image height, rotated so that the push contact heads upward, and cropped to 64x64 pixels. This image processing implicitly encodes the push contact pose in the slider depth image. Next, the push velocity is randomly sampled from 1000 pre-sampled unit velocities on the velocity sphere. The network architecture comprises a series of four convolutional layers arranged in two pairs, each pair being separated by ReLU nonlinearities. This is followed by three fully connected layers. Additionally, the network contains a separate input layer for the unit pushing velocity.
The processed features from the image input and the velocity input are merged and passed through a single fully connected layer, which estimates the push success value Q.
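The depth-image preprocessing described above can be illustrated with a short NumPy sketch. It is not the authors' code: the function name `crop_around_contact`, the nearest-neighbor sampling, the zero fill outside the image, and the reading of "a quarter height" as a quarter of the patch height measured from the bottom edge are all assumptions made for illustration.

```python
import numpy as np

def crop_around_contact(depth, contact_rc, push_dir_rc, size=64, anchor_frac=0.75):
    """Return a size x size patch in which the push direction points up.

    depth       : 2D depth image
    contact_rc  : (row, col) of the push contact in the source image
    push_dir_rc : push direction in image coordinates, as (d_row, d_col)
    anchor_frac : row fraction where the contact lands (0.75 = a quarter of the
                  height above the bottom edge; this placement is an assumption)
    """
    d = np.asarray(push_dir_rc, dtype=float)
    d = d / np.linalg.norm(d)           # unit push direction in the source image
    p = np.array([d[1], -d[0]])         # source direction that maps to patch "right"
    anchor_row = int(size * anchor_frac)
    anchor_col = size // 2

    rows, cols = np.mgrid[0:size, 0:size]
    ahead = anchor_row - rows           # pixels ahead of the contact (patch "up")
    right = cols - anchor_col           # pixels to the right of the contact

    # Inverse mapping: for every patch pixel, find the source pixel it comes from.
    src_r = np.rint(contact_rc[0] + ahead * d[0] + right * p[0]).astype(int)
    src_c = np.rint(contact_rc[1] + ahead * d[1] + right * p[1]).astype(int)

    inside = (
        (src_r >= 0) & (src_r < depth.shape[0])
        & (src_c >= 0) & (src_c < depth.shape[1])
    )
    patch = np.zeros((size, size), dtype=depth.dtype)  # zero fill outside the image
    patch[inside] = depth[src_r[inside], src_c[inside]]
    return patch
```

Because the contact always lands on the same patch pixel with the push direction pointing up, the contact pose does not have to be passed to the network as a separate input.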
Synthetic Dataset Generation
In this research, the Isaac Gym simulator was used to obtain the push results. Since this simulator can use the GPU to run multiple environments in parallel, it can generate a large amount of training data at once. As a result, a total of 500,000 data points were generated within 24 hours of simulation. A demo of the simulation can be seen in the video on the right.
Training data points were sampled independently at random from each domain distribution. For example, the friction coefficient ranges from 0.1 to 0.5. For the slider mesh distribution, since this network is developed to push dishware, 140 dish models were sampled from mesh datasets including ShapeNet and the YCB dataset. For the camera pose, we perturbed its initial pose, with 0.1 m of height error and 10 degrees of error in the z Euler angle of its orientation.
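The independent per-domain sampling can be sketched as follows. The numeric ranges come from the text, but the uniform distributions, the symmetric camera perturbation, and the names `PushSample` and `sample_domain` are assumptions made for illustration and are not taken from the authors' simulation code.

```python
import random
from dataclasses import dataclass

NUM_DISH_MESHES = 140        # number of dish models stated in the text
FRICTION_RANGE = (0.1, 0.5)  # friction coefficient range stated in the text
HEIGHT_ERROR_M = 0.1         # camera height error stated in the text
YAW_ERROR_DEG = 10.0         # camera z Euler angle error stated in the text

@dataclass
class PushSample:
    mesh_id: int             # index into the dish mesh set
    friction: float          # friction coefficient
    camera_dz_m: float       # camera height offset from its initial pose
    camera_dyaw_deg: float   # camera yaw offset from its initial pose

def sample_domain(rng: random.Random) -> PushSample:
    """Draw each domain parameter independently (uniform sampling is an assumption)."""
    return PushSample(
        mesh_id=rng.randrange(NUM_DISH_MESHES),
        friction=rng.uniform(*FRICTION_RANGE),
        camera_dz_m=rng.uniform(-HEIGHT_ERROR_M, HEIGHT_ERROR_M),
        camera_dyaw_deg=rng.uniform(-YAW_ERROR_DEG, YAW_ERROR_DEG),
    )
```

In a parallel simulator, one such sample would typically be drawn per environment before each push is executed.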
Stable Push Path Planning
The push success value estimation network can work in conjunction with the Hybrid A* algorithm when planning a stable push path. While searching through the grid search space, Hybrid A* expands its nodes according to a unit displacement and steering angle and updates their costs. Among the nodes expanded by the displacement and steering of the heading, the algorithm finds the shortest path. The trained network can find the stable ICRs (instantaneous centers of rotation), or stable push directions, by iteratively evaluating the push with different push velocity inputs. Stable push directions are defined using a push success metric function $S(s,c,v)$, given in the following equation.
The push success metric deems a given push direction successful if the corresponding network output is larger than the threshold of 0.5. Push directions for which Q = 1 are deemed stable push directions. From the stable ICRs that lead to non-holonomic pushing, it is possible to derive the maximum steering angle that ensures stable pushing. By letting the Hybrid A* algorithm expand nodes according to the stable steering angles, it is possible to derive a non-holonomic stable push path that can successfully relocate the slider.
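The equation itself does not appear in this text. A form consistent with the description, stated here as an assumption rather than the authors' exact definition, thresholds the network output $Q$ at 0.5, where $s$, $c$, and $v$ are taken to denote the slider depth image, the push contact, and the push velocity:

$$
S(s,c,v) =
\begin{cases}
1, & Q(s,c,v) > 0.5 \\
0, & \text{otherwise}
\end{cases}
$$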
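One way to picture how the thresholded network output could constrain the planner is the sketch below. It is illustrative only: `q_of_steer` is a hypothetical stand-in for evaluating the trained network at the push velocity corresponding to a steering angle, and treating the steering angle as the heading change per unit displacement is a simplifying assumption, not the kinematic model from the original work.

```python
import math
from typing import Callable, List, Optional, Sequence, Tuple

THRESHOLD = 0.5  # success threshold stated in the text

Pose = Tuple[float, float, float]  # (x, y, heading in radians)

def max_stable_steering(q_of_steer: Callable[[float], float],
                        magnitudes: Sequence[float]) -> Optional[float]:
    """Largest steering magnitude up to which both signs stay above the threshold.

    q_of_steer is a hypothetical callable standing in for the trained network.
    Returns None if even the smallest candidate is not predicted to be stable.
    """
    limit = None
    for a in sorted(magnitudes):
        if q_of_steer(a) > THRESHOLD and q_of_steer(-a) > THRESHOLD:
            limit = a
        else:
            break  # stop at the first unstable magnitude when walking outward
    return limit

def expand_node(pose: Pose, step: float, steer_limit: float,
                n_steer: int = 5) -> List[Pose]:
    """Generate successor poses using only steering angles within the stable limit."""
    if n_steer < 2:
        raise ValueError("n_steer must be at least 2")
    x, y, theta = pose
    children = []
    for k in range(n_steer):
        steer = -steer_limit + 2.0 * steer_limit * k / (n_steer - 1)
        mid_heading = theta + steer / 2.0  # midpoint heading approximates the arc
        children.append((
            x + step * math.cos(mid_heading),
            y + step * math.sin(mid_heading),
            theta + steer,
        ))
    return children
```

A Hybrid A* search would call `expand_node` for every node it pops from its open list, so each edge in the resulting path uses a steering angle the network predicts to be stable.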
Experiment and Result
A total of 500,000 simulation-based training data points from 1500 dishes were generated using the Isaac Gym simulator. We trained the network for 10 epochs, and it reached an accuracy of 91%. In the real-world experiment, five circular dishes and one elliptical dish were pushed with a Robotiq-2f gripper. For circular dishes, all pushes were successful for dishes smaller than 165 mm in diameter, while the success rate declined as dish size grew. The success rate for the elliptical dish was the lowest: while most pushes along its shortest diameter succeeded, some pushes along its largest diameter failed. Further training of the network with additional data on irregularly shaped dishes could enable more robust pushing of dishware. Moreover, the model could also be extended to stably push planar objects other than dishware, given a more generalized training dataset.
Future Work
This research has been conducted on a limited variety of dishware; thus, further object relocation experiments on a more diverse range of plates are planned. After these experiments, the intention is to compare the success rate of the learned model in real pushing scenarios with that in simulation for each pushing direction. After conducting these additional experiments, a submission to RA-L is planned.
The current model receives a depth image of only a local part of the object as input, making it challenging to capture the object's global features. According to a previous paper titled "Vision-based Stable 2D Planar Pushing of Dishware with 6-DOF Manipulator", the overall shape of the object plays a significant role in determining a stable push direction. Therefore, future research aims to extend the model to take the entire point cloud of the object as input, so that it can stably push arbitrary objects and not just plates. This research is planned to be submitted to RA-L under the title "Deep One-Step Planar Pushing of Unknown Objects".
Lastly, even when a single plate can be pushed adeptly, relocating it to the desired location becomes challenging when multiple plates are collocated, because they obstruct one another. To address this, the problem will be recast as a task planning problem. The distances between objects will be modeled as if they were springs, with the goal of extending each 'spring' to a normal length, thus widening the distance between collocated objects. This approach aims to explore task planning for multiple objects in close proximity, enabling more effective object relocation and manipulation in cluttered environments. This work is planned for submission to an ICRA 2024 workshop under the title "Spring-Inspired Fast Relocation Planning of Collocated Objects".