Robust Learning-Based Incipient Slip Detection using the PapillArray Optical Tactile Sensor for Improved Robotic Gripping
Qiang Wang Pablo Martinez Ulloa Robert Burke David Cordova Bulens Stephen J. Redmond
University College Dublin
Correspondence: Stephen J. Redmond (stephen.redmond@ucd.ie)
Declaration: We affirm that this website is ONLY affiliated with this paper.
Abstract
The ability to detect slip, particularly incipient slip, enables robotic systems to anticipate potential hazards during object gripping and take corrective measures to prevent dropping the object. Therefore, slip detection enhances the overall stability of robotic gripping. However, accurately detecting incipient slips remains a significant challenge.
We propose a novel learning-based approach to detect incipient slip using the PapillArray (Contactile, Australia) tactile sensor. The resulting model is highly effective in identifying patterns associated with incipient slip, achieving a detection success rate of 95.6% when tested with an offline dataset. Furthermore, we introduce several data augmentation methods to enhance the robustness of our model. When transferring the trained model to a robotic gripping environment distinct from where the training data was collected, our model maintained robust performance, with a success rate of 96.8%, providing timely feedback for stabilizing several practical gripping tasks.
Our method can effectively enhance the stability of robot gripping through incipient slip detection.
Q & A
Acknowledgement: The content in this section originates from our discussion and rebuttal phase with the IEEE Robotics and Automation Letters (RA-L) reviewers. Due to the 8-page limit for RA-L articles, we could not incorporate all of this detail into the paper. However, we greatly value these reviews, so we present the material here in a Q&A format for interested readers. To protect the anonymity of the reviewers, we have rephrased some of the questions in our own words while keeping their original meaning.
Q1: Why did you choose to categorize as 'incipient' versus 'other' instead of using more classes?
A1: During the initial phase of our work, we did consider dividing the outcome state into several categories, including "non-contact," "static," "incipient slip," and "gross slip." However, we ultimately decided not to adopt this classification approach, for the following reasons:
1) Relevance to our objective. In the context of our research, detecting the "incipient slip" state matters most; the other states are less significant for our objectives. Neither the "static" nor the "non-contact" state requires a specific corrective action. Furthermore, if a "gross slip" occurs, it necessarily follows an "incipient slip." Therefore, we only initiate corrective actions when we detect an "incipient slip." In addition, annotating all of these states would increase the labeling workload.
2) Reduced feature confusion. In the "static" state, all 9 pillars exhibit similar movement tendencies when in contact with objects, mirroring the behavior seen in "gross slip." Assigning these two states different labels could lead to ambiguity during learning.
3) Benefits of a binary formulation. Adopting a binary "one-vs-others" classification strategy can be beneficial [1]. This choice is based on several theoretical considerations: a. Multi-class tasks involve a greater number of categories, potentially requiring more complex models to differentiate between them. b. The decision boundaries in multi-class scenarios may be more intricate than in binary ones. c. Because multi-class tasks cover a broader range of categories, they often demand more data to capture the nuances of each category, whereas binary classification focuses on two classes and thus generally requires less data.
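The binary "incipient" versus "other" labeling described in A1 can be illustrated with a minimal sketch. This is not the annotation pipeline used in the paper; the state names are taken from the answer above, while the function name `to_binary` and the example sequence are hypothetical and chosen only for illustration.

```python
# Candidate states mentioned in A1. How the paper's dataset actually stores
# labels is not stated here, so this tuple is an assumption.
STATES = ("non-contact", "static", "incipient slip", "gross slip")


def to_binary(labels, positive="incipient slip"):
    """Collapse multi-class state labels into 1 (incipient slip) or 0 (other)."""
    unknown = set(labels) - set(STATES)
    if unknown:
        raise ValueError(f"unknown labels: {sorted(unknown)}")
    return [1 if label == positive else 0 for label in labels]


# Hypothetical per-time-step annotation of one gripping trial.
sequence = ["non-contact", "static", "static", "incipient slip", "gross slip"]
binary = to_binary(sequence)
print(binary)  # [0, 0, 0, 1, 0]
```

With this mapping, "static" and "gross slip" fall into the same "other" class, which is the point made in reason 2: the classifier is never asked to separate two states whose pillar movements look alike.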
Q2: How does the performance of your neural network architecture compare to others, such as the Multi-layer Perceptron (MLP)?
A2: Our initial strategy was in fact to use a Multi-Layer Perceptron (MLP), which is structurally simpler. We divided each full time series into small windows and let the MLP determine whether an incipient slip occurred within each window. However, the MLP produced too many false positives (FP): it had difficulty distinguishing between 'stop' and 'slip'. This may be because the MLP primarily fits static, non-sequential data and lacks the capability to capture the internal structure of a time series. More specifically, the MLP tends to classify based on the overall trend of force changes across all pillars, without dynamically considering the temporal relationships between pillars. In contrast, Recurrent Neural Networks (RNNs) are specifically designed for time series. They take past data into account when deciding the current output, and can thus dynamically capture the temporal relationships between different pillars. Therefore, for our specific scenario, RNNs demonstrated superior capabilities.
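The windowing step mentioned in A2 can be sketched as follows. This is a minimal illustration under stated assumptions, not the preprocessing used in the paper: the window length, stride, number of features per time step, and the helper names `sliding_windows` and `flatten` are all hypothetical.

```python
def sliding_windows(series, window, stride=1):
    """Split a list of per-time-step feature vectors into overlapping windows."""
    if window < 1 or stride < 1:
        raise ValueError("window and stride must be positive")
    last_start = len(series) - window
    return [series[i:i + window] for i in range(0, last_start + 1, stride)]


def flatten(window):
    """Concatenate all time steps of a window into one flat vector."""
    return [value for step in window for value in step]


# Toy series: 5 time steps with 2 made-up features each (illustrative only).
series = [[0.0, 0.1], [0.2, 0.3], [0.4, 0.5], [0.6, 0.7], [0.8, 0.9]]

windows = sliding_windows(series, window=3, stride=1)
mlp_input = flatten(windows[0])  # flat vector: the time axis is no longer explicit
rnn_input = windows[0]           # kept as (time steps, features) for step-by-step use

print(len(windows))  # 3
print(mlp_input)     # [0.0, 0.1, 0.2, 0.3, 0.4, 0.5]
```

The contrast between `mlp_input` and `rnn_input` mirrors the argument in A2: a flattened window gives the MLP every value at once with no notion of order, whereas a recurrent model would consume the window one time step at a time.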
Q3: It would be interesting to delve deeper into the connection between the weights of objects and slip detections. From an intuitive perspective, the time interval of incipient slip for heavier objects is much shorter. This poses a challenge to the suggested method in terms of latency and stop events.
A3: The point you raise is related to an experiment we initially planned for this paper but ultimately did not include. In this experiment, we positioned an object on a supporting base, applied a relatively small force to grip the object, and then promptly removed the supporting base to induce incipient slip caused by gravity. As the weight of the object increased, its downward acceleration also grew, making the tendency for incipient slip more pronounced and ultimately leading to a shorter time interval, as you suggested. However, during experimentation, we found it difficult to remove the supporting base from various objects in a consistent manner while also ensuring that the removal action itself did not trigger incipient slip, which could affect the experimental results. This was hard to control because the sliding friction between the supporting base and the objects varied with their materials. We therefore chose not to include these experiments in the paper, because we believe they could introduce unfair variables when comparing different objects. Instead, in the experiment introduced in Section IV-B-1, we endeavored to simulate how incipient slip is triggered in objects of varying weights while ensuring fair replication across different objects. To achieve this, we pushed the objects onto a fixed tabletop with different levels of acceleration and velocity. Larger accelerations were used to replicate the tendency of heavier objects to shift their balance, whereas smaller accelerations were set for lighter objects. Consequently, we can reasonably assert that we have taken this factor into consideration. In the experiment introduced in Section IV-C, the weight of the object does not have a direct, fundamental impact on the experimental results. This is because we first verified that the gripping force is sufficiently small that, without corrective interventions, the object will not even leave the tabletop, or will do so with very minimal displacement. Therefore, the weight of the object is not the root cause of the incipient slip observed in this experiment.
Reference:
[1] Allwein, E. L., Schapire, R. E., & Singer, Y. (2000). Reducing multiclass to binary: A unifying approach for margin classifiers. Journal of Machine Learning Research, 1(Dec), 113-141.