Bagging by Learning to Singulate Layers Using Interactive Perception

Lawrence Yunliang Chen, Baiyu Shi, Roy Lin, Daniel Seita, Ayah Ahmad

Richard Cheng, Thomas Kollar, David Held, Ken Goldberg


2023 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2023)

Paper

Link: arXiv:2303.16898 [cs.RO]


For details on the action primitives and the perception module for segmentation, please also see our prior work:

AutoBag: Learning to Open Plastic Bags and Insert Objects, ICRA 2023 (Paper) (Website)

Abstract

Many fabric handling and 2D deformable material tasks in homes and industry require singulating layers of material such as opening a bag or arranging garments for sewing. In contrast to methods requiring specialized sensing or end effectors, we use only visual observations with ordinary parallel jaw grippers. We propose SLIP: Singulating Layers using Interactive Perception, and apply SLIP to the task of autonomous bagging. We develop SLIP-Bagging, a bagging algorithm that manipulates a plastic or fabric bag from an unstructured state and uses SLIP to grasp the top layer of the bag to open it for object insertion. In physical experiments, a YuMi robot achieves a success rate of 67% to 81% across bags of a variety of materials, shapes, and sizes, significantly improving in success rate and generality over prior work. Experiments also suggest that SLIP can be applied to tasks such as singulating layers of folded cloth and garments.

How many layers does the robot grasp in each of the 3 examples below?

Take a guess!

Click on the videos to find out!

toss_0layer.mp4
toss_1layer.mp4
toss_2layer.mp4

Singulating Layers using Interactive Perception (SLIP)

As we see from the example above, after the robot has performed a grasp, it cannot easily determine how many layers it grasped from visual observations of a static scene. However, by moving the gripper and observing how the rest of the bag moves along with the grasped layer, the robot can infer how many layers it has grasped.

Formally, SLIP consists of 3 components: a cyclic trajectory of the robot gripper, a video classification model, and an iterative height adjustment algorithm. The robot observes the scene with an overhead camera; using the video classification model to determine how many layers it has grasped, SLIP iteratively adjusts the gripper height and retries the grasp until a single layer is grasped.
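Below is a minimal Python sketch of this perceive-adjust-retry loop. The robot/camera interface (robot.grasp, robot.release, robot.execute_cyclic_trajectory, camera.record) and the classify_layers callable are hypothetical placeholders for the real hardware interface and the video classification model, and the symmetric 1 mm height adjustment is an assumption for illustration, not the paper's exact implementation.

```python
# Minimal sketch of the SLIP loop. All interfaces below are hypothetical placeholders.
HEIGHT_STEP_M = 0.001   # adjust grasp depth by 1 mm per retry
MAX_ATTEMPTS = 10       # give up after this many grasp attempts

def slip_single_layer_grasp(robot, camera, grasp_point, start_height, classify_layers):
    """Iteratively adjust the gripper height until exactly one layer is grasped."""
    height = start_height
    for _ in range(MAX_ATTEMPTS):
        robot.grasp(grasp_point, height)                           # pinch grasp at the current depth
        video = camera.record(robot.execute_cyclic_trajectory())   # interactive-perception motion
        n_layers = classify_layers(video)                          # video classifier: 0, 1, or 2+ layers
        if n_layers == 1:
            return True                                            # single layer grasped; keep holding it
        robot.release()
        if n_layers == 0:
            height -= HEIGHT_STEP_M                                # grasped nothing: go 1 mm deeper
        else:
            height += HEIGHT_STEP_M                                # grasped too many layers: go shallower
    return False
```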

slip_3rdperson_view_2x.mp4

Example of SLIP in action. In the first two attempts, the robot barely pinched the top layer of the bag and failed to obtain a tight single-layer grasp. The video classification model recognized this, and after lowering the gripper by 1 mm in each subsequent attempt, the robot successfully grasped a single layer.

Some More Examples

A 0-layer grasp is usually easy to identify, but the difference between a 1-layer grasp and a 2-layer grasp can be subtle. Sometimes the robot needs to observe the movement of the entire bag, not just the local region around the gripper, to infer how many layers it has grasped.
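To make the role of the video classifier concrete, here is a minimal PyTorch sketch of a 3-class (0/1/2 layers) classifier that consumes the full overhead video rather than a crop around the gripper, so whole-bag motion can inform the prediction. The architecture (per-frame ResNet-18 features with temporal average pooling) is an illustrative assumption, not the model used in the paper.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet18

class LayerCountClassifier(nn.Module):
    """Toy 3-class (0/1/2 layers) video classifier: per-frame CNN features, averaged over time.
    Illustrative only; not the architecture from the paper."""
    def __init__(self, num_classes: int = 3):
        super().__init__()
        backbone = resnet18(weights=None)   # full-frame features capture whole-bag motion
        backbone.fc = nn.Identity()         # keep the 512-d feature vector
        self.backbone = backbone
        self.head = nn.Linear(512, num_classes)

    def forward(self, video: torch.Tensor) -> torch.Tensor:
        # video: (batch, time, 3, H, W) clip of the interactive-perception motion
        b, t, c, h, w = video.shape
        feats = self.backbone(video.reshape(b * t, c, h, w))   # (b*t, 512)
        feats = feats.reshape(b, t, -1).mean(dim=1)            # temporal average pooling
        return self.head(feats)                                # logits for 0 / 1 / 2 layers

# Usage sketch: classify a recorded clip from the overhead camera.
model = LayerCountClassifier()
clip = torch.rand(1, 16, 3, 224, 224)             # 16 frames of the grasp-and-move motion
n_layers = model(clip).argmax(dim=-1).item()      # predicted number of grasped layers
```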

0-Layer Grasps

 

 

1-Layer Grasps

 

 

2-Layer Grasps

 

 

SLIP-Bagging Algorithm

Using SLIP, we present an algorithm for opening deformable bags, which we call SLIP-Bagging. It manipulates a plastic or fabric bag from an unstructured state, and uses SLIP to grasp the top layer of the bag to open it for object insertion.

SLIP-Bagging algorithm. (1) The robot starts with an unstructured bag with objects on the side. (2) SLIP-Bagging then flattens the bag, and (3) uses SLIP to grasp the top layer of the bag, followed by (4) insertion and (5) bag lifting. A trial is a full success if the robot lifts the bag with all items in it.
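The following Python sketch mirrors the five stages in the caption above at a high level. All helper callables (flatten_bag, slip_single_layer_grasp, insert_objects, lift_bag) are assumed interfaces standing in for the corresponding modules, not the paper's code.

```python
def slip_bagging(robot, camera, bag, objects,
                 flatten_bag, slip_single_layer_grasp, insert_objects, lift_bag):
    """High-level SLIP-Bagging pipeline sketch; stages follow the caption above."""
    # (2) Flatten and enlarge the bag from its unstructured starting state.
    flatten_bag(robot, camera, bag)
    # (3) Use SLIP to grasp only the top layer so the bag can be held open.
    if not slip_single_layer_grasp(robot, camera, bag):
        return False                         # could not isolate a single layer
    # (4) Insert the objects through the opened bag mouth.
    insert_objects(robot, objects)
    # (5) Lift the bag; the trial is a full success if all objects stay inside.
    return lift_bag(robot, bag)
```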

Physical Experiments

We evaluate SLIP-Bagging on 8 bags across 4 categories. For each category, we evaluate one bag that was used in perception model training and one that was unseen during training.

Goal: Insert 6 rubber ducks*. 

*We thank Ryan Burgert for providing us with the rubber ducks. Please rest assured that no rubber ducks were harmed during the experiments.

Top: Training bags. Bottom: Test bags.

Below, we show videos for each of the test bags.

Test_SoftPlastic_2x.mp4

Soft plastic bag

ShengKeePlastic_2x.mp4

Stiff plastic bag

Test_Drawstring_2x.mp4

Mesh drawstring bag

Test_Handbag_2x.mp4

Fabric handbag

Some Success Videos on Training Bags

Train_SoftPlastic_10x.mp4

Soft plastic bag (Train)

Cactus_bag_success_10x.mp4

Fabric handbag (Train)

Failure Modes

failure_layers_rolled_inside_16x.mp4

Failure to successfully grasp a single layer of the bag due to the two layers being rolled underneath the bag.

failure_slip_1layer_grasp_16x.mp4

Bag slips out of the gripper after grasping a single layer.

failure_knock_handle_16x.mp4

Robot hand hits the bag handles during insertion and does not put the objects inside the bag.

Hyperparameter Values

There are 2 hyperparameters in the SLIP-Bagging algorithm: p_small and p_large. If the bag area is smaller than a p_small fraction of the maximum area of the bag when fully flattened, SLIP-Bagging chooses a Shake action. If the bag area is between the p_small and p_large fractions of the maximum bag area, SLIP-Bagging chooses a Dilate/Fling action. If the bag area is larger than a p_large fraction of the maximum bag area, it is considered large enough.

For thin plastic bags, we choose p_small = 0.35 and p_large = 0.75. For drawstring bags, we choose p_small = 0.6 and p_large = 0.85. For thick plastic bags and fabric bags, we choose p_small = 0.7 and p_large = 0.85. This is because for thin plastic bags, shaking has a limited effect on straightening the bag, as the material is very soft and tends to wrinkle, so more dilation is needed. It is also more challenging to achieve a very flattened state. For thicker materials, shaking is efficient and effective, so we select a higher threshold. The stiffness of drawstring bags falls in between. A higher value of p_large means being more conservative about ensuring the bag is fully flattened but increases the number of actions, and we find 80-85% to be a good threshold in general.
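As a concrete reading of these thresholds, the snippet below selects the next flattening action from the current bag-area fraction. The (p_small, p_large) values come from the text above; the function and category names are illustrative.

```python
# (p_small, p_large) area-fraction thresholds per bag category, from the text above.
THRESHOLDS = {
    "thin_plastic":  (0.35, 0.75),
    "drawstring":    (0.60, 0.85),
    "thick_plastic": (0.70, 0.85),
    "fabric":        (0.70, 0.85),
}

def choose_flattening_action(bag_area: float, max_flat_area: float, category: str) -> str:
    """Pick the next action based on how flattened the bag currently is."""
    p_small, p_large = THRESHOLDS[category]
    fraction = bag_area / max_flat_area
    if fraction < p_small:
        return "shake"            # very crumpled: shake to straighten the bag
    if fraction < p_large:
        return "dilate_or_fling"  # partially flattened: enlarge the visible bag area
    return "done"                 # large enough: proceed to SLIP grasping and insertion

# Example: a thin plastic bag currently covering half of its fully flattened area.
print(choose_flattening_action(0.05, 0.10, "thin_plastic"))  # -> dilate_or_fling
```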

SLIP for Grasping Fabrics and Garments

We test SLIP on other materials to evaluate its applicability to general single-layer grasping tasks. We consider 3 deformable objects: a blue piece of cloth folded twice into a square, a white dress, and a red hat with multiple layers. The goal is to grasp only the top layer of each object.

The robot pins the item down at the region indicated by the green point and attempts to grasp a single layer at the region marked by the yellow star.

0-Layer grasp

1-Layer grasp

2-Layer grasp

cloth_5x.mp4

Folded Cloth

dress_5x.mp4

Dress

hat_5x.mp4

Hat

Acknowledgements

This research was performed at the AUTOLAB at UC Berkeley in affiliation with the Berkeley AI Research (BAIR) Lab, and the CITRIS "People and Robots" (CPAR) Initiative. The authors were supported in part by donations from Toyota Research Institute and equipment grants from NVIDIA. Lawrence Yunliang Chen is supported by the National Science Foundation (NSF) Graduate Research Fellowship Program under Grant No. 2146752. Daniel Seita and David Held are supported by NSF CAREER grant IIS-2046491. We thank Ryan Burgert for providing the rubber ducks for our experiments and Kaushik Shivakumar for giving us valuable feedback.