Deep Transfer Learning of Pick Points on Fabric for Robot Bed-Making

By Daniel Seita et al. If you have questions on how to use the code or about the project in general, email me at seita@berkeley.edu.

Updates:

  • 05/31/2019: The paper has been submitted to ISRR 2019.
  • 09/16/2019: The paper has been accepted and we have submitted the final version.

Here, you can find the paper, data, and videos.

This research was conducted at the AUTOLAB at UC Berkeley (http://autolab.berkeley.edu/).

BibTex:

@inproceedings{seita_bedmake_2019,
    author = {Daniel Seita and Nawid Jamali and Michael Laskey and Ajay Kumar Tanwani and Ron Berenstein and Prakash Baskaran and Soshi Iba and John Canny and Ken Goldberg},
    title = {{Deep Transfer Learning of Pick Points on Fabric for Robot Bed-Making}},
    booktitle = {International Symposium on Robotics Research (ISRR)},
    year = {2019}
}

Code:

Data:

  • Here is the data you can use for training the grasp network: https://drive.google.com/open?id=1l5Aup1AnMNPzlcRVYZu9GYjBOsD9D-z2
  • And here is the data you can use for training the success network: https://drive.google.com/open?id=1D2OUHKOY26TetsBodmtMhAxCKT65ZDU6 [Note: we originally emphasized a 'success / transition network' in this research, but we later de-emphasized it in favor of a better understanding of the grasp (or pick point) network alone. We leave the data here for completeness. The transition network achieved roughly 99% accuracy, so it was nearly perfect, but simpler alternatives exist (e.g., checking whether a marker is covered or not), which is why we de-emphasized the discussion in subsequent research.]

After unzipping the data, you should see the following files for the grasp network and then the success network:


```
$ ls -lh cache_combo_v03
total 11G
-rw-rw-r-- 1 nobody nogroup 1.1G Sep 11 17:18 grasp_list_of_dicts_nodaug_cv_0_len_210.pkl
-rw-rw-r-- 1 nobody nogroup 1.1G Sep 11 17:18 grasp_list_of_dicts_nodaug_cv_1_len_210.pkl
-rw-rw-r-- 1 nobody nogroup 1.1G Sep 11 17:19 grasp_list_of_dicts_nodaug_cv_2_len_210.pkl
-rw-rw-r-- 1 nobody nogroup 1.1G Sep 11 17:19 grasp_list_of_dicts_nodaug_cv_3_len_210.pkl
-rw-rw-r-- 1 nobody nogroup 1.1G Sep 11 17:19 grasp_list_of_dicts_nodaug_cv_4_len_210.pkl
-rw-rw-r-- 1 nobody nogroup 1.1G Sep 11 17:20 grasp_list_of_dicts_nodaug_cv_5_len_210.pkl
-rw-rw-r-- 1 nobody nogroup 1.1G Sep 11 17:20 grasp_list_of_dicts_nodaug_cv_6_len_210.pkl
-rw-rw-r-- 1 nobody nogroup 1.1G Sep 11 17:21 grasp_list_of_dicts_nodaug_cv_7_len_209.pkl
-rw-rw-r-- 1 nobody nogroup 1.1G Sep 11 17:21 grasp_list_of_dicts_nodaug_cv_8_len_209.pkl
-rw-rw-r-- 1 nobody nogroup 1.1G Sep 11 17:22 grasp_list_of_dicts_nodaug_cv_9_len_209.pkl

$ ls -lh cache_combo_v03_success/
total 6.1G
-rw-rw-r-- 1 nobody nogroup 624M Sep 11 19:17 success_list_of_dicts_nodaug_cv_0_len_115.pkl
-rw-rw-r-- 1 nobody nogroup 629M Sep 11 19:17 success_list_of_dicts_nodaug_cv_1_len_114.pkl
-rw-rw-r-- 1 nobody nogroup 607M Sep 11 19:17 success_list_of_dicts_nodaug_cv_2_len_114.pkl
-rw-rw-r-- 1 nobody nogroup 621M Sep 11 19:18 success_list_of_dicts_nodaug_cv_3_len_114.pkl
-rw-rw-r-- 1 nobody nogroup 618M Sep 11 19:18 success_list_of_dicts_nodaug_cv_4_len_114.pkl
-rw-rw-r-- 1 nobody nogroup 616M Sep 11 19:18 success_list_of_dicts_nodaug_cv_5_len_114.pkl
-rw-rw-r-- 1 nobody nogroup 630M Sep 11 19:19 success_list_of_dicts_nodaug_cv_6_len_114.pkl
-rw-rw-r-- 1 nobody nogroup 615M Sep 11 19:19 success_list_of_dicts_nodaug_cv_7_len_114.pkl
-rw-rw-r-- 1 nobody nogroup 611M Sep 11 19:20 success_list_of_dicts_nodaug_cv_8_len_114.pkl
-rw-rw-r-- 1 nobody nogroup 627M Sep 11 19:20 success_list_of_dicts_nodaug_cv_9_len_114.pkl
```


The data is split into 10 pickle files per network. Each is a standard pickle file (it must be loaded with Python 2.7, not Python 3, due to code dependencies) containing a standard Python list whose length matches the `len` value in the file name. The data has already been shuffled and split into cross-validation folds, which is what `cv` refers to in the file names. During training, the data still needs to be shuffled further for minibatches. Each data point in the lists is a dictionary with `c_img` and `d_img` keys, representing the RGB and depth images respectively; the depth image is the one we use. The labels are also stored in these dictionaries.
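For reference, here is a minimal sketch (Python 2.7, per the dependency note above) of how one fold could be loaded and inspected. The directory and file names come from the listing above; anything beyond the documented `c_img` and `d_img` keys is an assumption, so inspect the keys yourself to locate the labels.

```python
# Minimal sketch, assuming Python 2.7 and the file layout shown above.
import pickle

path = 'cache_combo_v03/grasp_list_of_dicts_nodaug_cv_0_len_210.pkl'
with open(path, 'rb') as f:
    fold = pickle.load(f)  # a standard Python list of dictionaries

print(len(fold))              # should match the `len` in the file name (210)
datum = fold[0]
print(sorted(datum.keys()))   # inspect this to find the label keys
c_img = datum['c_img']        # RGB image
d_img = datum['d_img']        # depth image (the input the grasp network uses)
```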

Coverage Results:

VIDEO SUBMISSION

The video below is the main video we submitted.

OTHER VIDEOS

This shows a clip of a successful rollout with the teal blanket. It was taken with an iPhone, so it's a bit shaky.

This shows one of our earlier failure cases, caused by incorrectly set offsets. It led us to adjust the gripper height offset as needed. In the future, we will also use a soft pad instead of a hard bed-frame top.

This clip shows why the analytic baseline often achieves good coverage: it does not grasp a corner, but instead grasps a point such that the resulting pull still produces high coverage.