Multi-task robot data for dual-arm fine manipulation
Heecheol Kim, Yoshiyuki Ohmura, Yasuo Kuniyoshi
In IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2024
In the field of robotic manipulation, deep imitation learning is recognized as a promising approach for acquiring manipulation skills. Additionally, learning from diverse robot datasets is considered a viable method for achieving versatility and adaptability. In such research, robots have achieved generality across multiple objects by learning a variety of tasks. However, such multi-task robot datasets have mainly focused on relatively imprecise single-arm tasks and have not addressed the fine-grained object manipulation that robots are expected to perform in the real world. In this study, we introduce a dataset for diverse object manipulation that includes dual-arm tasks and/or tasks that require fine manipulation. We generated a dataset containing 224k episodes (150 hours, 1,104 language instructions) that includes dual-arm fine tasks, such as bowl-moving, pencil-case opening, and banana-peeling. This dataset is publicly available. Additionally, this dataset includes visual attention signals, dual-action labels that separate actions into robust reaching trajectories and precise interactions with objects, and language instructions, all aimed at achieving robust and precise object manipulation. We applied the dataset to our Dual-Action and Attention model, which we designed for fine-grained dual-arm manipulation tasks and which is robust to covariate shift. We tested the model in over 7k trials of real robot manipulation tasks, which demonstrated its capability to perform fine manipulation.
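To make the annotation structure concrete, below is a minimal sketch of how one episode with language instructions, attention signals, and dual-action labels might be represented. All type, field, and label names (Episode, Step, action_label, "global"/"local") are hypothetical illustrations, not the actual released file format.

```python
from dataclasses import dataclass
from typing import List, Tuple
import numpy as np

# Hypothetical per-timestep record; field names are illustrative only.
@dataclass
class Step:
    image: np.ndarray            # RGB camera frame, e.g. shape (H, W, 3)
    gaze: np.ndarray             # visual attention signal: 2D gaze point in the image
    left_arm_action: np.ndarray  # commanded left-arm target (pose or joint values)
    right_arm_action: np.ndarray # commanded right-arm target (pose or joint values)
    action_label: str            # dual-action label: "global" (reaching) or "local" (precise interaction)

# Hypothetical episode record pairing a language instruction with its timesteps.
@dataclass
class Episode:
    instruction: str             # e.g. "open the pencil case"
    steps: List[Step]

def split_by_action_label(episode: Episode) -> Tuple[List[Step], List[Step]]:
    """Separate a demonstration into reaching and fine-interaction segments,
    mirroring the dual-action decomposition described in the abstract."""
    reaching = [s for s in episode.steps if s.action_label == "global"]
    fine = [s for s in episode.steps if s.action_label == "local"]
    return reaching, fine
```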