Starred authors (*) are co-first authors and contributed equally.

Integrated Learning of Language and Robot Motion

Abstract

    Interaction using natural language is important in collaborative tasks between humans and robots. With the development of large language models (LLMs), research on robot motion planning and environment recognition has been actively conducted, and the realization of human-collaborative robots is anticipated. While LLMs are an important tool for connecting traditional robotics technology with the real world, in many previous studies the LLM and the robot motion controller have been separated, so the model's predictions may contradict reality. This research aims to link a language model with a robot motion generation model and to establish a robot control model that can interactively respond to unknown situations from a small amount of paired motion-language data [1,2,3].
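    As a rough illustration of the direction in [2], the sketch below couples a description autoencoder and an action autoencoder through a shared latent space, so a small set of paired samples binds the two modalities. All dimensions, layer choices, and the PyTorch framing are illustrative assumptions, not the published model.

        # Hedged sketch: two recurrent autoencoders whose latents are pulled
        # together on paired data, enabling description<->action translation.
        import torch
        import torch.nn as nn

        class SeqAE(nn.Module):
            def __init__(self, dim, latent=32, hidden=64):
                super().__init__()
                self.enc = nn.LSTM(dim, hidden, batch_first=True)
                self.to_latent = nn.Linear(hidden, latent)
                self.from_latent = nn.Linear(latent, hidden)
                self.dec = nn.LSTM(dim, hidden, batch_first=True)
                self.out = nn.Linear(hidden, dim)

            def encode(self, x):
                _, (h, _) = self.enc(x)        # final hidden state summarizes the sequence
                return self.to_latent(h[-1])

            def decode(self, z, x_in):
                # simplified teacher forcing: feed the target sequence back in
                h0 = self.from_latent(z).unsqueeze(0)
                y, _ = self.dec(x_in, (h0, torch.zeros_like(h0)))
                return self.out(y)

        desc_ae, act_ae = SeqAE(dim=16), SeqAE(dim=8)
        desc, act = torch.randn(4, 10, 16), torch.randn(4, 50, 8)  # toy paired batch
        z_d, z_a = desc_ae.encode(desc), act_ae.encode(act)
        recon_loss = (nn.functional.mse_loss(desc_ae.decode(z_d, desc), desc)
                      + nn.functional.mse_loss(act_ae.decode(z_a, act), act))
        share_loss = nn.functional.mse_loss(z_d, z_a)  # bind paired latents together
        (recon_loss + share_loss).backward()

    At inference time, decoding an action latent with the description decoder (or vice versa) would give the bidirectional translation.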

Related Publications

    1. Kazuki Hori, Kanata Suzuki, Tetsuya Ogata: Interactively Robot Action Planning with Uncertainty Analysis and Active Questioning by Large Language Model, arXiv preprint arXiv:2308.15684, 2023.
    2. Minori Toyoda*, Kanata Suzuki*, Yoshihiko Hayashi, Tetsuya Ogata: Learning Bidirectional Translation between Descriptions and Actions with Small Paired Data, IEEE Robotics and Automation Letters, vol.7, no.4, pp.10930-10937, 2022 (presented at IROS'22).
    3. Minori Toyoda, Kanata Suzuki, Hiroki Mori, Yoshihiko Hayashi, Tetsuya Ogata: Embodying pre-trained word embeddings through robot actions, IEEE Robotics and Automation Letters, vol.6, no.2, pp.4225-4232, 2021 (presented at ICRA'21).

Composition of Multiple Robot Subtasks by Designing Dynamical Systems in an RNN

Abstract

    For general-purpose motion generation in robots, selecting and switching motions appropriate to the situation is essential. In human-robot collaboration, robots can assist humans in various situations by generating motions based on external instructions. Additionally, robots operating alone should automatically select motions based on the current situation recognized from their sensors. However, as the number of situations and variations in the required robot tasks increases, it becomes difficult to design all motion patterns by hand. Recent studies on robot manipulation using deep neural networks (DNNs) have primarily focused on single tasks. We therefore investigated a DNN-based robot manipulation model that can execute long sequential dynamic tasks by performing multiple short sequential tasks at the appropriate times.

    First, we proposed a method to design a dynamical system in a multiple-timescale RNN [1,2]. Although an RNN can embed multiple motion sequences, it is not easy to represent switching between motions, because the internal states are usually learned independently for each sequence. In the proposed method, the initial and final robot postures of each subtask are designed to be common, and the RNN is trained so that its internal state can be switched depending on the input. The RNN comprises two groups of neurons with different time constants: low-level neurons that learn fast-changing dynamics and high-level neurons that learn slow-changing dynamics. Additionally, we defined a learning constraint that brings the internal states of the low-level neurons at the beginning and end of each motion sequence closer together, achieving explicit motion switching according to the instruction signals input to the model.

    In addition, we proposed a compensation method for undefined behaviors that uses a separate controller [3]. In this method, the output of the model is switched to a model-based controller that can perform stably under undefined behaviors. Appropriate controller switching requires detecting the undefined behaviors. We assume that the internal states of the RNN embed the information necessary for task execution, and we add neurons that predict the preceding motion trajectories to the middle layer of the RNN trained for the main robot task. By comparing these predictions with the actual motion trajectories, we determine whether the current robot posture falls within the training dataset distribution. The proposed method evaluates the internal dynamics without changing the weights of the original model, thereby enabling switching between defined and undefined behaviors without degrading the task performance of the learning-based controller.
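    The two-timescale design and the switching constraint can be sketched as follows. The neuron counts, time constants, and the PyTorch framing are assumptions for illustration, not the exact model in [2].

        # Hedged sketch of a two-timescale RNN cell with leaky integration:
        # each unit updates as u_t = (1 - 1/tau) * u_{t-1} + (1/tau) * input,
        # so small tau gives fast dynamics and large tau gives slow dynamics.
        import torch
        import torch.nn as nn

        class MTRNNCell(nn.Module):
            def __init__(self, n_in, n_fast=60, n_slow=20, tau_fast=2.0, tau_slow=30.0):
                super().__init__()
                n = n_fast + n_slow
                self.fc = nn.Linear(n_in + n, n)
                self.tau = torch.cat([torch.full((n_fast,), tau_fast),
                                      torch.full((n_slow,), tau_slow)])

            def forward(self, x, u):
                z = self.fc(torch.cat([x, torch.tanh(u)], dim=-1))
                return (1 - 1 / self.tau) * u + (1 / self.tau) * z

        cell = MTRNNCell(n_in=10)
        u = torch.zeros(1, 80)
        u0 = u.clone()
        seq = torch.randn(1, 40, 10)       # one toy motion sequence
        for t in range(seq.size(1)):
            u = cell(seq[:, t], u)
        # constraint in the spirit of [2]: pull the fast (low-level) states at
        # the start and end of a sequence together so subtasks can be chained
        switch_loss = nn.functional.mse_loss(u[:, :60], u0[:, :60])

    The controller-switching test of [3] can likewise be sketched as a prediction-error check; continuing from the imports above, the threshold here is an arbitrary assumption.

        # Hedged sketch: auxiliary neurons reconstruct the recent trajectory,
        # and a large reconstruction error (out of the training distribution)
        # triggers the fall-back model-based controller.
        def select_controller(predicted_traj, actual_traj, threshold=0.05):
            error = torch.mean((predicted_traj - actual_traj) ** 2).item()
            return "model_based" if error > threshold else "learning_based"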

Related Publications

    1. Kei Kase, Kanata Suzuki, Pin-Chu Yang, Hiroki Mori, Tetsuya Ogata: Put-In-Box Task Generated from Multiple Discrete Tasks by a Humanoid Robot Using Deep Learning, Proceedings of 2018 IEEE International Conference on Robotics and Automation (ICRA'18), pp.6447-6452, acceptance rate 40.6%, Brisbane, Australia, May 21-25th, 2018.
    2. Kanata Suzuki, Hiroki Mori, Tetsuya Ogata: Motion Switching with Sensory and Instruction Signals by designing Dynamical Systems using Deep Neural Network, IEEE Robotics and Automation Letters, vol.3, no.4, pp.3481-3488, 2018.
    3. Kanata Suzuki, Hiroki Mori, Tetsuya Ogata: Compensation for undefined behaviors during robot task execution by switching controllers depending on embedded dynamics in RNN, IEEE Robotics and Automation Letters, vol.6, no.2, pp.3475-3482, 2021 (presented at ICRA'21).

Learning Dual-Arm Manipulation Tasks for Flexible Objects

Abstract

    In this research, we applied a DNN framework to humanoid robots with a high degree of freedom to perform flexible-object manipulation tasks such as buttoning [1], rope knotting [2], and towel folding [3,4]. The state of a flexible object is in constant flux while the robot operates, so the control system must respond dynamically to the object's state at all times. However, it is difficult to manually describe in advance appropriate robot motions for every object state. We therefore used DNNs, which can self-organize and integrate multiple kinds of high-dimensional data. The proposed approach uses a two-phase DNN model: a deep convolutional autoencoder extracts image features and reconstructs images, and an RNN learns the dynamics of the robot task from the extracted image features and joint-angle signals. We verified the effectiveness of the proposed method by applying it to the various tasks above.
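    A minimal sketch of the two-phase model follows; the image size, layer sizes, and the 7-DOF arm are illustrative assumptions, not the configurations used in the papers.

        # Hedged sketch: phase 1 trains a convolutional autoencoder (CAE) to
        # compress camera images; phase 2 trains an RNN on [image feature,
        # joint angles] to predict the next step of the task dynamics.
        import torch
        import torch.nn as nn

        class CAE(nn.Module):
            def __init__(self, feat=20):
                super().__init__()
                self.enc = nn.Sequential(
                    nn.Conv2d(3, 16, 4, stride=2, padding=1), nn.ReLU(),   # 64 -> 32
                    nn.Conv2d(16, 32, 4, stride=2, padding=1), nn.ReLU(),  # 32 -> 16
                    nn.Flatten(), nn.Linear(32 * 16 * 16, feat))
                self.dec = nn.Sequential(
                    nn.Linear(feat, 32 * 16 * 16), nn.ReLU(),
                    nn.Unflatten(1, (32, 16, 16)),
                    nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1), nn.ReLU(),
                    nn.ConvTranspose2d(16, 3, 4, stride=2, padding=1), nn.Sigmoid())

            def forward(self, img):
                f = self.enc(img)
                return self.dec(f), f

        # phase 1: reconstruction loss on task images
        cae = CAE()
        img = torch.rand(8, 3, 64, 64)
        recon, feat = cae(img)
        phase1_loss = nn.functional.mse_loss(recon, img)

        # phase 2: next-step prediction over feature + joint-angle sequences
        rnn = nn.LSTM(20 + 7, 100, batch_first=True)   # 7-DOF arm assumed
        head = nn.Linear(100, 20 + 7)
        state = torch.randn(8, 30, 27)                 # toy sequences
        h, _ = rnn(state[:, :-1])
        phase2_loss = nn.functional.mse_loss(head(h), state[:, 1:])

    Freezing the CAE before phase 2 keeps the image features stable while the RNN learns the task dynamics.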

Related Publications

    1. Wakana Fujii, Kanata Suzuki, Tomoki Ando, Ai Tateishi, Hiroki Mori, Tetsuya Ogata: Buttoning Task with a Dual-Arm Robot: An Exploratory Study on a Marker-based Algorithmic Method and Marker-less Machine Learning Methods, Proceedings of 2022 IEEE/SICE International Symposium on System Integrations (SII'22), pp.682-689, Online, January 8-12, 2022.
    2. Kanata Suzuki*, Momomi Kanamura*, Yuki Suga, Hiroki Mori, Tetsuya Ogata: In-air Knotting of Rope using Dual-Arm Robot based on Deep Learning, Proceedings of 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS'21), pp.6724-6731, acceptance rate 45%, Online, September 27- October 1, 2021.
    3. Pin-Chu Yang, Kazuma Sasaki, Kanata Suzuki, Kei Kase, Shigeki Sugano, Tetsuya Ogata: Repeatable Folding Task by Humanoid Robot Worker using Deep Learning, IEEE Robotics and Automation Letters, vol.2, no.2, pp.397-403, 2017 (presented at ICRA'17).
    4. Kanata Suzuki, Kuniyuki Takahashi, Gordon Cheng, Tetsuya Ogata: Generation of Folding Motions for Flexible Objects by a Multi-DOF Robot Using Deep Learning, Proceedings of the 78th National Convention of IPSJ (IPSJ'16), Kanagawa, Japan, March 10-12, 2016.

Dynamic Motion Learning for Multi-DOF Flexible-Joint Robots

Abstract

    We proposed a pre-training method for DNN models to acquire the body dynamics of robots with flexible joints [1]. In human development, "motor babbling" refers to the movements that infants use to acquire a model of their own body. Such motion data are known to be effective as pre-training data for robot learning; however, a vast number of motions must be prepared. Our method uses a Stochastic Multiple Timescale RNN (SMTRNN), which predicts the variance of the learned data and sequentially learns sensory information from motor babbling. This makes it possible to avoid learning over-fitted motions. In our experiments, we performed additional task learning after pre-training with motor babbling on the OpenHRP3 robot simulator. The results showed that the task motion was learned more efficiently.
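    The variance-prediction idea can be sketched as a Gaussian negative log-likelihood objective. The network below is a plain LSTM stand-in for the SMTRNN, with assumed dimensions; only the loss structure is the point.

        # Hedged sketch: the model outputs a mean and a log-variance per output
        # dimension and is trained with a Gaussian negative log-likelihood, so
        # time steps the network is uncertain about are down-weighted in the
        # squared-error term instead of being over-fitted.
        import torch
        import torch.nn as nn

        rnn = nn.LSTM(10, 64, batch_first=True)
        mean_head = nn.Linear(64, 10)
        logvar_head = nn.Linear(64, 10)        # predicted log-variance per dim

        babbling = torch.randn(16, 100, 10)    # toy motor-babbling sequences
        h, _ = rnn(babbling[:, :-1])
        mean, logvar = mean_head(h), logvar_head(h)
        target = babbling[:, 1:]
        # NLL of N(target; mean, exp(logvar)), up to a constant:
        # 0.5 * (log sigma^2 + (x - mu)^2 / sigma^2)
        nll = 0.5 * (logvar + (target - mean) ** 2 / logvar.exp()).mean()
        nll.backward()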

Related Publications

    1. Kuniyuki Takahashi, Kanata Suzuki, Tetsuya Ogata, Hadi Tjandra, Shigeki Sugano: Efficient Motor Babbling Using Variance Predictions from a Recurrent Neural Network, In Neural Information Processing, Lecture Notes in Computer Science (LNCS), vol.9491, pp.26-33, Proceedings of the 22nd International Conference on Neural Information Processing (ICONIP'15), accepted for oral presentation, Istanbul, Turkey, November 9-12, 2015.