Research showcase: Sugano lab & Ogata lab, Waseda University

On this page, we introduce two cooking-related research projects from the Sugano and Ogata labs at Waseda University, Japan.

1. Scooping and pouring ingredients

The humanoid robot Nextage Open performs ingredient scooping and pouring tasks, using a turner or a ladle to transfer an ingredient from a pot to a bowl. The robot recognizes object characteristics and can serve even when the target ingredients are unknown (e.g. milk, grains, fish).

We focus on active perception using multimodal sensorimotor data gathered while the robot interacts with ingredients, which allows the robot to recognize their extrinsic (shape, size, colour, etc.) and intrinsic (weight, friction, viscosity, etc.) characteristics. We construct a deep neural network model that learns to recognize ingredient characteristics, acquires tool–object–action relations, and generates motions for tool selection and handling.
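To make the data flow concrete, here is a minimal NumPy sketch of the active-perception idea: while the robot probes an ingredient, each time step of multimodal sensor readings is fused into a latent vector intended to summarise the object's characteristics. All modality names, feature sizes, and the random projections are illustrative assumptions, not the actual model from [1].

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-modality feature sizes (illustrative, not from the paper).
DIMS = {"vision": 8, "force": 4, "tactile": 6}
LATENT = 5

# One random projection per modality; a trained model would learn these.
enc = {m: rng.normal(size=(d, LATENT)) for m, d in DIMS.items()}

def encode_step(obs):
    """Fuse one time step of multimodal observations into a latent vector."""
    parts = [obs[m] @ enc[m] for m in DIMS]   # per-modality embeddings
    return np.tanh(np.sum(parts, axis=0))     # fused latent in [-1, 1]

# A short "probing" interaction: T time steps of simulated sensor readings.
T = 10
trajectory = [{m: rng.normal(size=d) for m, d in DIMS.items()} for _ in range(T)]
latents = np.stack([encode_step(o) for o in trajectory])
print(latents.shape)  # one latent vector per interaction step
```

In the actual system, the sequence of latents would feed a recurrent model that predicts sensory outcomes and generates tool-selection and handling motions; the sketch only shows the fusion step.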

This work was presented in IEEE RA-L and at ICRA 2021, where it received the Best Paper Award in Cognitive Robotics [1].

[1] Namiko Saito, Tetsuya Ogata, Satoshi Funabashi, Hiroki Mori and Shigeki Sugano, "How to select and use tools?: Active Perception of Target Objects Using Multimodal Deep Learning", In the IEEE Robotics and Automation Letters (RA-L), vol. 6, no. 2, pp. 2517-2524, 2021, doi: 10.1109/LRA.2021.3062004.

2. Stir-frying scrambled egg

The humanoid robot Dry-AIREC cooks scrambled eggs using real ingredients. While the egg is heated and its state changes continuously, the robot needs to perceive the state of the egg and adjust its stirring movement in real time. It could also cook eggs with unknown ingredients. The robot could change its stirring method and direction depending on the status of the egg: at the beginning it stirs across the whole pot, and then, after the egg starts to cook, it shifts to flipping and splitting motions targeting specific areas, although we did not explicitly indicate them.

In previous work, handling objects whose state changes was found to be challenging: the sensory information is dynamic and contains both important and noisy components, and the modality that should be focused on changes over time, making it difficult to realize both perception and motion generation in real time. We propose a predictive recurrent neural network with an attention mechanism that weighs the sensor inputs, distinguishing how important and reliable each modality is, which enables quick and efficient perception and motion generation.
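The modality-weighting idea can be sketched as follows: at each recurrent step, attention logits computed from the hidden state are turned into a softmax distribution over modalities, and the input features are fused as a weighted sum before updating the hidden state. The sizes, weight matrices, and update rule below are toy assumptions for illustration, not the architecture of [2].

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy sizes (assumptions, not the paper's).
D = 4       # per-modality feature size
H = 6       # hidden state size
MODS = 3    # e.g. vision, joint torque, tactile

W_att = rng.normal(size=(H, MODS)) * 0.1   # hidden -> attention logits
W_in  = rng.normal(size=(D, H)) * 0.1      # fused input -> hidden
W_h   = rng.normal(size=(H, H)) * 0.1      # hidden -> hidden

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def step(h, feats):
    """One recurrent step: attention weighs modalities, then updates h.

    feats: (MODS, D) array, one feature vector per modality.
    Returns the new hidden state and the attention weights.
    """
    a = softmax(h @ W_att)             # importance of each modality
    x = (a[:, None] * feats).sum(0)    # attention-weighted fusion -> (D,)
    h_new = np.tanh(x @ W_in + h @ W_h)
    return h_new, a

h = np.zeros(H)
for t in range(5):                      # a short simulated sensor stream
    feats = rng.normal(size=(MODS, D))
    h, a = step(h, feats)
print(a.round(3))                       # a distribution over modalities
```

Because the weights come from a softmax over the current hidden state, which modality dominates can shift from step to step; this is the mechanism by which the model can, for example, rely on vision early on and on force feedback once the egg begins to set.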

Please check the preliminary work on arXiv [2]. Additionally, Waseda University, with Moonshot Program Goal 3, will demonstrate this work at ICRA, so please check it out at the venue!

[2] Namiko Saito, Mayu Hiramoto, Ayuna Kubo, Kanata Suzuki, Hiroshi Ito, Shigeki Sugano and Tetsuya Ogata, "Realtime Motion Generation with Active Perception Using Attention Mechanism for Cooking Robot," arXiv preprint, 2023. arxiv.org/abs/2309.14837


(Writer: Namiko Saito, 15th/Mar/2024)