Authors marked with an asterisk (*) are co-first authors and contributed equally.

Integration of Robot Motion and LLM based on Predictive Coding Approach


    Natural-language interaction is essential for flexible and generalizable robot behavior in real-world environments. With the advancement of Large Language Models (LLMs), there has been increasing interest in integrating language understanding with robot motion generation. However, many previous studies treat language models and robot motion learning frameworks as separate modules, which limits the consistency between linguistic instructions and generated actions. This separation often prevents robots from fully leveraging semantic information during motion generation.

    This research aims to tightly integrate language understanding and robot motion learning by constructing a shared latent space that connects sensorimotor representations and linguistic information. Specifically, the proposed method introduces sensorimotor attention mechanisms and language-based regression within shared latent variables, enabling mutual alignment between the motion and language modalities [1]. Because both modalities are embedded in a unified representation space, the model can generate robot actions that are consistent with natural language instructions while remaining sensitive to sensory inputs. This approach establishes a framework for more coherent, embodied, language-conditioned robot behavior, bridging the gap between LLM-based reasoning and low-level motor control.
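
The idea of a latent variable shared by language and motion can be illustrated with a toy example. The sketch below is not the architecture of [1]: it assumes two fixed linear maps in place of the learned networks, omits the sensorimotor attention entirely, and uses a hand-written vector as a stand-in for an LLM sentence embedding. All names and values (`W_LANG`, `W_MOTION`, `infer_latent`, the dimensions) are hypothetical. It shows only the predictive-coding-style step of inferring the latent by minimizing the language prediction error and then decoding an action from that same latent.

```python
# Illustrative sketch only: linear maps stand in for the learned networks in [1].

def matvec(W, v):
    """Multiply matrix W (a list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def transpose_matvec(W, v):
    """Multiply the transpose of W by vector v."""
    return [sum(W[i][j] * v[i] for i in range(len(W))) for j in range(len(W[0]))]


def infer_latent(W_lang, lang_target, latent_dim, lr=0.1, steps=200):
    """Infer the shared latent z by gradient descent on the language
    prediction error 0.5 * ||W_lang z - lang_target||^2."""
    z = [0.0] * latent_dim
    for _ in range(steps):
        # Prediction error between the regressed and the target embedding.
        error = [p - t for p, t in zip(matvec(W_lang, z), lang_target)]
        # Gradient of the squared error with respect to z.
        grad = transpose_matvec(W_lang, error)
        z = [zi - lr * g for zi, g in zip(z, grad)]
    return z


# Hypothetical fixed weights (assumptions, not values from the paper).
W_LANG = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]     # latent (2) -> language embedding (3)
W_MOTION = [[0.5, -1.0], [2.0, 0.0], [1.0, 1.0]]  # latent (2) -> joint command (3)

instruction_embedding = [0.8, -0.3, 0.5]  # stand-in for an LLM sentence embedding
z = infer_latent(W_LANG, instruction_embedding, latent_dim=2)
motion = matvec(W_MOTION, z)  # action decoded from the same latent
print([round(m, 3) for m in motion])
```

With these toy weights the inferred latent converges to approximately (0.8, -0.3), so the decoded command is close to (0.7, 1.6, 0.5). Because the language regression and the motion decoder read from one latent, changing the instruction embedding changes the generated action through that latent alone, which is the consistency property the abstract describes.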
  1. Kanata Suzuki, Tetsuya Ogata: Sensorimotor Attention and Language-based Regressions in Shared Latent Variables for Integrating Robot Motion Learning and LLM, Proceedings of 2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS'24), pp.11872-11878, acceptance rate 47.5%, Abu Dhabi, UAE, October 14-18, 2024.