RoboMorph: In-Context Meta-Learning for Robot Dynamics Modeling

Manuel Bianchi Bazzi¹, Asad Ali Shahid², Christopher Agia³, John Alora³, Marco Forgione², Dario Piga², Francesco Braghin¹, Marco Pavone³, Loris Roveda²'³

¹Politecnico di Milano, ²IDSIA, ³Stanford

Paper

Code

ABSTRACT

Deep Learning has undergone a major shift with the pervasive adoption of Transformer-based architectures, particularly in Natural Language Processing (NLP). Transformers have also been explored in other areas, including physical applications such as solving Partial Differential Equations, and computer vision. However, in challenging domains like robotics, where strongly nonlinear dynamics pose significant difficulties, Transformer-based applications remain scarce. While Transformers have been used to provide robots with knowledge about high-level tasks, few efforts have been made to learn dynamics or perform system identification.

This paper proposes a novel methodology for learning a meta-dynamical model of a high-dimensional physical system, such as the Franka robotic arm, using a Transformer-based architecture and no prior knowledge of the system's physical parameters. The objective is to predict quantities of interest (end-effector pose and joint positions) given the torque signals for each joint. Such predictions can serve as a component of Deep Model Predictive Control frameworks in robotics. The meta-model captures the correlation between torques and positions and predicts the output for the complete trajectory. This work provides empirical evidence of the efficacy of the in-context learning paradigm and suggests future improvements in learning the dynamics of robotic systems without explicit knowledge of physical parameters.

Appendices

Training Datasets

To challenge the model's capability, datasets were generated in various compositions and topologies. The dataset configurations are shown in the figure below.

Qualitative overview

Each test case corresponds to a dataset collected from 1000 robots. Model performance is evaluated in both in-distribution (ID) and slightly out-of-distribution (OOD) scenarios. Although overall performance is generally higher in the ID cases, it is notable that test_5 and test_6, both characterized by a lower master frequency, show similar R^2 results. Training on the smallest dataset, Dataset1, takes approximately one hour.
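The page reports results in terms of R^2 without defining it. The sketch below assumes the standard coefficient of determination, computed for a single output channel (for example, one joint position over time). The function name `r2_score` and the single-channel treatment are illustrative assumptions; how the authors aggregate over joints or trajectories is not stated here.

```python
from typing import Sequence


def r2_score(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination for one output channel.

    Assumption: standard definition R^2 = 1 - SS_res / SS_tot.
    This is an illustrative helper, not the authors' evaluation code.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("inputs must not be empty")

    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: error left after the model's prediction.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined for a constant target signal")
    return 1.0 - ss_res / ss_tot
```

Under this definition, a perfect prediction gives 1.0, and a model that always predicts the mean of the target gives 0.0, which makes the metric comparable across signals with different amplitudes.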

Context-prediction ratio

The context used for prediction is 20%, as shown in the figure below.
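A minimal sketch of what such a split could look like, assuming the 20% refers to the fraction of time steps at the start of each sequence that is given to the model as context, with the remainder to be predicted. The function `split_context_query` and this interpretation are assumptions for illustration and are not taken from the released code.

```python
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_context_query(
    trajectory: Sequence[T], context_fraction: float = 0.2
) -> Tuple[List[T], List[T]]:
    """Split a trajectory into a context part and a part to predict.

    Assumption: the context is the leading `context_fraction` of the
    time steps; the remaining steps form the prediction horizon.
    """
    if not 0.0 < context_fraction < 1.0:
        raise ValueError("context_fraction must lie strictly between 0 and 1")
    if len(trajectory) < 2:
        raise ValueError("trajectory needs at least two time steps")

    n_context = int(round(len(trajectory) * context_fraction))
    # Keep at least one step on each side of the split.
    n_context = max(1, min(n_context, len(trajectory) - 1))
    return list(trajectory[:n_context]), list(trajectory[n_context:])
```

For a sequence of 1000 time steps, this yields 200 context steps and 800 steps to predict.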
