Learning Language-Conditioned Robot Behavior from Offline Data and Crowd-Sourced Annotation
Suraj Nair, Eric Mitchell, Kevin Chen, Brian Ichter, Silvio Savarese, Chelsea Finn
Stanford University | Robotics at Google
Our goal is to learn language-conditioned visuomotor skills on real robots.
To do so, we:
Label highly sub-optimal offline robot data (including autonomous exploration data or replay buffers of previously trained RL agents) with crowd-sourced natural language annotations.
Learn (1) a language-conditioned reward function from the annotated data and (2) a visual dynamics model from the offline data and actions.
Perform model predictive control with the learned dynamics model and reward function to complete language-specified tasks from visual inputs.
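The planning step above can be illustrated with a minimal sketch. It assumes a random-shooting planner, which is one common way to do model predictive control and is not necessarily the planner used in this work. The functions `toy_dynamics` and `toy_reward` are hypothetical stand-ins for the learned visual dynamics model and the language-conditioned reward. They operate on a 2-D state rather than images so that the example runs on its own.

```python
import numpy as np

def toy_dynamics(state, action):
    """Hypothetical stand-in for the learned visual dynamics model."""
    return state + action

def toy_reward(state, instruction):
    """Hypothetical stand-in for the learned language-conditioned reward."""
    goals = {"move right": np.array([1.0, 0.0]),
             "move up": np.array([0.0, 1.0])}
    # Higher reward when the state is closer to the goal named by the instruction.
    return -np.linalg.norm(state - goals[instruction], axis=-1)

def plan_action(state, instruction, dynamics, reward,
                horizon=5, num_samples=256, rng=None):
    """Random-shooting MPC (an assumed planner): sample action sequences,
    roll them out through the dynamics model, score the final predicted
    states with the reward, and return the first action of the best one."""
    rng = np.random.default_rng(0) if rng is None else rng
    actions = rng.uniform(-0.2, 0.2,
                          size=(num_samples, horizon, state.shape[-1]))
    states = np.repeat(state[None, :], num_samples, axis=0)
    for t in range(horizon):
        states = dynamics(states, actions[:, t])
    scores = reward(states, instruction)
    best = int(np.argmax(scores))
    return actions[best, 0]

def run_mpc(instruction, steps=20, seed=0):
    """Replan at every step and execute only the first planned action."""
    rng = np.random.default_rng(seed)
    state = np.zeros(2)
    for _ in range(steps):
        action = plan_action(state, instruction,
                             toy_dynamics, toy_reward, rng=rng)
        state = toy_dynamics(state, action)
    return state

if __name__ == "__main__":
    print(run_mpc("move right"))
```

In a real system the state would be a camera image, the rollout would use the learned dynamics model, and the score would come from the learned reward evaluated on predicted images and the language instruction. The control loop keeps the same shape: sample, predict, score, execute the first action, replan.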