Don’t Yell at Your Robot

Physical Correction as the Collaborative Interface 

for Language Model Powered Robots

Chuye Zhang*, Yifei Simon Shao*, Harshil Parekh, Junyao Shi, Pratik Chaudhari, Vijay Kumar, Nadia Figueroa

GRASP Laboratory, University of Pennsylvania

Abstract

We present a novel approach to human-robot collaboration that uses physical interaction for real-time correction of commands parameterized by a large language model (LLM).

The robot leverages an LLM to proactively execute 6-DoF linear Dynamical System (DS) commands from a natural-language description of the scene. Unlike methods that rely on verbal or text commands, a human can physically correct the robot during motion; the correction is used to re-estimate the desired intention, which is also parameterized as a linear DS. The corrected DS can then be converted back to natural language and included in the prompt to improve future LLM interactions.

We provide proof-of-concept results in a hybrid real+sim experiment, showcasing physical interaction as a new modality for LLM-powered human-robot interfaces.
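Although the abstract does not spell out the parameterization, a linear DS command of the kind referenced above is conventionally written as follows; the stability condition shown is one common choice, not necessarily the one used in this system:

\dot{x} = A\,(x - x^{*}), \qquad x,\, x^{*} \in \mathbb{R}^{6}, \qquad A + A^{\top} \prec 0,

where x is the end-effector pose, x^{*} is the attractor selected from the LLM output, and the negative-definite condition on A guarantees convergence to x^{*}. Under this view, a physical correction amounts to re-estimating x^{*} (and optionally A) from the perturbed motion.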

System Overview

The LLM is provided with the current semantic scene description from the perception module and the previous interaction history. It outputs a semantic action that the interface manager converts to a DS action. This DS action then drives the manipulator by updating the particles. If the human physically corrects the robot, the DS action is re-estimated under uniform DS action priors and converted to a semantic correction that the LLM uses to improve subsequent interactions.
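To make this loop concrete, below is a minimal, self-contained simulation sketch. The unit-gain DS, the 3-D (position-only) state, the candidate attractors, and the least-squares re-estimation rule are illustrative assumptions, not the system's actual implementation.

import numpy as np

# Illustrative sketch: a unit-gain linear DS drives the end-effector toward the
# attractor chosen by the LLM, a constant human "push" deflects the motion, and
# the intended attractor is re-estimated under a uniform prior over candidates.

DT = 0.01  # integration step; must match the velocity reconstruction below

def follow_ds(x, x_star, human_push=None, steps=500):
    # Integrate x_dot = -(x - x_star), optionally perturbed by a human push.
    trajectory = [x.copy()]
    for _ in range(steps):
        v = -(x - x_star)
        if human_push is not None:
            v = v + human_push          # physical correction as added velocity
        x = x + DT * v
        trajectory.append(x.copy())
    return np.array(trajectory)

def reestimate_attractor(trajectory, candidates):
    # Uniform prior over candidates -> pick the attractor whose DS best explains
    # the observed velocities (smallest squared velocity residual).
    obs_vel = np.diff(trajectory, axis=0) / DT
    x = trajectory[:-1]
    residual = {name: np.mean(np.sum((obs_vel + (x - g)) ** 2, axis=1))
                for name, g in candidates.items()}
    return min(residual, key=residual.get)

# Usage: the LLM chose "sink", but the human pushes the arm toward the "stove".
candidates = {"sink": np.array([0.6, -0.3, 0.2]), "stove": np.array([0.5, 0.4, 0.3])}
x0 = np.array([0.0, 0.0, 0.5])
traj = follow_ds(x0, candidates["sink"], human_push=2.0 * (candidates["stove"] - x0))
print(reestimate_attractor(traj, candidates))   # -> "stove"

In the actual pipeline, the re-estimated attractor would then be verbalized (e.g., "the human redirected the motion toward the stove") and appended to the interaction history for the next LLM query.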


Experiment


Our aim is to test the ability of the proposed method to proactively assist a human in completing a multi-step task using physical corrections and the interaction history.

A multi-step bean-cooking task is executed by the LLM-powered robot with physical corrections from the human.