Self-Recovery Prompting

Self-Recovery Prompting: Promptable General Purpose Service Robot System with Foundation Models and Self-Recovery

Mimo Shirasaka, Tatsuya Matsushima, Soshi Tsunashima, Yuya Ikeda, Aoi Horo, So Ikoma, Chikaha Tsuji,
Hikaru Wada, Tsunekazu Omija, Dai Komukai, Yutaka Matsuo, and Yusuke Iwasawa

TRAIL Group from Matsuo-Iwasawa Lab
The University of Tokyo

https://arxiv.org/abs/2309.14425

Abstract

A general-purpose service robot (GPSR), which can execute diverse tasks in various environments, requires a system with high generalizability and adaptability to tasks and environments. In this paper, we first developed a top-level GPSR system for a worldwide competition (RoboCup@Home 2023) based on multiple foundation models, which not only generalizes across variations but also adapts through prompting of each model. By analyzing the performance of the developed system, we found three types of failure in more realistic GPSR application settings: insufficient information, incorrect plan generation, and plan execution failure. We then propose the self-recovery prompting pipeline, which explores necessary information and modifies its prompts to recover from failure. We experimentally confirm that the system with the self-recovery mechanism can accomplish tasks by resolving various failure cases.

Paper

https://arxiv.org/abs/2309.14425

Approach

Foundation Models-Based System

Foundation models, large models pre-trained on diverse datasets, enable high generalization performance in robotics, from perception to task planning from natural language. Adding text descriptions of the context to the input (called prompting), such as detailed instructions and environmental information, makes them more adaptive to various tasks and environments.

To realize a general-purpose service robot system, we leveraged multiple foundation models with high generalization and adaptability. Whisper, GPT-4, Detic, CLIP, and CLIP-Fields (a model that integrates several foundation models) make the system more generalizable and, through prompting, more adaptive. The figure below shows an example of how the foundation models can be used in our proposed system.

Failure Mode Analysis and Self-Recovery Mechanism

Ideally, GPSR can be achieved with complete information about the environment, the ability to generate correct plans (skill sequences), and the perfect execution of the skills in each plan. However, in general, these three assumptions are often violated. We analyzed issues that often occur in GPSR systems and organized the failure modes of GPSR systems into three patterns, namely, insufficient information, incorrect plan generation, and plan execution failure. Then, we propose to add a self-recovery mechanism into the system and evaluate the performance under the settings of the three failure modes.

(M1) Insufficient Information

Failure Situation: Information required for plan generation is missing (e.g., destination of navigation).

Recovery: The system draws on the common-sense knowledge of the LLM-based planner and/or extracts clues from human-robot interaction.

(M2) Incorrect Plan Generation

Failure Situation: The generated plan is incorrect, possibly due to a mistranscription of commands or a lack of reasoning performance or common sense of the planner.

Recovery: The system applies prompts to the speech recognition module, or it regenerates the plan with updated prompts.

(M3) Plan Execution Failure

Failure Situation: The robot system fails to execute skills in the environment because skill execution is imperfect.

Recovery: The system retries the execution or re-plans with updated prompts.
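
The three failure modes and their recoveries can be read as one retry loop in which every failure adds information to the prompts before the next attempt. The following is a minimal sketch of that idea, not the implementation used in the paper. All names (FailureMode, Outcome, run_with_recovery, toy_attempt), the attempt limit, and the hint strings are hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, List, Optional, Tuple


class FailureMode(Enum):
    """The three failure modes described above."""
    INSUFFICIENT_INFORMATION = "M1"
    INCORRECT_PLAN = "M2"
    EXECUTION_FAILURE = "M3"


@dataclass
class Outcome:
    """Result of one plan-and-execute attempt (hypothetical structure)."""
    success: bool
    failure: Optional[FailureMode] = None
    hint: str = ""  # text added to the prompts before the next attempt


def run_with_recovery(
    command: str,
    attempt: Callable[[str, List[str]], Outcome],
    max_attempts: int = 5,  # assumed limit, not taken from the paper
) -> Tuple[bool, List[FailureMode]]:
    """Retry a command, updating the prompt notes after every failure."""
    prompt_notes: List[str] = []
    history: List[FailureMode] = []
    for _ in range(max_attempts):
        outcome = attempt(command, prompt_notes)
        if outcome.success:
            return True, history
        history.append(outcome.failure)
        prompt_notes.append(outcome.hint)  # the "updated prompts" step
    return False, history


def toy_attempt(command: str, notes: List[str]) -> Outcome:
    """Simulated robot: fails with M1, then M3, then succeeds."""
    if not any("side table" in n for n in notes):
        # M1: the destination is unknown until a clue is obtained.
        return Outcome(False, FailureMode.INSUFFICIENT_INFORMATION,
                       "The apple is on the side table.")
    if not any("grasp" in n for n in notes):
        # M3: the first grasp fails, so the skill is retried.
        return Outcome(False, FailureMode.EXECUTION_FAILURE,
                       "The previous grasp failed; retry it.")
    return Outcome(True)


if __name__ == "__main__":
    ok, history = run_with_recovery("Bring me an apple.", toy_attempt)
    print(ok, [mode.value for mode in history])
```

In a real system, the attempt callable would wrap speech recognition, planning, and skill execution, and each recovery would choose its hint according to the detected failure mode.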

We evaluated the entire system using seven handcrafted commands, which require recoveries associated with the three failure modes. Experiments examined whether the system can recover from each failure mode by using the proposed system. The system was tested in a real-world domestic environment with the HSR (Human Support Robot) developed by Toyota Motor Corporation.

Results

For tasks that require the robot system to recover from failures, our self-recovery prompting pipeline successfully recovered from all failure cases. Examples of the three failure modes and of our system's execution with self-recovery prompting are shown below. Red rectangles indicate the failure pattern for each command. Green arrows indicate normal plan transitions, and red arrows indicate that a recovery plan has been triggered. Blue text indicates that the information for navigation is sufficient, and red text indicates that it is insufficient.

Experiment Videos

We provide videos of the robot's performance on Commands 1 to 7. Each video also notes the corresponding failure modes with concrete explanations. The videos are shown at actual speed.

cmd1_full.mp4

Command 1

Could you bring me an apple from the side table?

cmd2_full.mp4

Command 2

Hi HSR, I am starting to feel hungry so could you grab an apple from dining table and put it on my desk? I will be there in a moment.

cmd3_full.mp4

Command 3

I lost my mug so could you find it for me?

cmd4_full_2_1.mp4

Command 4

Thank you, HSR. I am getting tired. Could you prepare a fruit for me on the side table? I will have some rest at the sofa in a moment.

cmd5_short.mp4

Command 5

Could you help me find the apple that I bought the other day? Ashley might know where it is, so maybe you can ask her if she knows where it is. When you find it, please bring it to me.

cmd6_short.mp4

Command 6

Could you bring me the apple from the stair-like shelf?

cmd7_short.mp4

Command 7

Could you look for Ashley in the dining room and ask her if she wants dinner at home tonight?

Supplementary Notes Regarding the Videos (Commands 1-7)

For Commands 1 and 2, we expected Failure Mode 1 (M1) to occur because the LLM occasionally suggested locations that did not exist. After the prompt was updated as follows, this behavior no longer appeared; thus, M1 is not present in the videos.

Prompt Change

Before: “Things are organized according to categories in the house. For example, shampoo is in the bathroom, and drinks are in the refrigerator.”
After: “Things are organized according to categories in the house.”
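
To make the effect of this change concrete, the sketch below assembles a planner prompt from a context sentence, a list of locations, and the command. The two context strings are the ones quoted above; the function build_planner_prompt, the prompt layout, and the location list are assumptions made for illustration and are not the prompt format used in the paper.

```python
from typing import List

# Context sentences quoted from the prompt change above.
CONTEXT_BEFORE = (
    "Things are organized according to categories in the house. "
    "For example, shampoo is in the bathroom, and drinks are in the refrigerator."
)
CONTEXT_AFTER = "Things are organized according to categories in the house."


def build_planner_prompt(context: str, known_locations: List[str], command: str) -> str:
    """Join context, known locations, and the command into one prompt.

    The layout is hypothetical; only the context sentences come from the page.
    """
    lines = [
        context,
        "Known locations: " + ", ".join(known_locations) + ".",
        "Command: " + command,
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical location list, chosen to match places named in the commands.
    locations = ["side table", "dining table", "desk", "sofa"]
    command = "Could you bring me an apple from the side table?"
    print(build_planner_prompt(CONTEXT_AFTER, locations, command))
```

With the original context, the prompt itself names a bathroom and a refrigerator, which are not in the assumed location list; the shortened context no longer mentions them.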

Extra Video

We challenged our robot system with a task that we considered somewhat more complex, in a more realistic everyday situation, and recorded it on video. When a failure occurs, the robot recovers from it, continues, and eventually completes the order. We hope the video gives people in various fields a better, more concrete image of life with assistive robots, which may come soon.

cmdsp_full_2.mp4

Special Command

Okay HSR, you have loads of missions. To start with, I bought a bottle of tea for Ashley, so please receive it and take it to the desk and encourage Ashley. Next, take Red Bull to Robin on the sofa, talk to him to ensure he's not sleeping and hand Red Bull over. If possible, ask if anything is bothering him and please inform his answer to me.
