Navigation with Large Language Models:
Semantic Guesswork as a Heuristic for Planning

Conference on Robot Learning (CoRL) 2023
Atlanta, Georgia

Summary Video

Problem Statement

A robot equipped solely with an egocentric RGB camera is dropped into a previously unseen environment and given an open-vocabulary textual query describing the goal it must reach. How do we best leverage the semantic knowledge stored in LLMs to guide the robot in this new environment?

Main Idea

Our key insight is to elicit LLM reasoning via chain-of-thought and incorporate it as a planning heuristic, rather than naively executing the actions suggested by the LLM.
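
To make this concrete, here is a minimal sketch of the idea, not the paper's implementation: the Frontier structure, the llm_goal_score callable, and the alpha weight below are illustrative assumptions. The LLM score over candidate subgoals only biases a frontier-based search; it is never executed directly as an action.

from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Frontier:
    label: str            # open-vocabulary description, e.g. "doorway next to the kitchen"
    distance_cost: float  # path length from the robot to this frontier, from the map

def select_subgoal(frontiers: List[Frontier],
                   llm_goal_score: Callable[[str], float],
                   alpha: float = 1.0) -> Frontier:
    """Pick the frontier minimizing (distance cost - alpha * LLM score).

    The LLM only biases the search: the geometric planner still decides how to
    reach the selected frontier, so a poor LLM guess costs some extra
    exploration rather than producing an invalid action.
    """
    return min(frontiers,
               key=lambda f: f.distance_cost - alpha * llm_goal_score(f.label))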

Exploring Real-World Environments with LLMs

[Videos: LFG exploring real-world environments]

Leveraging Positive and Negative Scores

LFG scores subgoals with empirical estimates of their likelihoods, obtained by sampling an LLM multiple times with both positive and negative prompts. Each LLM response uses chain-of-thought reasoning to obtain reliable scores and interpretable insight into the decisions.
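
A rough sketch of such sampling-based scoring is given below. It assumes a generic chat-completion callable (chat), illustrative prompt wording, and a crude parse of the final answer line; the paper's exact prompts and parsing differ.

from typing import Callable, Dict, List

def empirical_scores(goal: str, subgoals: List[str],
                     chat: Callable[[str], str],
                     polarity: str, n_samples: int = 10) -> Dict[str, float]:
    """Fraction of samples in which the LLM names each subgoal as the
    `polarity` choice (e.g. "most likely" or "least likely") for this goal."""
    counts = {s: 0 for s in subgoals}
    for _ in range(n_samples):
        prompt = (f"You are looking for '{goal}'. Candidate subgoals: "
                  f"{', '.join(subgoals)}.\nThink step by step, then end with the "
                  f"single subgoal that is {polarity} to lead to the goal.")
        answer = chat(prompt)                        # chain-of-thought text
        final_line = answer.strip().splitlines()[-1].lower()
        for s in subgoals:
            if s.lower() in final_line:              # crude parse of the final answer
                counts[s] += 1
                break
    return {s: counts[s] / n_samples for s in subgoals}

# Positive and negative scores over the same subgoals:
# pos = empirical_scores(goal, subgoals, chat, "most likely")
# neg = empirical_scores(goal, subgoals, chat, "least likely")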

Negative scores push the agent away from regions unlikely to contain the goal, while positive scores pull the agent toward regions where the goal is more likely to be found.
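
The two score sets can then enter the planner's cost with opposite signs, as in the sketch below; the weights w_pos and w_neg are illustrative, not the paper's tuned values.

def frontier_cost(distance_cost: float, pos_score: float, neg_score: float,
                  w_pos: float = 1.0, w_neg: float = 1.0) -> float:
    # Lower cost is better: the positive score pulls the search toward a frontier,
    # the negative score pushes it away, and distance keeps the plan efficient.
    return distance_cost - w_pos * pos_score + w_neg * neg_score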

BibTeX

@inproceedings{shah2023lfg,
  title={Navigation with Large Language Models: Semantic Guesswork as a Heuristic for Planning},
  author={Dhruv Shah and Michael Equi and Blazej Osinski and Fei Xia and Brian Ichter and Sergey Levine},
  booktitle={7th Annual Conference on Robot Learning},
  year={2023},
  url={https://openreview.net/forum?id=PsV65r0itpo}
}