Navigation with Large Language Models:
Semantic Guesswork as a Heuristic for Planning

Conference on Robot Learning (CoRL) 2023
Atlanta, Georgia

Summary Video

Problem Statement

A robot equipped solely with an egocentric RGB camera is dropped into a previously unseen environment and given an open-vocabulary textual query describing the goal it must reach. How do we best leverage the semantic knowledge stored in LLMs to guide the robot in this new environment?

Main Idea

Our key insight is to elicit LLM reasoning via chain-of-thought and incorporate it as a planning heuristic, rather than naively executing the actions suggested by the LLM.
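
To make this concrete, here is a minimal sketch of the idea, not the paper's implementation: the Frontier structure, the llm_goal_score callable, and the alpha weight below are illustrative assumptions. The LLM score over candidate subgoals only biases a frontier-based search; it is never executed directly as an action.

from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Frontier:
    label: str            # open-vocabulary description, e.g. "doorway next to the kitchen"
    distance_cost: float  # path length from the robot to this frontier, from the map

def select_subgoal(frontiers: List[Frontier],
                   llm_goal_score: Callable[[str], float],
                   alpha: float = 1.0) -> Frontier:
    """Pick the frontier minimizing (distance cost - alpha * LLM score).

    The LLM only biases the search: the geometric planner still decides how to
    reach the selected frontier, so a poor LLM guess costs some extra
    exploration rather than producing an invalid action.
    """
    return min(frontiers,
               key=lambda f: f.distance_cost - alpha * llm_goal_score(f.label))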

Exploring Real-World Environments with LLMs

[Videos: LFG exploring real-world environments]

Leveraging Positive and Negative Scores

LFG scores subgoals with empirical estimates of their likelihoods, obtained by sampling an LLM multiple times with both positive and negative prompts. Each LLM response uses chain-of-thought reasoning to obtain reliable scores and interpretable insight into the decisions.
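
A rough sketch of such sampling-based scoring is given below. It assumes a generic chat-completion callable (chat), illustrative prompt wording, and a crude parse of the final answer line; the paper's exact prompts and parsing differ.

from typing import Callable, Dict, List

def empirical_scores(goal: str, subgoals: List[str],
                     chat: Callable[[str], str],
                     polarity: str, n_samples: int = 10) -> Dict[str, float]:
    """Fraction of samples in which the LLM names each subgoal as the
    `polarity` choice (e.g. "most likely" or "least likely") for this goal."""
    counts = {s: 0 for s in subgoals}
    for _ in range(n_samples):
        prompt = (f"You are looking for '{goal}'. Candidate subgoals: "
                  f"{', '.join(subgoals)}.\nThink step by step, then end with the "
                  f"single subgoal that is {polarity} to lead to the goal.")
        answer = chat(prompt)                        # chain-of-thought text
        final_line = answer.strip().splitlines()[-1].lower()
        for s in subgoals:
            if s.lower() in final_line:              # crude parse of the final answer
                counts[s] += 1
                break
    return {s: counts[s] / n_samples for s in subgoals}

# Positive and negative scores over the same subgoals:
# pos = empirical_scores(goal, subgoals, chat, "most likely")
# neg = empirical_scores(goal, subgoals, chat, "least likely")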

Negative scores push the agent away from regions unlikely to contain the goal, while positive scores pull the agent toward regions where the goal is more likely to be found.
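
The two score sets can then enter the planner's cost with opposite signs, as in the sketch below; the weights w_pos and w_neg are illustrative, not the paper's tuned values.

def frontier_cost(distance_cost: float, pos_score: float, neg_score: float,
                  w_pos: float = 1.0, w_neg: float = 1.0) -> float:
    # Lower cost is better: the positive score pulls the search toward a frontier,
    # the negative score pushes it away, and distance keeps the plan efficient.
    return distance_cost - w_pos * pos_score + w_neg * neg_score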

BibTeX

@inproceedings{shah2023lfg,
  title={Navigation with Large Language Models: Semantic Guesswork as a Heuristic for Planning},
  author={Dhruv Shah and Michael Equi and Blazej Osinski and Fei Xia and Brian Ichter and Sergey Levine},
  booktitle={7th Annual Conference on Robot Learning},
  year={2023},
  url={https://openreview.net/forum?id=PsV65r0itpo}
}