Overview
Recent NLP techniques have been applied well beyond text processing. In this paper, we propose a human-in-the-loop framework for robotic grasping in cluttered scenes, investigating a language interface to the grasping process that allows the user to intervene with natural language commands. The framework is built on a state-of-the-art grasping baseline, in which we replace the scene-graph representation with a text representation of the scene encoded by BERT. Experiments in both simulation and on a physical robot show that the proposed method outperforms conventional object-agnostic and scene-graph-based methods from the literature. In addition, we find that performance can be significantly improved with human intervention.
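As a rough illustration of the idea, a cluttered scene can be serialized into a plain-text description that a BERT-style encoder can consume directly instead of a scene graph. The template wording and the `scene_to_text` helper below are assumptions for illustration only, not the paper's actual serialization:

```python
# Hypothetical sketch: serialize detected objects and pairwise spatial
# relations into a text description of the scene. A BERT encoder could
# then embed this string in place of a scene-graph representation.

def scene_to_text(objects, relations):
    """objects: list of object names.
    relations: list of (subject, relation, object) triples."""
    parts = ["The scene contains " + ", ".join(objects) + "."]
    for subj, rel, obj in relations:
        parts.append(f"The {subj} is {rel} the {obj}.")
    return " ".join(parts)

description = scene_to_text(
    ["mug", "book", "box"],
    [("mug", "on top of", "book"), ("book", "next to", "box")],
)
print(description)
```

A natural-language intervention such as "grasp the mug first" can then be appended to the same text input, which is one reason a text representation makes a language interface straightforward.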
Method
Results
Physical Experiment
Bibtex
@inproceedings{song-etal-2022-human,
title = "Human-in-the-loop Robotic Grasping Using {BERT} Scene Representation",
author = "Song, Yaoxian and
Sun, Penglei and
Fang, Pengfei and
Yang, Linyi and
Xiao, Yanghua and
Zhang, Yue",
booktitle = "Proceedings of the 29th International Conference on Computational Linguistics",
month = oct,
year = "2022",
address = "Gyeongju, Republic of Korea",
publisher = "International Committee on Computational Linguistics",
url = "https://aclanthology.org/2022.coling-1.265",
pages = "2992--3006",
}