Learning Perceptual Concepts by Bootstrapping from Human Queries


Robots need to be able to learn concepts from their users in order to adapt their capabilities to each user's unique task. But when the robot operates on high-dimensional inputs, like images or point clouds, learning a concept directly from human labels is impractical: it requires an unrealistic amount of human effort. To address this challenge, we propose a new approach whereby the robot learns a low-dimensional variant of the concept and uses it to generate a larger data set for learning the concept in the high-dimensional space. This lets it take advantage of semantically meaningful privileged information that is only accessible at training time, like object poses and bounding boxes, and that enables richer human interaction to speed up learning. We evaluate our approach by learning prepositional concepts that describe object state or multi-object relationships, like above, near, or aligned, which are key to user specification of task goals and execution constraints for robots. Using a simulated human, we show that our approach improves sample complexity when compared to learning concepts directly in the high-dimensional space. We also demonstrate the utility of the learned concepts in motion planning tasks on a 7-DoF Franka robot.
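The two-stage pipeline described above can be sketched in a few lines. The scene representation, feature choice, and classifier below are illustrative stand-ins, not the paper's implementation (the `scene` dictionary keys and helper names are hypothetical); the point is the bootstrapping: fit a cheap concept model on the privileged low-dimensional state from a handful of human-labeled queries, then use it to pseudo-label a much larger simulated data set for the high-dimensional (point-cloud) learner.

```python
# Minimal sketch of the bootstrapping idea, assuming hypothetical scene dicts
# with keys "moving_pose", "anchor_pose", and "point_cloud".
import numpy as np
from sklearn.neural_network import MLPClassifier

def low_dim_features(scene):
    """Privileged, semantically meaningful state available only at training
    time, e.g. relative object pose (an illustrative feature choice)."""
    return np.asarray(scene["moving_pose"]) - np.asarray(scene["anchor_pose"])

def fit_low_dim_concept(human_scenes, human_labels):
    """Stage 1: learn the concept in the low-dimensional space from a few
    human-labeled yes/no queries."""
    X = np.stack([low_dim_features(s) for s in human_scenes])
    return MLPClassifier(hidden_layer_sizes=(32,), max_iter=2000).fit(X, human_labels)

def bootstrap_high_dim_dataset(low_dim_concept, simulated_scenes):
    """Stage 2: auto-label a much larger set of simulated scenes with the
    cheap low-dimensional model, producing training data for the
    high-dimensional (point-cloud) concept model."""
    X_high = [s["point_cloud"] for s in simulated_scenes]              # raw sensor input
    X_low = np.stack([low_dim_features(s) for s in simulated_scenes])  # privileged state
    y = low_dim_concept.predict(X_low)                                 # pseudo-labels
    return X_high, y
```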

3 Minute Overview

Concept Examples

Throughout our experiments, we synthesize queries in our simulator by manipulating pairs of objects: an anchor object and a moving object, where the concept describes the moving object's state relative to the anchor. We investigate nine prepositional concepts that arise in robot motion planning (a code sketch of two of these low-dimensional concept functions follows the list):

  1. above: whether the moving object is above the anchor, based on the angle the vector between the two objects makes with the world's up direction;

  2. abovebb: whether the moving object is above the anchor, based on the area overlap between their ground-plane projections;

  3. near: whether the two objects are near each other, defined by the inverse of the distance between them;

  4. upright: whether the moving object is upright with respect to the world frame;

  5. alignedhoriz: whether the objects are aligned along their horizontal dimension;

  6. alignedvert: whether the objects are aligned along their vertical dimension;

  7. forward: whether the moving object is in the forward hyperplane of the anchor;

  8. front: whether the moving object is in front of the anchor;

  9. top: whether the moving object is above the top of the anchor.

The figure above shows qualitative visualizations of our concepts, with four examples where the concept value is positive and one where it is negative.
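For concreteness, here is a minimal sketch of how two of these low-dimensional concept values could be computed from privileged object positions. The exact formulas, names, and any thresholds used to binarize the values are assumptions for illustration, not the paper's implementation.

```python
import numpy as np

def above_value(moving_pos, anchor_pos, up=np.array([0.0, 0.0, 1.0])):
    """Angle-based 'above': cosine of the angle between the anchor-to-moving
    vector and the world's up direction (close to 1.0 means directly above)."""
    d = np.asarray(moving_pos) - np.asarray(anchor_pos)
    return float(np.dot(d, up) / (np.linalg.norm(d) + 1e-9))

def near_value(moving_pos, anchor_pos):
    """Inverse-distance 'near': larger when the two objects are closer."""
    d = np.linalg.norm(np.asarray(moving_pos) - np.asarray(anchor_pos))
    return 1.0 / (d + 1e-9)

# Example: a mug 10 cm directly above a box scores high on both concepts.
print(above_value([0.0, 0.0, 0.1], [0.0, 0.0, 0.0]))  # ~1.0
print(near_value([0.0, 0.0, 0.1], [0.0, 0.0, 0.0]))   # ~10.0
```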

Using Concepts for Goal Specification

We use the learned concepts to specify goal constraints in experiments with a 7-DoF Franka Panda robot arm operating on real point cloud data. We generate goal positions for the moving object with the Cross-Entropy Method (CEM), using the concept loss as the cost function.
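Below is a minimal sketch of that goal-generation step, assuming a generic Cross-Entropy Method loop and a stand-in `concept_score` callable in place of the learned concept network evaluated on the scene's point cloud; the function names and hyperparameters are illustrative, not the paper's.

```python
import numpy as np

def cem_goal(concept_score, init_mean, init_std, n_iters=20, pop=256, elite_frac=0.1):
    """Cross-Entropy Method over candidate goal positions: sample from a
    Gaussian, keep the elites under the cost (negative concept score here),
    and refit the Gaussian to the elites."""
    mean, std = np.array(init_mean, float), np.array(init_std, float)
    n_elite = max(1, int(pop * elite_frac))
    for _ in range(n_iters):
        samples = np.random.normal(mean, std, size=(pop, mean.size))
        costs = np.array([-concept_score(s) for s in samples])  # low cost = concept satisfied
        elites = samples[np.argsort(costs)[:n_elite]]
        mean, std = elites.mean(axis=0), elites.std(axis=0) + 1e-6
    return mean

# Usage with the inverse-distance 'near' value from the sketch above, with the
# anchor at the origin (a stand-in for the learned concept network's score):
goal = cem_goal(lambda p: 1.0 / (np.linalg.norm(p) + 1e-9),
                init_mean=[0.5, 0.5, 0.5], init_std=[0.3, 0.3, 0.3])
print(goal)  # converges toward the anchor at the origin
```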

Concept above: place the mug above the cookie box

Concept near: move the yellow box near the red one

Concept upright: position the cup to be upright

Concept front: move the mug in front of the hammer

Concept top: place the mug atop the bowl