Discovering Motor Programs by Recomposing Demonstrations

This site presents results accompanying our paper "Discovering Motor Programs by Recomposing Demonstrations". In particular, it shows dynamic results, such as GIFs and videos, that cannot be viewed effectively in the paper.

Visualizing the Learnt Space of Motor Programs

The first result we present is a dynamic visualization of the embedded latent space of motor programs, as depicted in Figure 3 (left) of our paper. Each frame in the following video corresponds to one learnt primitive, and is positioned at the embedded location of its latent variable.

Press the play button in the bottom-left corner of the video to start it. We recommend zooming in on this webpage to view individual learnt primitives in the video. Scrolling across the webpage while zoomed in is also useful.

Note the similarity of learnt primitives within each of the clusters.

Visualizing Emergent Primitives

We now present dynamic visualizations of the motor primitives that emerge from our model. These primitives are visualized in an unrolled manner in Figure 3 (right) of our paper, and displayed below as GIFs.

Reaching Primitive

Reaching Primitive 2

Sliding / Pushing Primitive

Bimanual Pulling Primitive

Twisting Primitive

Returning Primitive

Executing Primitives on Real Baxter

We would like the emergent primitives to be useful on a real Baxter robot platform: they should be suitably smooth and feasible, and should correspond to the same motions executed in simulation (i.e. be unaffected by the noise of execution on a real robot platform). To this end, we execute a small set of primitives on the real robot platform, and visualize the results below as GIFs.

Left Handed Reaching Primitive

Left Handed Returning Primitive

Right Handed Returning Primitive

Left Handed Pushing Primitive

Right Handed Pushing Primitive

Right Handed Twisting Primitive

Visualizing Learnt RL Policies

In Section 4.4 of the paper, we trained policies over our motor primitives; each policy must predict a sequence of motor primitives to execute in order to solve reaching and pushing tasks. We provide videos of the policies solving each of those tasks, after having been trained by Reinforcement Learning (using Proximal Policy Optimization).
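To make the idea of a policy acting over primitives concrete, here is a minimal sketch of a rollout loop in which a high-level policy picks one primitive at a time and each primitive is unrolled into a short joint-space segment. Everything named here is hypothetical and chosen only for illustration: `high_level_policy`, `decode_primitive`, `LATENT_DIM`, `STEPS`, and the 14-value joint state are stand-ins, not the models or dimensions used in the paper.

```python
import random

LATENT_DIM = 4    # hypothetical size of a primitive's latent code
STEPS = 5         # hypothetical number of timesteps per primitive
NUM_JOINTS = 14   # assumed joint-state size (two 7-joint arms)


def high_level_policy(state, rng):
    """Stand-in policy: samples a latent code at random.

    A trained policy would condition on `state`; this placeholder ignores it.
    """
    return [rng.gauss(0.0, 1.0) for _ in range(LATENT_DIM)]


def decode_primitive(z, start):
    """Stand-in decoder: turns a latent code into a short joint-space segment.

    It linearly interpolates from `start` to a target offset determined by z.
    """
    target = [start[j] + 0.1 * z[j % LATENT_DIM] for j in range(NUM_JOINTS)]
    return [
        [s + (t - s) * (k + 1) / STEPS for s, t in zip(start, target)]
        for k in range(STEPS)
    ]


def rollout(initial_state, num_primitives, seed=0):
    """Chain primitives: each one starts where the previous one ended."""
    rng = random.Random(seed)
    state = list(initial_state)
    trajectory = []
    for _ in range(num_primitives):
        z = high_level_policy(state, rng)
        segment = decode_primitive(z, state)
        trajectory.extend(segment)
        state = segment[-1]
    return trajectory


traj = rollout([0.0] * NUM_JOINTS, num_primitives=3)
print(len(traj), len(traj[0]))
```

The point of the sketch is the structure: the policy makes one decision per primitive rather than one per timestep, and the resulting trajectory is the concatenation of the unrolled primitives.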

For the reaching task, the goal for the robot is to get its end-effector within an epsilon-ball of the green dot. The value of epsilon used in our case is 0.05 m. Note that the model learns to solve the task using the learnt motor primitives, each of which is a continuous motion in itself.
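The epsilon-ball success criterion can be sketched as a simple distance check. This is an illustration under assumptions rather than the paper's evaluation code: it assumes Euclidean distance in 3D, treats the boundary as inclusive, and the function name `within_epsilon_ball` and the example coordinates are hypothetical.

```python
import math

EPSILON = 0.05  # metres, the value quoted for the reaching task


def within_epsilon_ball(position, goal, epsilon=EPSILON):
    """Return True if `position` is within `epsilon` of `goal`.

    Assumes Euclidean distance and an inclusive boundary.
    """
    return math.dist(position, goal) <= epsilon


goal = (0.60, 0.20, 0.10)                                # hypothetical goal location
print(within_epsilon_ball((0.62, 0.21, 0.10), goal))     # about 0.022 m away
print(within_epsilon_ball((0.70, 0.20, 0.10), goal))     # about 0.10 m away
```

The first call reports success and the second does not, since only the first point lies inside the 0.05 m ball around the goal.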

On the pushing task, the objective is to push the red block to within a similar epsilon-ball of the green dot goal location. The PPO baseline learns to push the block towards the goal using its arm in a non-prehensile manner, while our method instead learns to use its gripper to push the block towards the goal. The solution our approach converges to is biased towards actions similar to those observed in the training set, and is thus much more natural and desirable.

Visualizing Compositions of Primitives

We have displayed individual trajectories being executed above. Below, we display some compositions of primitives being executed to generate trajectories that correspond to their demonstrations. These visualizations show how combinations of primitives appear when executed. We show a collection of these visualizations both on the real robot and in simulation. For each visualization, we show an original demonstration on the left, and the corresponding predicted composition of primitives that "recomposes" the demonstration on the right.

Real Robot Results

Trajectory 1: The trajectory consists of reaching with the right hand, then the left hand, followed by a bimanual sliding motion, and returning motions. The predicted composition captures the overall structure of the demonstration quite well, and there is little to no jerkiness between primitives.