Discovering Motor Programs by Recomposing Demonstrations
Below, we present results accompanying our paper "Discovering Motor Programs by Recomposing Demonstrations". In particular, this site shows dynamic results, including GIFs and videos, that cannot be viewed effectively in the paper.
Visualizing the Learnt Space of Motor Programs
The first result we present is a dynamic visualization of the embedding of the latent representation of motor programs, as depicted in Figure 3 (left) of our paper. Each frame in the following video corresponds to one learnt primitive, and is positioned at the location of its latent variable in the embedded space.
Press the play button in the bottom-left corner of the video to start it. We recommend zooming in on this webpage to view individual learnt primitives in the video. Scrolling across the webpage while zoomed in is also useful.
Note the similarity of learnt primitives within each of the clusters.
Visualizing Emergent Primitives
We now present dynamic visualizations of the motor primitives that emerge from our model. These primitives are visualized in an unrolled manner in Figure 3 (right) of our paper, and displayed below as GIFs.
Reaching Primitive
Reaching Primitive 2
Sliding / Pushing Primitive
Bimanual Pulling Primitive
Twisting Primitive
Returning Primitive
Executing Primitives on Real Baxter
We would like the emergent primitives to be useful on a real Baxter robot platform: they should be suitably smooth and feasible, and should correspond to the same motions executed in simulation (i.e. be unaffected by the noise of execution on a real robot platform). To this end, we execute a small set of primitives on the real robot platform, and visualize the results below as GIFs.
Left Handed Reaching Primitive
Left Handed Returning Primitive
Right Handed Returning Primitive
Left Handed Pushing Primitive
Right Handed Pushing Primitive
Right Handed Twisting Primitive
Visualizing Learnt RL Policies
In Section 4.4 of the paper, we trained policies over our motor primitives that needed to predict a sequence of motor primitives to execute reaching and pushing tasks. We provide videos of the policies solving each of those tasks, after having been trained by Reinforcement Learning (using Proximal Policy Optimization).
For the reaching task, the goal for the robot is to get its end-effector within an epsilon-ball of the green dot. The value of epsilon used in our case is 0.05 m. Note that the model learns to use the learnt motor primitives (each of which is a continuous motion by itself).
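The epsilon-ball criterion above can be illustrated with a minimal sketch. The function name, the 3D position tuples, and the choice of Euclidean distance with a non-strict inequality are assumptions made for illustration; they are not taken from the paper's code.

```python
import math

# Radius of the goal ball in metres, as stated in the text above.
EPSILON_M = 0.05


def reached_goal(end_effector_xyz, goal_xyz, epsilon=EPSILON_M):
    """Return True if the end-effector lies inside the epsilon-ball around the goal.

    Hypothetical helper: assumes both positions are (x, y, z) tuples in metres
    and that "within" means Euclidean distance <= epsilon.
    """
    return math.dist(end_effector_xyz, goal_xyz) <= epsilon


if __name__ == "__main__":
    goal = (0.60, 0.20, 0.10)
    print(reached_goal((0.62, 0.21, 0.10), goal))  # about 2.2 cm from the goal
    print(reached_goal((0.70, 0.20, 0.10), goal))  # 10 cm from the goal
```

Under these assumptions, the same check would apply to the pushing task by passing the block position in place of the end-effector position.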
On the pushing task, the objective is to push the red block to within a similar epsilon-ball of the green dot goal location. The PPO baseline learns to push the block towards the goal using its arm in a non-prehensile manner, while our method instead learns to use its gripper to push the block towards the goal. The solution our approach converges to is biased towards actions similar to those observed in the training set, and is thus much more natural and desirable.
Visualizing Compositions of Primitives
Above, we displayed individual primitives being executed. Below, we display compositions of primitives being executed to generate trajectories that correspond to their demonstrations. These visualizations show how combinations of primitives appear when executed. We show a collection of them both on the real robot and in simulation. For each visualization, we show an original demonstration on the left, and the corresponding predicted composition of primitives that "recomposes" the demonstration on the right.
Real Robot Results
Trajectory 1: The trajectory consists of reaching with the right hand, then the left hand, followed by a bimanual sliding motion, and returning motions. The predicted composition captures the overall structure of the demonstration quite well, and there is little to no jerkiness between primitives.
Original Demonstration
Predicted Composition of Primitives
Trajectory 2: This trajectory consists of a reach with the right hand, a short sliding motion, followed by a returning motion. As with Trajectory 1, the predicted composition of primitives is seamless and quite natural.
Original Demonstration
Predicted Composition of Primitives
Trajectory 3: This trajectory consists of a reach with the right hand, a lifting primitive with the right hand, a reaching motion with the left hand, followed by returning motions of both hands. The predicted composition seems to begin reaching with its left hand prior to completing the lift with its right hand.
Original Demonstration
Predicted Composition of Primitives
Trajectory 4: This trajectory consists of successive reaching primitives with the right and left hands, followed by a bimanual pushing primitive, and finally returning primitives for both hands. The predicted primitive sequence doesn't explicitly capture the pushing motion (which is relatively subtle), but otherwise smoothly captures the overall structure of the demonstration.
Original Demonstration
Predicted Composition of Primitives
Simulation Results
Original Demonstration
Predicted Composition of Primitives
Original Demonstration
Predicted Composition of Primitives