Composing Diffusion Policies for Few-shot Learning of Movement Trajectories
Omkar Patil, Anant Sah and Nakul Gopalan

Paper link: https://arxiv.org/abs/2410.17479

🎉 Our paper was accepted to the Compositional Learning workshop @ NeurIPS 2024!

🎉 We presented our poster at the Brain over Brawn workshop @ IROS 2024!

Abstract

Humans can perform various combinations of physical skills without having to relearn them from scratch every time. For example, we can swing a bat while walking by composing the individual skills of walking and bat swinging, without having to learn a new policy from scratch. Enabling robots to combine or compose skills is essential so they can learn novel skills and tasks faster with fewer real-world samples. Our goal here is to learn robot motions few-shot, and not necessarily goal-oriented trajectories. Unfortunately, we lack a general-purpose metric to evaluate the error between a skill or motion and the provided demonstrations. Hence, we propose a probabilistic measure, Maximum Mean Discrepancy on the Forward Kinematics Kernel (MMD-FK), that is task and action space agnostic. Using our few-shot learning approach, Diffusion Score Equilibrium (DSE), we show a reduction of over 30% in MMD-FK across skills and numbers of demonstrations. Moreover, we show the utility of our approach through real-world experiments by teaching novel trajectories to a robot in 5 demonstrations.

Base Policy Priors

demo_wall.mp4

Policy Rollouts

Real-world Experiments

We collect real-world data on the Franka robot. The demonstrations are collected by manually moving the robot end-effector while the robot is in zero-gravity mode, which introduces a lot of noise. The superior performance of DSE in the real-world setting suggests that it is especially useful for noisy demonstrations.

Data Collection

Demo.mp4

Vanilla Composition

PC.mp4

Training on few-shot demonstrations

FT.mp4

Diffusion Score Equilibrium

DSE.mp4

DSE clearly performs the best of the three approaches, while the vanilla composition approach matches the line_x prior using the optimization procedure.

snake.mp4

While vanilla composition fails, DSE matches the reference trajectory closely, and the model trained on the few-shot demos does not follow the expected trajectory.

spring.mp4

Simulated Experiments

In these visualizations, we show the flow of all the control points used to calculate the MMD-FK metric in our work. The points near the base of the robot move less than the points near the end-effector. The reference trajectory image on the left shows only the movement of the end-effector for multiple rollouts, unless specified otherwise.
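To make the role of the control points more concrete, here is a minimal sketch of a Maximum Mean Discrepancy estimate computed over control-point positions. It is an illustration under stated assumptions, not the implementation used in the paper: the kernel is assumed to be a Gaussian kernel averaged over control points, the estimator is the standard biased (V-statistic) form of squared MMD, and the names `control_point_kernel`, `mmd_squared`, and the bandwidth `sigma` are hypothetical.

```python
import numpy as np


def control_point_kernel(a, b, sigma=0.5):
    """Kernel between two batches of control-point sets.

    a: array of shape (n, P, 3), b: array of shape (m, P, 3), where P is the
    number of control points on the robot and each point is a 3D position
    (assumed to come from forward kinematics of a joint configuration).
    Returns an (n, m) matrix: a Gaussian kernel averaged over control points.
    This kernel form is an assumption made for illustration.
    """
    diff = a[:, None, :, :] - b[None, :, :, :]   # (n, m, P, 3)
    sq_dist = np.sum(diff ** 2, axis=-1)         # (n, m, P)
    return np.exp(-sq_dist / (2.0 * sigma ** 2)).mean(axis=-1)


def mmd_squared(x, y, sigma=0.5):
    """Biased (V-statistic) estimate of squared MMD between samples x and y."""
    k_xx = control_point_kernel(x, x, sigma).mean()
    k_yy = control_point_kernel(y, y, sigma).mean()
    k_xy = control_point_kernel(x, y, sigma).mean()
    return k_xx + k_yy - 2.0 * k_xy


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical data: 20 samples, 8 control points each, in 3D.
    reference = rng.normal(size=(20, 8, 3))
    rollout = reference + 1.0  # same motion shifted by 1 unit on every axis
    print("reference vs. itself:", mmd_squared(reference, reference))
    print("reference vs. shifted:", mmd_squared(reference, rollout))
```

With this estimator, comparing a sample set against itself gives zero, and the value grows as the rollout's control points drift away from the reference. Averaging the kernel over all control points, and not only the end-effector, is what lets the measure account for the motion of the whole arm.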

DSE achieves good similarity with the reference, while the trained model makes larger errors in the initial part of the trajectory by shifting upwards.

spiral.mp4

Although the trained model's rollout looks similar at first, it is in fact the wrong trajectory, as it goes upwards first. DSE achieves the closest result to the reference, while vanilla composition matches the line_x prior.

step.mp4

Here we show the flow of all the control points on the robot to better illustrate the oscillation of the end-effector about a fixed point. Unlike DSE, the trained model makes errors in the initial part of the trajectory by shifting upwards.

line_osc_x.mp4
