MimicGen is a system for automatically generating large-scale, rich datasets from only a small number (~10) of human demonstrations by adapting them to new contexts.
We used MimicGen to generate over 50,000 demonstrations from fewer than 200 human demonstrations across 18 tasks, multiple simulators, and the real world. This required considerably less human effort than prior work.
10 Human Demos
Generated Dataset (Nominal Variant)
Generated Dataset (Greater Variability)
Generated Dataset (Greatest Variability)
Three Piece Assembly
Square
Stack Three
Coffee
Threading
Mug Cleanup
10 human demos
1000 generated demos across 12 mugs
10 human demos (Panda)
1000 generated demos (Sawyer)
1000 generated demos (IIWA)
1000 generated demos (UR5e)
Coffee Preparation
Kitchen
Pick Place
Gear Assembly
Frame Assembly
Mobile Kitchen
200 MimicGen demos on Stack in large region
100 MimicGen demos on Coffee in large region
Stack Three D1 (91%)
Coffee D1 (93%)
Threading D1 (80%)
Square D1 (69%)
Three Piece Assembly D1 (61%)
Mug Cleanup O2 (67%)
Kitchen D1 (78%)
Coffee Preparation D1 (59%)
Mobile Kitchen D0 (77%)
Nut-and-Bolt Assembly D1 (96%)
Gear Assembly D1 (76%)
Frame Assembly D1 (71%)
Below, we show MimicGen datasets generated on Square D2 from two source datasets: 10 demos from a better-quality operator and 10 demos from a worse-quality operator (both drawn from the robomimic Square MH dataset). Surprisingly, policies trained on each generated dataset achieve comparable results, which suggests that in the large-scale data regime, the quality of the source demonstrations might not matter as much.
Dataset generated from 10 demos from a better quality operator.
Dataset generated from 10 demos from a worse quality operator.
Below, we show policies trained on 200 demos on Square D0. On the left, the policy was trained on 200 demos generated by MimicGen from 10 human demos; on the right, the policy was trained on 200 human demos. The two policies perform comparably. This raises important questions about the presence of redundancies in large human datasets and about when to request additional data from a human.
Square D0 Policy (90.7%) trained on 200 MimicGen demos generated from 10 human demos.
Square D0 Policy (90.7%) trained on 200 human demos.
The video above shows an example of MimicGen using a source human trajectory to generate a demonstration on a new scene.
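As a rough illustration of what adapting a source trajectory to a new scene can look like, the sketch below re-expresses an end-effector segment so that it keeps the same pose relative to an object after the object has moved. This page does not spell out the mechanism, so treat everything here as an assumption: the setting is simplified to planar (SE(2)) poses, and the helper names `pose`, `matmul`, `inverse`, and `adapt_segment` are hypothetical and are not part of any MimicGen API.

```python
import math

def pose(x, y, theta):
    """Build a 3x3 homogeneous SE(2) transform as nested lists."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, x], [s, c, y], [0.0, 0.0, 1.0]]

def matmul(a, b):
    """Multiply two 3x3 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def inverse(t):
    """Invert a rigid transform: rotation R^T, translation -R^T p."""
    r00, r01, r10, r11 = t[0][0], t[0][1], t[1][0], t[1][1]
    px, py = t[0][2], t[1][2]
    return [[r00, r10, -(r00 * px + r10 * py)],
            [r01, r11, -(r01 * px + r11 * py)],
            [0.0, 0.0, 1.0]]

def adapt_segment(source_eef_poses, source_obj_pose, new_obj_pose):
    """Map each source end-effector pose into the new scene.

    Assumption: the segment is defined relative to a single object, so
    preserving the end-effector pose in the object's frame is enough.
    """
    # Transform that carries the old object pose onto the new object pose.
    delta = matmul(new_obj_pose, inverse(source_obj_pose))
    return [matmul(delta, p) for p in source_eef_poses]

# Hypothetical example: the object moves from (1, 0) to (2, 1) and rotates 90 degrees.
source_segment = [pose(1.0, 0.5, 0.0)]
adapted = adapt_segment(source_segment,
                        pose(1.0, 0.0, 0.0),
                        pose(2.0, 1.0, math.pi / 2))
print(adapted[0][0][2], adapted[0][1][2])  # x and y of the adapted pose
```

In a full 3D setting the same idea would use 4x4 transforms, and a controller would still have to move the arm from its current pose to the start of the adapted segment; neither is covered by this sketch.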
Stack
Stack Three
Square
Coffee
Threading
Three Piece Assembly
Hammer Cleanup
Mug Cleanup
Pick Place
Nut Assembly
Kitchen
Coffee Preparation
Mobile Kitchen
Nut-and-Bolt Assembly
Gear Assembly
Frame Assembly
Stack (Real)
Coffee (Real)
Stack D0
Stack D1
Stack Three D0
Stack Three D1
Square D0
Square D1
Square D2
Coffee D0
Coffee D1
Coffee D2
Threading D0
Threading D1
Threading D2
Three Piece Assembly D0
Three Piece Assembly D1
Three Piece Assembly D2
Hammer Cleanup D0
Hammer Cleanup D1
Mug Cleanup D0
Mug Cleanup D1
Pick Place D0
Nut Assembly D0
Mobile Kitchen D0
Kitchen D0
Kitchen D1
Coffee Preparation D0
Coffee Preparation D1
Nut-and-Bolt Assembly D0
Nut-and-Bolt Assembly D1
Nut-and-Bolt Assembly D2
Gear Assembly D0
Gear Assembly D1
Gear Assembly D2
Frame Assembly D0
Frame Assembly D1
Frame Assembly D2
Stack Real D0
Stack Real D1
Coffee Real D0
Coffee Real D1