We provide the full videos for the 5-frame renders seen in Fig. 2.
Pick
Place
Open (Drawer)
Close (Fridge)
These videos show visual examples of some Pick and Place failure modes. They do not cover every failure or success mode for either subtask.
Can't grasp failure
Drop failure
Mobility failure
Too slow failure
Didn't reach goal failure
Place in goal failure
Drop to goal failure
Won't let go failure
Ours: Policy must pull Cracker Box out of sink without colliding with sink edge or other objects (otherwise box might fall).
M3: Policy hovers above cluttered receptacle and relies on magical grasp to lift (teleport) the target object. Video taken from https://sites.google.com/view/hab-m3
To compare performance, we run a modified version of Behavior-1k’s rendering benchmark. We use a single Nvidia RTX 4090, render one 128x128 RGB-D image, and simulate dynamics at a simulation frequency of 120 Hz and a control frequency of 30 Hz. Each evaluation run consists of 300 steps of random actions clipped to [-0.3, 0.3]. We report means and 95% CIs over 10 evaluation runs.
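The measurement protocol described above can be sketched roughly as follows. This is a minimal illustration, not the benchmark script itself: `dummy_step` is a hypothetical stand-in for one simulate-and-render step, `ACTION_DIM` is an assumed action size, and the Student-t interval is an assumption, since the text does not say how the 95% CIs are computed.

```python
import math
import random
import statistics
import time

NUM_RUNS = 10          # evaluation runs, as stated in the text
STEPS_PER_RUN = 300    # random-action steps per run, as stated in the text
ACTION_LIMIT = 0.3     # actions are clipped to [-0.3, 0.3]
ACTION_DIM = 13        # hypothetical action dimension (assumption)
T_CRIT_95_DF9 = 2.262  # two-sided 95% Student-t critical value, 9 degrees of freedom


def sample_clipped_action(rng, dim=ACTION_DIM, limit=ACTION_LIMIT):
    """Draw a random action and clip each component to [-limit, limit]."""
    return [max(-limit, min(limit, rng.uniform(-1.0, 1.0))) for _ in range(dim)]


def dummy_step(action):
    """Hypothetical stand-in for one simulate-and-render step (assumption)."""
    return sum(a * a for a in action)


def run_once(step_fn, rng, steps=STEPS_PER_RUN):
    """Time one evaluation run and return its samples per second."""
    start = time.perf_counter()
    for _ in range(steps):
        step_fn(sample_clipped_action(rng))
    elapsed = time.perf_counter() - start
    return steps / elapsed


def mean_and_ci95(values, t_crit=T_CRIT_95_DF9):
    """Mean and half-width of a t-based 95% CI (default t_crit assumes 10 values)."""
    mean = statistics.mean(values)
    half_width = t_crit * statistics.stdev(values) / math.sqrt(len(values))
    return mean, half_width


if __name__ == "__main__":
    rng = random.Random(0)
    sps = [run_once(dummy_step, rng) for _ in range(NUM_RUNS)]
    mean, half_width = mean_and_ci95(sps)
    print(f"{mean:.2f} ± {half_width:.2f} SPS")
```

Replacing `dummy_step` with a real environment step would reproduce the structure of the protocol, though absolute numbers depend on hardware and renderer settings.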
While live-rendering with ray tracing, ManiSkill-HAB achieves 69.90 ±0.25 samples per second (SPS) using 6.26 ±0.00 GB of GPU memory, whereas Behavior-1k is limited to 19.92 ±0.04 SPS using 7.62 ±0.04 GB of GPU memory.
Hence, ManiSkill-HAB is 3.51x faster than Behavior-1k and uses 17.85% less GPU memory, while retaining similar ray-tracing render quality.
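The speedup and memory figures follow directly from the reported means. The short check below reproduces the arithmetic; the function and variable names are illustrative only.

```python
def speedup(fast_sps, slow_sps):
    """Ratio of the faster throughput to the slower one."""
    return fast_sps / slow_sps


def memory_reduction_percent(smaller_gb, larger_gb):
    """Relative memory saving, as a percentage of the larger footprint."""
    return 100.0 * (larger_gb - smaller_gb) / larger_gb


# Mean values reported in the text.
ms_hab_sps, b1k_sps = 69.90, 19.92
ms_hab_gb, b1k_gb = 6.26, 7.62

print(f"{speedup(ms_hab_sps, b1k_sps):.2f}x")                  # 3.51x
print(f"{memory_reduction_percent(ms_hab_gb, b1k_gb):.2f}%")   # 17.85%
```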
Below, we provide a comparison of live-rendered ray-traced images between ManiSkill-HAB (left) and Behavior-1k (right).
Live render from ManiSkill-HAB. Users need only change one line in the code.
Live render obtained from the Behavior-1k official Colab demo notebook.
@inproceedings{shukla2025maniskillhab,
author = {Arth Shukla and Stone Tao and Hao Su},
title = {ManiSkill-HAB: {A} Benchmark for Low-Level Manipulation in Home Rearrangement Tasks},
booktitle = {The Thirteenth International Conference on Learning Representations, {ICLR} 2025, Singapore, April 24-28, 2025},
publisher = {OpenReview.net},
year = {2025},
url = {https://openreview.net/forum?id=6bKEWevgSd},
timestamp = {Thu, 15 May 2025 17:19:05 +0200},
biburl = {https://dblp.org/rec/conf/iclr/ShuklaTS25.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}