ARNOLD Challenge
CVPR 2024 Embodied AI Workshop
Overview
Hosted at the CVPR 2024 Embodied AI Workshop, the ARNOLD Challenge is a robotic manipulation challenge that benchmarks language-grounded task learning in realistic, interactive 3D scenes. It emphasizes an understanding of continuous object states and systematic evaluation along multiple aspects. The challenge comprises eight language-conditioned tasks covering the manipulation of rigid objects, articulated objects, and fluids. It also provides generalization tests on novel objects, novel scenes, and novel goal states. For more details on the ARNOLD benchmark, see the paper (ICCV 2023) and project page.
Features
Built upon Omniverse Isaac Sim, with photorealistic rendering and physically realistic simulation
20 diverse scenes and 40 diverse objects, with meticulously optimized materials and collisions
8 language-grounded manipulation tasks, covering the skills of manipulating rigid objects, articulated objects, and fluids
Diverse language-grounded task instructions encompassing continuous object states
Large-scale demonstrations (10k)
Systematic evaluation, including an in-domain test and generalization tests on novel objects, scenes, and goal states
Guidelines
We host the challenge on EvalAI. To participate in the ongoing challenge, follow these steps:
Refer to the GitHub codebase and Challenge Setup Guide to set up a workspace, prepare data, and reproduce the results.
Log into EvalAI (register if necessary) and go to the ARNOLD challenge page. Read the details and select a phase.
Launch training and evaluation in the workspace with your own model.
Make your own submissions. Submissions require a team: register a new team or join an existing one using its invite code.
EvalAI Phases
EvalAI. We host the evaluation servers on EvalAI, which is developed by the CloudCV team. EvalAI is an open-source web platform designed for organizing and participating in challenges to push the state of the art on AI tasks.
Data splits. We divide the evaluation data into three splits: dev, test-challenge, and test-reserve. Dev is used for debugging and validation and allows a maximum of 10 submissions per day. It does not contain generalization tests, i.e., novel objects, novel scenes, and novel goal states. Test-challenge is the default test data for the ARNOLD challenge; it serves as the standard for comparison and has a public leaderboard that is updated upon submission. Test-reserve is used to protect against possible overfitting. A substantial difference between a method's scores on test-challenge and test-reserve will raise a red flag and prompt further investigation. Results on test-reserve will not be publicly revealed.
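The consistency check between the two test splits can be illustrated with a minimal sketch. The `SplitScores` container, the function name, and the `max_gap` threshold are all assumptions made for illustration; the organizers do not state what counts as a "substantial" difference.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SplitScores:
    """Success rates in [0, 1] for one submission (hypothetical container)."""
    test_challenge: float
    test_reserve: float


def flags_possible_overfitting(scores: SplitScores, max_gap: float = 0.10) -> bool:
    """Return True when the two test splits disagree by more than max_gap.

    max_gap is an assumed, illustrative threshold, not an official value.
    """
    return abs(scores.test_challenge - scores.test_reserve) > max_gap


# Two made-up submissions: one consistent across splits, one not.
consistent = SplitScores(test_challenge=0.42, test_reserve=0.40)
suspicious = SplitScores(test_challenge=0.60, test_reserve=0.35)
print(flags_possible_overfitting(consistent))
print(flags_possible_overfitting(suspicious))
```

A flag here would only prompt a closer look at the submission; it is not a verdict on its own.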
Phases. We maintain two phases for the ARNOLD challenge accordingly: Dev and Test. The Dev phase uses data from the dev split, and the Test phase uses data from the test-challenge and test-reserve splits. We encourage participants to first submit to the Dev phase as a sanity check and for development feedback. Note that the submission procedures are identical for the two phases. All Public submissions to the Test phase will be counted as participating in the challenge. Private submissions and submissions to the Dev phase will not be counted towards challenge participation.
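The phase rules above can be summarized in a short sketch. The `Submission` class, the `PHASE_SPLITS` mapping, and the lowercase phase keys are hypothetical names chosen for illustration; they are not part of EvalAI or the ARNOLD codebase.

```python
from dataclasses import dataclass

# Which evaluation splits each phase draws on, as described in the text.
PHASE_SPLITS = {
    "dev": ("dev",),
    "test": ("test-challenge", "test-reserve"),
}


@dataclass(frozen=True)
class Submission:
    """Hypothetical record of one submission."""
    phase: str       # "dev" or "test"
    is_public: bool  # visibility chosen by the participant


def counts_towards_challenge(sub: Submission) -> bool:
    """Only Public submissions to the Test phase count as participation."""
    if sub.phase not in PHASE_SPLITS:
        raise ValueError(f"unknown phase: {sub.phase!r}")
    return sub.phase == "test" and sub.is_public


for sub in (
    Submission(phase="dev", is_public=True),
    Submission(phase="test", is_public=False),
    Submission(phase="test", is_public=True),
):
    print(sub, counts_towards_challenge(sub))
```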
Timeline
04/01/2024: Submission opens
06/02/2024: Submission ends
06/18/2024: Winners announced at CVPR 2024 Embodied AI Workshop
Task categories
Translate and rotate rigid objects
Open and close articulated objects (drawer, prismatic joint)
Open and close articulated objects (cabinet, revolute joint)
Manipulate liquids
Organizers
Citation