ARNOLD Challenge

CVPR 2024 Embodied AI Workshop

Overview

Hosted at the CVPR 2024 Embodied AI Workshop, the ARNOLD Challenge is a robotic manipulation challenge that benchmarks language-grounded task learning in realistic, interactive 3D scenes. It emphasizes understanding of continuous object states and evaluates methods systematically along several aspects. The challenge comprises eight language-conditioned tasks covering the manipulation of rigid objects, articulated objects, and fluids. It also provides generalization tests on novel objects, novel scenes, and novel goal states. For more details on the ARNOLD benchmark, see the paper (ICCV 2023) and project page.

Features

Guidelines

We host the challenge on EvalAI. To participate in the ongoing challenge, follow these steps:

EvalAI Phases

EvalAI. We host the evaluation servers on EvalAI, which is developed by the CloudCV team. EvalAI is an open-source web platform designed for organizing and participating in challenges to push the state of the art on AI tasks.

Data splits. We divide the evaluation data into three splits: dev, test-challenge, and test-reserve. Dev is used for debugging and validation, and allows a maximum of 10 submissions per day. It does not contain the generalization tests, i.e., novel objects, novel scenes, and novel goal states. Test-challenge is the default test data for the ARNOLD Challenge. It serves as the standard for comparison and backs a public leaderboard that is updated upon submission. Test-reserve is used to protect against possible overfitting. A substantial difference between a method's scores on test-challenge and test-reserve will raise a red flag and prompt further investigation. Results on test-reserve will not be publicly revealed.
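The overfitting check described above can be pictured with a short Python sketch. The text does not say how a "substantial difference" is measured, so the function name `flag_score_gap` and the `threshold` default are hypothetical and serve only as an illustration, not as the organizers' actual procedure.

```python
def flag_score_gap(test_challenge_score: float,
                   test_reserve_score: float,
                   threshold: float = 0.1) -> bool:
    """Return True when the two scores differ by more than `threshold`.

    `threshold` is a hypothetical value chosen for illustration;
    the challenge does not publish a specific cutoff.
    """
    gap = abs(test_challenge_score - test_reserve_score)
    return gap > threshold
```

A method that scores similarly on both splits would not be flagged under this sketch, while a method that scores much higher on test-challenge than on test-reserve would be.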

Phases. We maintain two corresponding phases for the ARNOLD Challenge: Dev and Test. The Dev phase uses data from the dev split, and the Test phase uses data from the test-challenge and test-reserve splits. We encourage participants to submit to the Dev phase first, as a sanity check and for development feedback. The submission procedure is identical for the two phases. All public submissions to the Test phase are considered as participating in the challenge. Private submissions and submissions to the Dev phase do not count towards challenge participation.
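The phase rules above can be summarized as a small Python sketch. The `Submission` record and the `counts_toward_challenge` helper are hypothetical names introduced here for illustration and are not part of EvalAI or the ARNOLD codebase. Only the phase-to-split mapping and the participation rule come from the text.

```python
from dataclasses import dataclass

# Phase -> evaluation splits, as described in the text above.
PHASE_SPLITS = {
    "Dev": ("dev",),
    "Test": ("test-challenge", "test-reserve"),
}


@dataclass(frozen=True)
class Submission:
    """Hypothetical record of a submission (not an EvalAI type)."""
    phase: str    # "Dev" or "Test"
    public: bool  # True for a public submission, False for a private one


def counts_toward_challenge(sub: Submission) -> bool:
    """Only public submissions to the Test phase count as participation."""
    if sub.phase not in PHASE_SPLITS:
        raise ValueError(f"unknown phase: {sub.phase!r}")
    return sub.phase == "Test" and sub.public
```

Under this sketch, `counts_toward_challenge(Submission("Test", True))` is true, while a private Test submission or any Dev submission is not counted.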

Timeline

Translate and rotate rigid objects

Open and close articulated objects (drawer, prismatic joint)

Open and close articulated objects (cabinet, revolute joint)

Manipulate liquids

Organizers

Citation

@inproceedings{gong2023arnold,
  title={ARNOLD: A Benchmark for Language-Grounded Task Learning With Continuous States in Realistic 3D Scenes},
  author={Gong, Ran and Huang, Jiangyong and Zhao, Yizhou and Geng, Haoran and Gao, Xiaofeng and Wu, Qingyang and Ai, Wensi and Zhou, Ziheng and Terzopoulos, Demetri and Zhu, Song-Chun and others},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
  year={2023}
}