VACE

Virtual Annotated Cooking Environment

About

We present the Virtual Annotated Cooking Environment (VACE), a new open-source virtual reality dataset and simulator for object interaction tasks in a rich kitchen environment. 

We use our Unity-based VR simulator to create thoroughly annotated video sequences of a virtual human avatar performing food preparation activities. Based on the MPII Cooking 2 dataset, the simulator enables the recreation of recipes for meals such as sandwiches, pizzas, and fruit salads, as well as shorter activity sequences such as cutting vegetables. For complex recipes, multiple samples are present, following different orderings of valid partially ordered plans. The dataset includes RGB and depth camera views, bounding boxes, object segmentation masks, human joint poses, object poses, and ground-truth interaction data in the form of temporally labeled semantic predicates (holding, on, in, colliding, moving, cutting).
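As a minimal sketch of how the predicate annotations can be consumed (the sample path is a placeholder, and the per-entry fields depend on the JSON schema of the files described under "Single Sample Description" below, so inspect them before relying on specific keys):

import json
from pathlib import Path

# Placeholder location of one extracted sample; adjust as needed.
sample = Path("VACE/sample_001")
on_file = sample / "RecordingsFiles" / "Annotations" / "Predicates" / "on.json"

# Each entry records in which frame a "top object" started/ended
# touching a "bottom object" (see the sample layout below).
with on_file.open() as f:
    on_events = json.load(f)

print(on_events)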

Features


RGB View

Depth View

Segmentation Mask
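The segmentation masks encode each object as a flat color, and colormap.json (see the sample layout below) maps those colors back to object names and ids. A minimal sketch of that lookup, assuming a list of {name, r, g, b, a, id_no} records with 8-bit color values (paths are placeholders):

import json
import numpy as np
from PIL import Image

# Placeholder paths into one extracted sample; adjust as needed.
with open("Annotations/Colormap/colormap.json") as f:
    colormap = json.load(f)  # assumed: list of {name, r, g, b, a, id_no}

mask = np.asarray(Image.open("Videos/Cam1/segmentation-1.png").convert("RGB"))
id_image = np.zeros(mask.shape[:2], dtype=np.int32)  # per-pixel object ids

for entry in colormap:
    # If the colors are stored as Unity-style floats in [0, 1], scale by 255 first.
    color = (entry["r"], entry["g"], entry["b"])
    id_image[np.all(mask == color, axis=-1)] = entry["id_no"]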

Dataset Statistics


Single Sample Description

|   sample_readme.txt   # Contains information about sample number, dish number, participant number, dish variant,
|                       # and a verbal description of the dish variant
+---RecordingsFiles
|   +---Annotations
|   |   +---BoundingBox
|   |   |       bounding_box_1.json     # JSON file with bounding box information for 200 frames of the sample.
|   |   |       bounding_box_200.json   # Structure: frame --> object --> {name, id_no, x_max, x_min, y_max, y_min}
|   |   |       bounding_box_400.json
|   |   |       bounding_box_600.json
|   |   |       bounding_box_800.json
|   |   |       ...
|   |   |
|   |   +---Colormap
|   |   |       colormap.json   # JSON file with color code information of all objects {name, r, g, b, a, id_no}, used for segmentation pictures
|   |   |       colormap1.txt   # Same information as txt file
|   |   |
|   |   +---PoseAndOrientation
|   |   |       position_and_orientation_1.json     # JSON file with position and orientation of all objects for 200 frames of the sample.
|   |   |       position_and_orientation_200.json   # Structure: frame: {frame_number, time, delta_time} --> object: {name, posX, posY, posZ, angX, angY, angZ}
|   |   |       position_and_orientation_400.json
|   |   |       position_and_orientation_600.json
|   |   |       position_and_orientation_800.json
|   |   |       ...
|   |   |
|   |   \---Predicates
|   |           cuts.json     # JSON file describing in which frame which object got cut, at which contact point and with which cutting direction
|   |           grasps.json   # JSON file describing which object was grasped or released by which hand (left/right) in which frame
|   |           in.json       # JSON file describing in which frame which "inside object" entered/exited which "container object"
|   |           on.json       # JSON file describing in which frame which "top object" started/ended touching which "bottom object"
|   |           push.json     # JSON file describing which hand (left/right) pushed which other object (without grasping it)
|   |
|   \---Videos
|       +---Cam1
|       |       depth-1.png
|       |       depth-2.png
|       |       depth-3.png
|       |       depth-4.png
|       |       depth-5.png
|       |       depth-6.png
|       |       ...
|       |       rgb-1.jpg
|       |       rgb-2.jpg
|       |       rgb-3.jpg
|       |       rgb-4.jpg
|       |       rgb-5.jpg
|       |       rgb-6.jpg
|       |       ...
|       |       segmentation-1.png
|       |       segmentation-2.png
|       |       segmentation-3.png
|       |       segmentation-4.png
|       |       segmentation-5.png
|       |       segmentation-6.png
|       |       ...
|       |       video-depth.avi          # Video of depth camera
|       |       video-rgb.avi            # Video of rgb camera
|       |       video-segmentation.avi   # Video of segmentation mask camera
|
\---ReplayFiles
    +---Cuts
    |       Cuts1.txt   # Cut information in txt format
    |
    +---LeftHand
    |       lhPO1.txt   # Left hand pose and orientation information in txt format
    |       lhPO200.txt
    |       lhPO400.txt
    |       lhPO600.txt
    |       lhPO800.txt
    |       ...
    |
    +---Particles
    |       particles1.txt   # Particle emitter status in txt format (i.e., whether the 4 stove plates and the water tap are on/off)
    |       particles200.txt
    |       particles400.txt
    |       particles600.txt
    |       particles800.txt
    |       ...
    |
    +---PositionAndOrientation
    |       PO1.txt   # Object pose and orientation in txt format
    |       PO200.txt
    |       PO400.txt
    |       PO600.txt
    |       PO800.txt
    |       ...
    |
    \---RightHand
            rhPO1.txt   # Right hand pose and orientation information in txt format
            rhPO200.txt
            rhPO400.txt
            rhPO600.txt
            rhPO800.txt
            ...
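The annotation files are split into 200-frame chunks whose numeric suffix orders them (bounding_box_1.json, bounding_box_200.json, ...). A minimal sketch for iterating over such chunks in frame order, assuming the frame --> object nesting documented above (the path is a placeholder):

import json
from pathlib import Path

# Placeholder path into one extracted sample; adjust as needed.
bb_dir = Path("VACE/sample_001/RecordingsFiles/Annotations/BoundingBox")

def load_chunks(directory, prefix):
    """Yield the JSON annotation chunks of a sample in ascending frame order."""
    paths = sorted(directory.glob(prefix + "_*.json"),
                   key=lambda p: int(p.stem.rsplit("_", 1)[1]))
    for path in paths:
        with path.open() as f:
            yield json.load(f)

# Assumed nesting (from the layout above):
# frame --> object --> {name, id_no, x_max, x_min, y_max, y_min}
for chunk in load_chunks(bb_dir, "bounding_box"):
    for frame, objects in chunk.items():
        pass  # e.g., collect each object's (x_min, y_min, x_max, y_max)

The same helper applies to the PoseAndOrientation chunks by swapping the directory and the "position_and_orientation" prefix.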

Download

Citation

@inproceedings{koller2022new,
  title={A New VR Kitchen Environment for Recording Well Annotated Object Interaction Tasks},
  author={Koller, Michael and Patten, Timothy and Vincze, Markus},
  booktitle={Proceedings of the 2022 ACM/IEEE International Conference on Human-Robot Interaction},
  pages={629--633},
  year={2022}
}



Acknowledgements

The research leading to these results has received funding from the Austrian Science Fund (FWF) under grant agreement No. I3969-N30 InDex and from the Doctoral College TrustRobots at TU Wien.

Contact Us

If you have a feature request, or if you have recorded samples that you would like to add to the dataset, please send us an email!

Maintainer: Michael Koller (koller_michael@gmx.net)