The dataset includes 20 cooking actions, each involving one or both arms of the volunteer; some involve tools that may require different forces. Acquisitions were made from three viewpoints: lateral, egocentric, and frontal. For each action, a training and a test sequence are available, each containing, on average, 25 repetitions of the action. The dataset also includes acquisitions of more structured activities, in which the actions are performed in sequence to achieve a final, more complex goal.
An annotation is available that segments single action instances by time instants in the MoCap reference frame. A function maps these time instants to the corresponding frames in the video sequences. Functionalities to load, segment, and visualize the data are also provided in Python and Matlab.
Cutting the bread
Shredding a carrot
Cleaning a dish
Eating
Beating eggs
Squeezing a lemon
Mincing with a mezzaluna
Mixing in a bowl
Opening a bottle
Turning the frittata in a pan
Pestling
Pouring water in multiple containers
Pouring water in a mug
Reaching an object
Rolling the dough
Washing the salad
Salting
Spreading cheese on a slice of bread
Cleaning the table
Transporting an object
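The mapping from annotated MoCap time instants to video frames described above can be illustrated with a minimal sketch. This is not the function shipped with the dataset: the names time_to_frame and segments_to_frames, the constant frame rate, and the optional time offset between the MoCap and video clocks are all assumptions made for illustration.

```python
from typing import List, Tuple


def time_to_frame(t_sec: float, fps: float, offset_sec: float = 0.0) -> int:
    """Map a MoCap time instant (in seconds) to the nearest video frame index.

    Hypothetical helper: assumes a constant video frame rate `fps` and that
    the video starts `offset_sec` seconds after the MoCap clock.
    """
    if fps <= 0:
        raise ValueError("fps must be positive")
    frame = round((t_sec - offset_sec) * fps)
    # Instants that fall before the first video frame are clamped to frame 0.
    return max(frame, 0)


def segments_to_frames(
    segments: List[Tuple[float, float]], fps: float, offset_sec: float = 0.0
) -> List[Tuple[int, int]]:
    """Convert (start, end) time instants of action instances to frame ranges."""
    return [
        (time_to_frame(start, fps, offset_sec), time_to_frame(end, fps, offset_sec))
        for start, end in segments
    ]


if __name__ == "__main__":
    # Made-up annotation values and an assumed frame rate, for illustration only.
    example_segments = [(0.0, 1.0), (1.0, 2.5)]
    print(segments_to_frames(example_segments, fps=30.0))
```

The actual frame rate and any synchronization offset should be taken from the dataset documentation and the provided loading functions rather than from the placeholder values used here.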
Authors using this code in their publications should cite this paper:
"The MoCA dataset, kinematic and multi-view visual streams of fine-grained cooking actions"
E. Nicora, G. Goyal, N. Noceti, A. Vignolo, A. Sciutti, F. Odone. Scientific Data 7 (1), 1-15.