Online RGBD Action Dataset (ORGBD)





 

The dataset targets human action (human-object interaction) recognition based on RGBD video data. There are seven categories of human actions: drinking, eating, using laptop, reading cellphone, making phone call, reading book, and using remote.

 

The dataset is designed for three evaluation tasks (S# refers to the folder name; a code sketch of the splits follows the list):

1. same-environment action recognition (two-fold cross-validation: (a) S1 for training and S2 for testing, (b) S2 for training and S1 for testing)

2. cross-environment action recognition (S1+S2 for training, S3 for testing)

3. continuous action recognition (S1+S2+S0 for training, S4 for testing)
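A minimal sketch of how these splits could be written down in code (Python is assumed; the dictionary name EVAL_SPLITS and its layout are illustrative, not part of the dataset):

# Illustrative train/test split definitions for the three evaluation tasks.
# The folder names S0-S4 come from the dataset; everything else is an assumption.
EVAL_SPLITS = {
    "same_environment_fold1": {"train": ["S1"], "test": ["S2"]},
    "same_environment_fold2": {"train": ["S2"], "test": ["S1"]},
    "cross_environment":      {"train": ["S1", "S2"], "test": ["S3"]},
    "continuous":             {"train": ["S1", "S2", "S0"], "test": ["S4"]},
}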


 

The naming scheme for video files is as follows (a small parsing sketch is given after this block):

a#i_s#j_e#k:

------------------------------------------------------------------

#i refers to the action category index

#i = 0 - 6: drinking, eating, using laptop, reading cellphone, making phone call, reading book, using remote

#i = 8: long video sequence (contains multiple actions)

#i = 10: negative action (without any of the predefined actions)

------------------------------------------------------------------

#j refers to the subject index

------------------------------------------------------------------

#k refers to the episode index

------------------------------------------------------------------
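As an illustration, here is a minimal Python sketch for parsing this naming scheme. The helper name parse_orgbd_name and the exact regular expression are my assumptions; suffixes such as _rgb or _depth and the file extension are simply ignored by the match.

import re
from typing import NamedTuple

class OrgbdName(NamedTuple):
    action: int    # #i: 0-6 action class, 8 long sequence, 10 negative
    subject: int   # #j: subject index
    episode: int   # #k: episode index

# Matches the leading "a#i_s#j_e#k" part of a file name, e.g. "a0_s1_e2_rgb.avi".
_NAME_RE = re.compile(r"a(\d+)_s(\d+)_e(\d+)")

def parse_orgbd_name(name: str) -> OrgbdName:
    m = _NAME_RE.match(name)
    if m is None:
        raise ValueError("unrecognized ORGBD file name: " + name)
    return OrgbdName(*(int(g) for g in m.groups()))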



Download:

password:15ej



S0: background actions (which do not belong to any of the seven predefined actions)

S1: video segments from subjects 1-8; each segment contains one action instance

S2: video segments from subjects 9-16; each segment contains one action instance

S3: video segments from subjects 17-24; each segment contains one action instance; the recording environment differs from S1 and S2

S4 (part1, part2): continuous videos from subjects 25-36; each video may contain multiple action instances.


 

In each folder, 

*.avi refers to the RGB video sequence

*_depth.bin refers to the depth stream. The data format is the same as in the MSR Action 3D dataset and the MSR Daily Activity dataset.

*_skeleton.txt refers to the skeleton stream. The data format is the same as in the MSR Action 3D dataset and the MSR Daily Activity dataset.

*_rgb.avi.Label.txt records the object locations in the video (it is not available for S4); a minimal reading sketch follows the format description.

The first row: version1    #_frames

In the following rows, the first column indicates whether the object location is manually labeled (value 0) or interpolated (value 1),

the second column gives the action category,

and the next four columns give the object bounding box [x1, y1, x2, y2].
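Based only on the format described above, a minimal Python sketch for reading such a label file could look like this (the function name read_label_file and the whitespace-separated columns are assumptions):

def read_label_file(path):
    # Returns one (is_interpolated, action_category, (x1, y1, x2, y2)) tuple per row.
    rows = []
    with open(path) as f:
        f.readline()                           # skip the "version1  #_frames" header row
        for line in f:
            parts = line.split()
            if len(parts) < 6:
                continue                       # skip blank or malformed lines
            flag, action = int(parts[0]), int(parts[1])
            x1, y1, x2, y2 = (float(v) for v in parts[2:6])
            rows.append((flag == 1, action, (x1, y1, x2, y2)))
    return rows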




Auxiliary code is available under the code folder (S0), which can be used for visualizing the video data (including RGB, depth, skeleton, and objects).

 


 

If you have any questions about this dataset, please feel free to drop an email to Gang Yu (iskicy@gmail.com).

 

 

Reference:


If you use the dataset or code, please cite our paper:


Gang Yu, Zicheng Liu, Junsong Yuan

Discriminative Orderlet Mining for Real-time Recognition of Human-Object Interaction

Asian Conference on Computer Vision (ACCV) 2014