Datasets

The workshop introduces two benchmarks specifically designed to assess continual semi-supervised learning on two important computer vision tasks: activity recognition and crowd counting. The challenges built upon these benchmarks, and their rules, are described later.

Continual Activity Recognition (CAR)

As a benchmark for the continual activity recognition challenge, we created a Continual Activity Recognition (CAR) dataset derived from a fraction of the MEVA (Multiview Extended Video with Activities) activity detection dataset (https://mevadata.org/). We selected a suitable set of 8 activity classes from the original list of 37, and annotated each frame of 15 video sequences, each composed of 3 clips from the original MEVA footage, with a single class label.

Our CAR benchmark comprises 15 sequences, broken down into three groups:

  • Five 15-minute-long sequences, each formed by three contiguous original videos

  • Five 15-minute-long sequences, each formed by three original videos separated by a short time interval (5-20 minutes)

  • Five 15-minute-long sequences, each formed by three original videos separated by a long time interval (hours or even days)

Each of these three evaluation settings is designed to simulate a different mix of continuous and discrete domain dynamics.

The ground truth for the CAR challenges (in the form of one activity label per frame) was created by us, after selecting a subset of 8 activity classes and revising the original annotations of the 45 video clips we selected for inclusion.
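Since the ground truth consists of a single activity label per frame, a natural way to score a submission is per-frame accuracy. The following is a minimal illustrative sketch; the label strings are hypothetical and this is not the official CAR evaluation code:

```python
# Illustrative sketch: per-frame accuracy for frame-level activity labels.
# The label names below are made up for the example, not CAR classes.

def frame_accuracy(gt_labels, pred_labels):
    """Fraction of frames whose predicted activity matches the ground truth."""
    assert len(gt_labels) == len(pred_labels), "one prediction per frame"
    correct = sum(g == p for g, p in zip(gt_labels, pred_labels))
    return correct / len(gt_labels)

gt = ["walking", "walking", "standing", "standing"]
pred = ["walking", "standing", "standing", "standing"]
print(frame_accuracy(gt, pred))  # → 0.75
```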

The dataset can be downloaded from the GitHub repository provided for the baseline:

https://github.com/salmank255/IJCAI-2021-Continual-Activity-Recognition-Challenge

Continual Crowd Counting (CCC)

Our CCC benchmark comprises 3 sequences, taken from existing crowd counting datasets:

  • A single 2,000-frame sequence from the Mall dataset

  • A single 2,000-frame sequence from the UCSD dataset

  • A 750-frame sequence from the Fudan-ShanghaiTech (FDST) dataset, composed of 5 clips portraying the same scene, each 150 frames long

The ground truth for the CCC challenges (in the form of a density map for each frame) was generated by us for all three datasets, following the annotation protocol described at https://github.com/svishwa/crowdcount-mcnn.
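Annotation protocols of this kind turn point annotations (head positions) into a density map by placing a normalized Gaussian kernel at each annotated point, so that the map integrates to the head count. Below is a minimal fixed-sigma sketch of that idea; the function name, `sigma` value, and truncation radius are illustrative assumptions, not the exact parameters used for the challenge ground truth:

```python
import numpy as np

def density_map(points, shape, sigma=4.0):
    """Build a crowd density map by adding a normalized, truncated
    Gaussian at each annotated head position (x, y). Because each
    kernel sums to 1, the map sums to the number of heads inside
    the frame. Fixed-sigma variant for illustration only."""
    h, w = shape
    dmap = np.zeros((h, w), dtype=np.float32)
    radius = int(3 * sigma)          # truncate the kernel at 3*sigma
    size = 2 * radius + 1
    ax = np.arange(size) - radius
    xx, yy = np.meshgrid(ax, ax)
    kernel = np.exp(-(xx**2 + yy**2) / (2 * sigma**2))
    kernel /= kernel.sum()           # normalize so each head adds 1
    for x, y in points:
        x, y = int(round(x)), int(round(y))
        if not (0 <= x < w and 0 <= y < h):
            continue                 # skip annotations outside the frame
        # clip the kernel window to the image bounds
        x0, x1 = max(0, x - radius), min(w, x + radius + 1)
        y0, y1 = max(0, y - radius), min(h, y + radius + 1)
        kx0, kx1 = x0 - (x - radius), size - ((x + radius + 1) - x1)
        ky0, ky1 = y0 - (y - radius), size - ((y + radius + 1) - y1)
        dmap[y0:y1, x0:x1] += kernel[ky0:ky1, kx0:kx1]
    return dmap

m = density_map([(30, 40), (100, 80)], (120, 160))
print(m.sum())  # ≈ 2.0: both heads lie fully inside the frame
```

The predicted crowd count for a frame is then simply the sum of the (estimated) density map, which is what makes this representation convenient for regression-based counting.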

The dataset can be downloaded from the GitHub repository provided for the baseline:

https://github.com/Ajmal70/IJCAI_2021_Continual_Crowd_Counting_Challenge