Cognitive science has shown that humans consistently segment videos into meaningful chunks. This segmentation happens naturally, without pre-defined categories and without people being explicitly asked to do so.
Here, we study the task of Generic Event Boundary Detection (GEBD), which aims to detect generic, taxonomy-free event boundaries that segment a whole video into chunks. Details can be found in our paper: https://arxiv.org/abs/2101.10511
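As a concrete illustration of the task output, the sketch below converts a set of detected boundary timestamps into the chunks they induce. The helper `boundaries_to_segments` and the timestamps are our own illustrative assumptions, not code or data from the paper's release:

```python
def boundaries_to_segments(boundaries, duration):
    """Convert boundary timestamps (seconds) into (start, end) chunks
    that cover the whole video of the given duration."""
    cuts = [0.0] + sorted(boundaries) + [duration]
    return [(cuts[i], cuts[i + 1]) for i in range(len(cuts) - 1)]

# Hypothetical boundaries for a 10-second long-jump clip:
# a shot cut, then action changes (Run -> Jump -> Stand up).
print(boundaries_to_segments([2.4, 5.1, 7.8], 10.0))
# -> [(0.0, 2.4), (2.4, 5.1), (5.1, 7.8), (7.8, 10.0)]
```

N boundaries always yield N+1 chunks; a video with no detected boundaries is a single chunk.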
Some example event boundaries are shown in the figure on the right.
Generic event boundaries are:
- immediately useful for applications such as video editing and summarization;
- a stepping stone towards long-form video modeling via reasoning over the temporal structure of segmented units.
We present more details of our dataset & annotation below; details of competition Track 1 and Track 2 are given on their separate webpages. More details and some visualization examples can be found in our white paper.
Notably, our Kinetics-GEBD has the largest number of boundaries (e.g., 32× ActivityNet, 8× EPIC-Kitchens-100); its boundaries are in-the-wild, open-vocabulary, cover generic event changes, and respect the diversity of human perception.
Examples of generic event boundaries: 1) a long jump segmented at a shot cut, then between the actions Run, Jump, and Stand Up (dominant subject circled in red); 2) a color/brightness change; 3) a new subject appearing.
Dataset & Annotation Overview
We repeat these cognitive experiments on the following mainstream computer-vision datasets, using our novel annotation guideline that addresses the complexities of taxonomy-free event boundary annotation.
- Kinetics-GEBD
Our Kinetics-GEBD Train Set contains 20K videos randomly selected from the Kinetics-400 Train Set. Our Kinetics-GEBD Test Set contains another 20K videos randomly selected from the Kinetics-400 Train Set. Our Kinetics-GEBD Val Set contains all 20K videos in the Kinetics-400 Val Set.
The Kinetics-400 Dataset can be downloaded from here.
The Kinetics-GEBD annotations (Train Set/Val Set) can be downloaded from here.
Video list for Kinetics-GEBD Test Set can be found here.
Note that some of the videos in the Kinetics-GEBD Train and Val Sets are no longer available, but all test videos were available as of March 2021.
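The annotations record the boundary timestamps marked by several annotators per video, reflecting the perception diversity noted above. A minimal loading sketch is below; the schema shown (a pickled dict from video id to per-annotator lists of boundary timestamps) is an assumption for illustration only — consult the released files for the actual keys and nesting:

```python
import os
import pickle
import tempfile

# Hypothetical schema, for illustration only:
# video id -> list of per-annotator boundary-timestamp lists (seconds).
toy_annotations = {
    "abc123": [[2.4, 5.1, 7.8], [2.5, 5.0]],  # two annotators
}

# Write a toy file so the sketch is self-contained; in practice you
# would open the downloaded annotation file instead.
path = os.path.join(tempfile.mkdtemp(), "kinetics_gebd_train.pkl")
with open(path, "wb") as f:
    pickle.dump(toy_annotations, f)

with open(path, "rb") as f:
    annotations = pickle.load(f)

for vid, per_annotator in annotations.items():
    print(vid, "boundaries per annotator:", [len(b) for b in per_annotator])
# -> abc123 boundaries per annotator: [3, 2]
```

Keeping each annotator's boundaries separate (rather than merging them) lets evaluation account for annotator disagreement.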
- HMDB
We provide event boundary annotations on HMDB, a human motion database. This dataset contains 6,849 clips divided into 51 action categories, each containing a minimum of 101 clips.
The HMDB Dataset can be downloaded here.
The HMDB-GEBD annotations can be downloaded here.
- ADL
We provide scene boundary annotations on ADL, a dataset of one million frames of dozens of people performing unscripted, everyday activities.
The ADL Dataset can be downloaded here.
The ADL-GEBD annotations can be downloaded here.
- UT-Ego
We provide scene boundary annotations on UT-Ego, a dataset of long-form egocentric videos captured from head-mounted cameras.
The UT-Ego Dataset can be downloaded here.
The UT-Ego-GEBD annotations can be downloaded here.
Communication & Q&A
For Challenge Policies: CodaLab Challenge Forum
For starter baseline code: Github