• Cognitive science has shown that humans consistently segment videos into meaningful chunks. This segmentation happens naturally, without pre-defined categories and without being explicitly asked to do so.

  • Here, we study the task of Generic Event Boundary Detection (GEBD), aiming at detecting generic, taxonomy-free event boundaries that segment a whole video into chunks. Details can be found in our paper: https://arxiv.org/abs/2101.10511

  • Some example event boundaries are shown in the right-hand figure.

  • Generic event boundaries are

    • immediately useful for applications like video editing and summarization

    • a stepping stone toward long-form video modeling via reasoning about the temporal structure of segmented units


  • We describe our dataset & annotation below; details of competition Track 1 and Track 2 are given on the corresponding separate webpages. More details & some visualization examples can be found in our white paper.

  • Notably, our Kinetics-GEBD has the largest number of boundaries (e.g. 32x that of ActivityNet, 8x that of EPIC-Kitchens-100); its boundaries are in-the-wild, open-vocabulary, cover generic event changes, and respect the diversity of human perception.

Examples of generic event boundaries: 1) a long jump is segmented at a shot cut, then between the actions Run, Jump, and Stand up (dominant subject in red circle); 2) the color/brightness changes; 3) a new subject appears.


Dataset & Annotation Overview

We repeat these cognitive experiments on the following mainstream CV datasets, using our novel annotation guideline, which addresses the complexities of taxonomy-free event boundary annotation.

  1. Kinetics-GEBD

    • Our Kinetics-GEBD Train Set contains 20K videos randomly selected from Kinetics-400 Train Set. Our Kinetics-GEBD Test Set contains another 20K videos randomly selected from Kinetics-400 Train Set. Kinetics-GEBD Val Set contains all 20K videos in Kinetics-400 Val Set.

    • The Kinetics-400 Dataset can be downloaded from here.

    • The Kinetics-GEBD annotations (Train Set/Val Set) can be downloaded from here.

    • Video list for Kinetics-GEBD Test Set can be found here.

    • Note that some of the videos in the Kinetics-GEBD Train Set and Val Set are no longer available, but all test videos were available as of March 2021.
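Since the task is defined as detecting boundaries that segment a whole video into chunks, the sketch below shows how per-video boundary timestamps can be turned into segments. The annotation format assumed here (a list of sorted boundary timestamps in seconds per video) is a simplification for illustration, not the schema of the released annotation files.

```python
# Sketch: converting GEBD-style boundary timestamps into video segments.
# NOTE: the input format (plain list of boundary times in seconds) is an
# assumption for illustration; consult the released annotations for the
# actual schema.

def boundaries_to_segments(boundaries, duration):
    """Split the interval [0, duration] into chunks at the given boundaries.

    Boundaries outside (0, duration) are ignored; the result is a list of
    (start, end) pairs covering the whole video.
    """
    cuts = [0.0] + sorted(t for t in boundaries if 0.0 < t < duration) + [duration]
    return list(zip(cuts[:-1], cuts[1:]))

# Example: three boundaries in a 10-second clip yield four chunks.
segments = boundaries_to_segments([2.5, 4.1, 7.0], duration=10.0)
print(segments)  # [(0.0, 2.5), (2.5, 4.1), (4.1, 7.0), (7.0, 10.0)]
```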


  2. HMDB

  • We provide event boundary annotations on HMDB, a human motion database. This dataset contains 6849 clips divided into 51 action categories, each containing a minimum of 101 clips.

  • The HMDB Dataset can be downloaded here.

  • The HMDB-GEBD annotations can be downloaded here.


  3. ADL

  • We provide scene boundary annotations on ADL, a dataset of one million frames of dozens of people performing unscripted, everyday activities.

  • The ADL Dataset can be downloaded here.

  • The ADL-GEBD annotations can be downloaded here.


  4. UT-Ego

  • We provide scene boundary annotations on UT-Ego, a dataset of long-form egocentric videos captured with head-mounted cameras.

  • The UT-Ego Dataset can be downloaded here.

  • The UT-Ego-GEBD annotations can be downloaded here.


Communication & QA