Although authors may use any datasets they choose in their submissions, the organizers are also providing three datasets to support research on generic video search and specific instance search. All of the provided datasets have been or are being used at the annual TRECVID (video retrieval evaluation) and VBS (video browser showdown) benchmarks.
The V3C1 dataset (drawn from the larger V3C video dataset) is composed of 7475 Vimeo videos (1.3 TB, 1000 h) with Creative Commons licenses and a mean duration of 8 min. All videos have some metadata available, e.g., title, keywords, and description, in JSON files. The dataset has been segmented into 1,082,657 short video segments according to the provided master shot boundary files. In addition, keyframes and thumbnails for each video segment have been extracted and are available.
The IACC.3 dataset comprises approximately 4600 Internet Archive videos (144 GB, 600 h) with Creative Commons licenses in MPEG-4/H.264 format, with durations ranging from 6.5 min to 9.5 min and a mean duration of almost 7.8 min. Most videos have some donor-provided metadata available, e.g., title, keywords, and description.
The BBC EastEnders dataset comprises approximately 244 video files (300 GB and 464 h in total) with associated metadata, each containing a week's worth of BBC EastEnders programs in MPEG-4/H.264 format.
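
For planning storage and processing budgets, the figures quoted above can be cross-checked with a little arithmetic. The following is only a minimal sketch: the dictionary layout and function names are illustrative assumptions, not part of any provided tooling, and the IACC.3 and BBC EastEnders counts are the approximate values stated in the descriptions.

```python
# Figures copied from the dataset descriptions above.
# Item counts for IACC.3 and BBC EastEnders are approximate in the source.
DATASETS = {
    "V3C1": {"items": 7475, "hours": 1000},
    "IACC.3": {"items": 4600, "hours": 600},
    "BBC EastEnders": {"items": 244, "hours": 464},
}

# Number of V3C1 segments from the master shot boundary files.
V3C1_SEGMENTS = 1_082_657


def mean_minutes(items: int, hours: float) -> float:
    """Mean duration per item in minutes, from total hours and item count."""
    return hours * 60 / items


def mean_segment_seconds(segments: int, hours: float) -> float:
    """Mean segment length in seconds, from total hours and segment count."""
    return hours * 3600 / segments


if __name__ == "__main__":
    for name, d in DATASETS.items():
        print(f"{name}: {mean_minutes(d['items'], d['hours']):.1f} min per item")
    print(f"V3C1: {mean_segment_seconds(V3C1_SEGMENTS, 1000):.1f} s per segment")
```

The computed means (about 8.0 min for V3C1 and 7.8 min for IACC.3) agree with the stated values; each BBC EastEnders file averages just under two hours, and a V3C1 segment averages a little over three seconds.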