Datasets

As one of the goals of the project, we are publicly releasing a novel collection of video data to the community for research purposes. The second release of the dataset, denoted as PREMIER Dataset v3 and described below, is freely available here. More information can be found within the download folder.

PREMIER Dataset v3

Structure

The PREMIER dataset consists of families of native and altered data. The collection is composed of different subsets, each one marked by the suffix -Nx (for native data) or -Ax (for altered data). This structure keeps the data consistent while leaving the flexibility needed to expand the collection and represent future technologies. Altered data include videos that underwent signal manipulations through well-known software, as well as sharing operations (even multiple times) through social media platforms.
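The naming convention above can be checked programmatically. The following is a minimal sketch, assuming subset names follow the form PREMIER-Nx or PREMIER-Ax with x a positive integer, as in the subset list on this page. The function name classify_subset and the regular expression are illustrative assumptions and are not part of the dataset's own code.

```python
import re
from typing import Tuple

# Assumed pattern: "PREMIER-" followed by N (native) or A (altered) and an index.
_SUBSET_RE = re.compile(r"^PREMIER-([NA])(\d+)$")


def classify_subset(name: str) -> Tuple[str, int]:
    """Return (family, index) for a subset name such as 'PREMIER-A3'.

    Hypothetical helper: it only encodes the -Nx / -Ax suffix convention.
    """
    match = _SUBSET_RE.match(name)
    if match is None:
        raise ValueError(f"not a PREMIER subset name: {name!r}")
    family = "native" if match.group(1) == "N" else "altered"
    return family, int(match.group(2))
```

For example, classify_subset("PREMIER-N2") returns the pair ("native", 2), while a name that does not match the convention raises ValueError.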

The contents of the PREMIER-Dataset folder are:

Dataset contains the image/video collection.
Statistics contains all available CSV files.
Code contains the scripts used for the dataset creation.
README.md
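As a starting point for exploring the download, the sketch below lists the CSV files found under the Statistics folder. It assumes the folder layout listed above and that the files use a lowercase .csv extension. The function name list_statistics_csv and the root path argument are hypothetical, and the contents and column names of the CSV files are not described on this page, so none are assumed here.

```python
from pathlib import Path


def list_statistics_csv(root):
    """Return the sorted CSV file paths under <root>/Statistics.

    `root` is the local path of the downloaded PREMIER-Dataset folder
    (an assumption; adjust to wherever the archive was extracted).
    """
    stats_dir = Path(root) / "Statistics"
    if not stats_dir.is_dir():
        raise FileNotFoundError(f"missing Statistics folder in {root!r}")
    # rglob also searches subfolders; only lowercase ".csv" names are matched.
    return sorted(stats_dir.rglob("*.csv"))
```

The returned paths can then be opened with the standard csv module or any tabular data library.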

PREMIER Subsets

Experimental

PREMIER-A1 (5600 videos) - includes videos edited via Avidemux, FFmpeg, Kdenlive, and Adobe Premiere. Furthermore, each edited video is also shared through YouTube, Facebook, Weibo, and TikTok.
PREMIER-A2 (160 videos) - includes videos shared once and twice through Facebook and YouTube.
PREMIER-A3 (400 videos) - includes videos shared through Facebook, Instagram, Telegram, Twitter, and YouTube.
PREMIER-A4 (558 videos) - includes native and shared videos edited with FFmpeg.
PREMIER-A5 (1220 videos) - includes real and GAN-synthesized street videos in three compression levels (RAW, HQ, LQ).
PREMIER-A6 (2400 images) - includes synthetic images of human faces and military vehicles generated through several deep learning approaches.
PREMIER-A7 (79120 tracks) - includes audio tracks generated using 12 different Text-to-Speech (TTS) algorithms.
PREMIER-N1 (26 videos, 987 images) - includes native videos of flat/indoor/outdoor scenery and flat/natural images at original resolution.
PREMIER-N2 (58 videos, 352 images) - includes native images of flat and natural scenes; RAW and HEIC formats are also included when available. The video collection contains flat, indoor, and outdoor scenery with and without movement. Some additional videos are provided to evaluate the H.265/HEVC codec and other non-default resolutions.
PREMIER-N3 (1831 videos, 6637 images) - includes native images of flat and natural scenes; H.264/JPEG and H.265/HEIC formats are also included when available. The video collection contains outdoor scenery with and without movement.

License

Copyright © CSP Lab (Communications & Signal Processing Laboratory), Dept. of Information Engineering, University of Florence, 2024. This work is licensed under the Creative Commons Attribution-ShareAlike 4.0 International License. To view a copy of this license, visit http://creativecommons.org/licenses/by-sa/4.0/ or send a letter to Creative Commons, PO Box 1866, Mountain View, CA 94042, USA.

Acknowledgment

The collection of the PREMIER Dataset was supported in part by the Italian Ministry of Education, Universities and Research (MIUR) under Grant 2017Z595XS, and in part by DARPA under Grant FA8750-16-2-0188.
