PMEmo: A Dataset For Music Emotion Computing
The original PMEmo dataset [1] comprises 794 music chorus clips from several western popular music genres, with continuous emotional annotations at a 2 Hz temporal resolution, static (clip-level) valence and arousal annotations ranging from -1 to 1, and simultaneously recorded electrodermal activity (EDA) signals.
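For concreteness, the minimal Python sketch below shows how such static and continuous annotations might be loaded; the file names and column headers (static_annotations.csv, dynamic_annotations.csv, valence_mean, frameTime, etc.) are illustrative assumptions rather than the dataset's actual schema.

```python
import pandas as pd

# Hypothetical file layout; the actual PMEmo release may use different
# file names and column headers.
static = pd.read_csv("static_annotations.csv")    # one row per clip
dynamic = pd.read_csv("dynamic_annotations.csv")  # one row per 0.5 s frame (2 Hz)

# Static labels: clip-level valence/arousal, assumed here to lie in [-1, 1].
clip_labels = static.loc[static["musicId"] == 1, ["valence_mean", "arousal_mean"]]
print(clip_labels)

# Continuous labels: 2 Hz annotations indexed by a frame time in seconds.
frames = dynamic[dynamic["musicId"] == 1].sort_values("frameTime")
print(frames[["frameTime", "valence", "arousal"]].head())
```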
We extended the PMEmo dataset [1] with the corresponding official music videos collected from YouTube channels. We used official music videos because their music and visual content are tightly synchronized and composed to convey the same emotions as the original songs.
The resulting database, the PMEmo Expansion dataset (PMEmoExp), contains 521 music video clips with an average length of 38 seconds; the remaining 273 songs were discarded because they have no official music videos.
The training set contains 453 excerpts (total duration of 4 hours, 42 minutes, and 56 seconds), and the test set contains 68 excerpts (total duration of 53 minutes and 37 seconds).
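As a sketch of how the reported partition sizes could be reproduced, the snippet below draws a 453/68 split over 521 placeholder clip identifiers with scikit-learn; the clip IDs and random seed are assumptions, and the actual assignment of clips to the two sets may differ.

```python
from sklearn.model_selection import train_test_split

# Hypothetical list of the 521 PMEmoExp clip identifiers.
clip_ids = list(range(1, 522))

# Reproduce the reported 453/68 partition sizes with a fixed seed;
# the concrete clip-to-split assignment here is only illustrative.
train_ids, test_ids = train_test_split(clip_ids, test_size=68, random_state=0)
print(len(train_ids), len(test_ids))  # 453 68
```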
[1] Kejun Zhang, Hui Zhang, Simeng Li, Changyuan Yang, and Lingyun Sun. The PMEmo Dataset for Music Emotion Recognition. In Proceedings of the 2018 ACM on International Conference on Multimedia Retrieval (ICMR '18).