Dataset
We provide a comic scenes dataset composed of images from the public COMICS dataset [1]. The COMICS dataset includes over 1.2 million scenes (120 GB) paired with automatic textbox transcriptions. The transcriptions were produced by Google Vision OCR and contain recognition errors.
Ground-truth annotations will be provided for a subset extracted from the COMICS dataset. Each scene in this subset can be annotated with multiple sentiment classes.
There are 8 emotion classes:
0=Angry, 1=Disgust, 2=Fear, 3=Happy, 4=Sad, 5=Surprise, 6=Neutral, 7=Others.
Annotation format
Image_id, label1, label2, label3, label4, label5, label6, label7, label8
SenRegCom0001, 0, 1, 0, 1, 0, 1, 0, 1
SenRegCom0002, 0, 1, 0, 1, 0, 1, 1, 0
etc.
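The annotation rows above can be read with a few lines of Python. This is a minimal sketch, not an official loader. It assumes that the columns label1 to label8 correspond, in order, to classes 0 to 7 (Angry to Others) and that a value of 1 means the class is present; the page does not state this mapping explicitly. The names EMOTIONS and parse_annotations are hypothetical.

```python
import csv
import io

# ASSUMPTION: label1..label8 map in order to classes 0..7 listed above.
EMOTIONS = ["Angry", "Disgust", "Fear", "Happy", "Sad", "Surprise", "Neutral", "Others"]


def parse_annotations(text):
    """Return {image_id: [names of emotions flagged with 1]}."""
    reader = csv.reader(io.StringIO(text), skipinitialspace=True)
    next(reader)  # skip the header row (Image_id, label1, ...)
    result = {}
    for row in reader:
        if len(row) != 1 + len(EMOTIONS):
            continue  # skip blank or malformed rows
        image_id = row[0].strip()
        flags = [int(value) for value in row[1:]]
        result[image_id] = [name for name, flag in zip(EMOTIONS, flags) if flag == 1]
    return result


# The two example rows from the annotation format above.
sample = """Image_id, label1, label2, label3, label4, label5, label6, label7, label8
SenRegCom0001, 0, 1, 0, 1, 0, 1, 0, 1
SenRegCom0002, 0, 1, 0, 1, 0, 1, 1, 0
"""

annotations = parse_annotations(sample)
for image_id, emotions in annotations.items():
    print(image_id, emotions)
```

Under the assumed column order, SenRegCom0001 would carry the labels Disgust, Happy, Surprise and Others. If the released files use a different column order or delimiter, only EMOTIONS and the reader options need to change.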
Download links
Please check our CodaLab page.
References
[1] Iyyer, Mohit, et al. "The amazing mysteries of the gutter: Drawing inferences between panels in comic book narratives." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2017.