In the past few years, affective computing has become a prominent area in Natural Language Processing. While early studies focused on predicting the semantic polarity (sentiment analysis) of a text (Basile et al., 2014), the rapid progress of machine learning has moved research towards more fine-grained modelling of affective states. To give just a few examples from recent EVALITA history: aspect-based sentiment analysis, which aims to detect polarity classes in relation to multiple aspects of an opinion item (Basile et al., 2018); automatic classification of specific communicative intentions, such as irony and sarcasm (Cignarella et al., 2018); and stance detection, which aims to determine whether a writer is in favour of or against certain topics of interest (Cignarella et al., 2020).
So far, emotion analysis of Italian texts has not received the same amount of attention as sentiment analysis. Following the influential taxonomy of affective states by Scherer (2000), we define emotion as a “relatively brief episode of response to the evaluation of an external or internal event as being of major significance”, whereas sentiment corresponds to Scherer’s “attitude”: “relatively enduring, affectively colored beliefs, preferences, and predispositions toward objects or persons”.
What is true of emotion analysis is even truer of dimensional emotion analysis, which is based on dimensional models of emotions rather than categorical ones. While the latter describe affective states with limited sets of discrete emotions, such as those defined by Ekman or Plutchik, the former use continuous scales along three independent dimensions: Valence (degree of pleasure), Arousal (degree of excitement) and Dominance (degree of control over the situation).
The Valence-Arousal-Dominance (VAD) model has received increasing attention in NLP tasks for English. Nevertheless, while a limited number of Italian corpora annotated with categorical models are available, to the best of our knowledge no VAD-annotated dataset exists for the Italian language.
With EmoITA we propose the first corpus of Italian sentences annotated with their VAD values. EmoITA is the Italian version of EMOBANK (Buechel and Hahn, 2017), the largest English VAD dataset to date. Sentences from EMOBANK have been translated into Italian and revised by a group of Italian researchers affiliated with Urban/Eco and the Humanities Department of the University of Catania. The same researchers also annotated each sentence with new VAD values.
With the proposal of these tasks, we intend to promote dimensional emotion analysis for the Italian language and to encourage a comparison with its English counterpart. The multidimensional subtasks, moreover, are meant to assess whether two of the three dimensions of the model are correlated, a possibility that has been discussed but never demonstrated in the literature.
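As a purely illustrative sketch of how a correlation between two VAD dimensions could be measured, the snippet below computes the Pearson coefficient between two lists of per-sentence scores. The choice of Pearson's r, the function name `pearson`, and the toy scores are all assumptions made for illustration; they are not taken from EmoITA or from the task definition.

```python
from math import sqrt
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equally long score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of products of deviations (unnormalised covariance).
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations (unnormalised variances).
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / sqrt(var_x * var_y)


# Hypothetical toy scores for four sentences, NOT taken from EmoITA.
valence = [3.0, 2.5, 4.0, 1.5]
dominance = [3.1, 2.8, 3.6, 2.0]
r_vd = pearson(valence, dominance)
print(f"Valence-Dominance correlation on toy data: {r_vd:.3f}")
```

A coefficient close to 1 or -1 on real annotations would suggest that the two dimensions are not independent, while a value near 0 would be compatible with independence (at least in the linear sense).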
P. Basile, V. Basile, D. Croce, M. Polignano, Overview of the EVALITA 2018 Aspect-based Sentiment Analysis Task (ABSITA), in EVALITA@CLiC-it, 2018.
V. Basile, A. Bolioli, M. Nissim, V. Patti, P. Rosso, Overview of the Evalita 2014 SENTIment POLarity Classification Task, in Proceedings of EVALITA 2014, Pisa, Pisa University Press, 2014, pp. 50–57.
S. Buechel, U. Hahn, EmoBank: Studying the Impact of Annotation Perspective and Representation Format on Dimensional Emotion Analysis, in M. Lapata, P. Blunsom, A. Koller (Eds), Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers, Association for Computational Linguistics, 2017, pp. 578–585.
A.T. Cignarella, S. Frenda, V. Basile, C. Bosco, V. Patti, P. Rosso, Overview of the EVALITA 2018 Task on Irony Detection in Italian Tweets (IronITA), in EVALITA@CLiC-it, 2018.
A.T. Cignarella, M. Lai, C. Bosco, V. Patti, P. Rosso, SardiStance @ EVALITA2020: Overview of the Task on Stance Detection in Italian Tweets, in EVALITA, 2020.
K.R. Scherer, Psychological models of emotion, in J.C. Borod (Ed), The neuropsychology of emotion, Oxford, Oxford University Press, 2000, pp. 137–162.