Task Description
People's opinions are valuable for both individuals and organizations, whether they're public or private. Studies on Portuguese usually analyze sentiment at the document level, and methods or datasets designed specifically for Aspect-Based Sentiment Analysis (ABSA) in Portuguese remain hard to find.
To address this gap, we propose an Aspect-Based Sentiment Analysis task for TripAdvisor reviews written in Portuguese. Two sub-tasks will be available: Aspect Term Extraction and Sentiment Orientation Extraction. The first sub-task focuses on identifying the aspects discussed within the reviews, while the second aims to determine the sentiment (positive, negative, or neutral) expressed toward each mentioned aspect.
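As a rough illustration of what the two sub-tasks produce, the sketch below represents one review with its aspect terms and their polarities. The review sentence, the `AspectAnnotation` class, the `annotate` helper, and the use of character offsets are all assumptions made for this example; they are not the official data format of the task.

```python
from dataclasses import dataclass

# The three sentiment orientations named in the task description.
POLARITIES = ("positive", "negative", "neutral")


@dataclass(frozen=True)
class AspectAnnotation:
    """Hypothetical record: one aspect term and the sentiment toward it."""
    term: str
    start: int  # character offset where the term begins in the review
    end: int    # character offset just past the end of the term
    polarity: str


def annotate(review: str, term: str, polarity: str) -> AspectAnnotation:
    """Build an annotation for the first occurrence of `term` in `review`."""
    if polarity not in POLARITIES:
        raise ValueError(f"unknown polarity: {polarity}")
    start = review.find(term)
    if start == -1:
        raise ValueError(f"term not found in review: {term}")
    return AspectAnnotation(term, start, start + len(term), polarity)


# Invented review, for illustration only (not taken from the corpora).
review = "O quarto era limpo, mas o café da manhã foi fraco."

annotations = [
    annotate(review, "quarto", "positive"),
    annotate(review, "café da manhã", "negative"),
]

# Sub-task 1 (Aspect Term Extraction): which aspects are discussed.
aspect_terms = [a.term for a in annotations]

# Sub-task 2 (Sentiment Orientation Extraction): polarity per aspect.
polarity_by_term = {a.term: a.polarity for a in annotations}
```

Here `aspect_terms` would be the expected output of the first sub-task, and `polarity_by_term` the expected output of the second, given those aspects.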
Because Portuguese text collections are scarce, this project has the potential to significantly improve our understanding of how opinions are expressed in Portuguese. We draw inspiration from similar competitions for other languages, such as SemEval [3, 4, 5] and EVALITA [6]. We also hope that this initiative will help researchers and developers create more effective tools for analyzing opinions in Portuguese.
Corpora
The dataset consists of traveler reviews of accommodation service companies, written in Portuguese. For this task, we use datasets previously developed by Freitas [1] and Corrêa [2]. Freitas' corpus is publicly accessible and will be used exclusively for training. Corrêa's corpus, however, is private and will be divided into training and test sets. The complete dataset will be made available after the event. Both datasets were annotated using the same guidelines [1]. The annotated aspects correspond to concepts defined in the Accommodation Services Domain Ontology, HOntology [7].
Evaluation Metrics
Each participating team will receive training and test datasets that have been manually annotated. Submissions for the test dataset will be evaluated with several metrics, including Accuracy, Precision, Recall, F1-Score, and Balanced Accuracy (Bacc). Submissions will be ranked by Accuracy for Task 1 and by Bacc for Task 2.
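The metrics above can be sketched in a few lines of plain Python. This is a minimal illustration, not the official scorer: it assumes one gold label and one predicted label per item, macro-averages Precision, Recall, and F1 over classes (the description does not say which averaging is used), and computes Balanced Accuracy as the mean per-class recall. The function names and the toy labels are invented for the example.

```python
def per_class_counts(gold, pred):
    """Return {label: (tp, fp, fn)} for every label seen in gold or pred."""
    counts = {}
    for label in sorted(set(gold) | set(pred)):
        tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
        fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
        fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
        counts[label] = (tp, fp, fn)
    return counts


def accuracy(gold, pred):
    """Fraction of items whose predicted label equals the gold label."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def balanced_accuracy(gold, pred):
    """Mean recall over the classes that occur in the gold labels."""
    recalls = [
        tp / (tp + fn)
        for tp, _fp, fn in per_class_counts(gold, pred).values()
        if tp + fn > 0
    ]
    return sum(recalls) / len(recalls)


def macro_precision_recall_f1(gold, pred):
    """Macro-averaged precision, recall, and F1 (assumed averaging scheme).

    A class with a zero denominator contributes 0 to the average.
    """
    precisions, recalls, f1s = [], [], []
    for tp, fp, fn in per_class_counts(gold, pred).values():
        p = tp / (tp + fp) if tp + fp > 0 else 0.0
        r = tp / (tp + fn) if tp + fn > 0 else 0.0
        f = 2 * p * r / (p + r) if p + r > 0 else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)
    n = len(f1s)
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


# Toy polarity labels, invented for illustration (imbalanced on purpose).
gold = ["positive", "positive", "positive", "negative", "neutral", "positive"]
pred = ["positive", "positive", "negative", "negative", "positive", "positive"]

acc = accuracy(gold, pred)            # 4 of 6 labels match
bacc = balanced_accuracy(gold, pred)  # mean of recalls 0.75, 1.0, 0.0
macro_p, macro_r, macro_f1 = macro_precision_recall_f1(gold, pred)
```

On this toy data Accuracy is about 0.67 while Bacc is about 0.58, because the missed neutral item counts as a whole class in Bacc. That sensitivity to minority classes is a plausible reason to rank a polarity task with imbalanced labels by Bacc.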
Target Audience
The intended audience includes anyone interested in Aspect-Based Sentiment Analysis. We hope for substantial engagement from academics, researchers, students, industrial teams, and practitioners from private companies.
References
[1] L. A. de Freitas. Feature-level sentiment analysis applied to Brazilian Portuguese reviews. PhD thesis, Pontifícia Universidade Católica do Rio Grande do Sul (2015).
[2] U. B. Corrêa. Análise de sentimento baseada em aspectos usando aprendizado profundo: uma proposta aplicada à língua portuguesa. PhD thesis, Universidade Federal de Pelotas (2021).
[3] M. Pontiki, D. Galanis, J. Pavlopoulos, H. Papageorgiou, I. Androutsopoulos, and S. Manandhar. SemEval-2014 Task 4: Aspect Based Sentiment Analysis. In Proceedings of the 8th International Workshop on Semantic Evaluations (SemEval-2014), pages 27–35, Dublin, Ireland, 2014. Association for Computational Linguistics.
[4] M. Pontiki, D. Galanis, H. Papageorgiou, S. Manandhar, and I. Androutsopoulos. SemEval-2015 Task 12: Aspect Based Sentiment Analysis. In Proceedings of the 9th International Workshop on Semantic Evaluations (SemEval-2015), pages 486–495, Denver, Colorado, USA, 2015. Association for Computational Linguistics.
[5] M. Pontiki, D. Galanis, H. Papageorgiou, I. Androutsopoulos, S. Manandhar, M. AL-Smadi, M. Al-Ayyoub, Y. Zhao, B. Qin, O. De Clercq, V. Hoste, M. Apidianaki, X. Tannier, N. Loukachevitch, E. Kotelnikov, N. Bel, S. M. Jiménez-Zafra, and G. Eryigit. SemEval-2016 Task 5: Aspect Based Sentiment Analysis. In Proceedings of the 10th International Workshop on Semantic Evaluations (SemEval-2016), pages 19–30, San Diego, California, USA, 2016. Association for Computational Linguistics.
[6] L. De Mattei, G. De Martino, A. Lovine, A. Miaschi, M. Polignano, and G. Rambelli. Overview of the Aspect Term Extraction and Aspect-based Sentiment Analysis Task. In Proceedings of the 7th Evaluation Campaign of Natural Language Processing and Speech tools for Italian (EVALITA 2020), Online. CEUR.org.
[7] M. S. Chaves, L. A. de Freitas, and R. Vieira. HOntology: a multilingual ontology for the accommodation sector in the tourism industry. In Proceedings of the 4th International Conference on Knowledge Engineering and Ontology Development (KEOD 2012), pages 149–154, Barcelona, Spain, 2012.