Sentiment Analysis

Sentiment Analysis in Low-resource Language

Summary: Sentiment analysis is often difficult in low-resource languages, that is, languages with limited labeled data or tools. This project investigates methods to improve sentiment classification in Bengali, a low-resource language, by leveraging resources from English. It includes diverse tasks such as developing sentiment lexicons, annotating corpora, and integrating English resources, such as labeled data and tools, through machine translation-based language mapping.

Self-Supervised Sentiment Classification

Summary: This project aims to improve sentiment classification from unlabeled data using a self-supervised hybrid approach. It combines an ML classifier with a lexicon-based method that predicts the semantic orientation of each review along with a confidence score. Using these confidence scores, we generate precise pseudo-labels and integrate a supervised ML algorithm to improve sentiment classification, particularly for nuanced reviews. The proposed approach shows significant improvements in macro F1 scores for both binary and 3-class sentiment classification. This indicates that the approach can improve sentiment classification performance in data-scarce domains.
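The pseudo-labeling idea can be illustrated with a minimal sketch. It is not the published implementation: the toy lexicon, the confidence formula, the 0.8 threshold, the Naive Bayes classifier, and all function names are assumptions chosen for illustration. A lexicon-based step labels the reviews it is confident about, and a supervised classifier trained on those pseudo-labels then handles the rest.

```python
import math
from collections import Counter

# Hypothetical toy lexicon; a real system would use a full sentiment lexicon.
LEXICON = {"good": 1.0, "great": 1.0, "love": 1.0,
           "bad": -1.0, "awful": -1.0, "hate": -1.0}

def lexicon_predict(text):
    """Return (label, confidence) from word-level polarity scores."""
    scores = [LEXICON[w] for w in text.lower().split() if w in LEXICON]
    total = sum(scores)
    if not scores or total == 0:
        return None, 0.0
    label = "pos" if total > 0 else "neg"
    # Assumed confidence: agreement among the matched lexicon words.
    return label, abs(total) / len(scores)

def pseudo_label(reviews, threshold=0.8):
    """Split reviews into confident pseudo-labeled pairs and the remainder."""
    labeled, remaining = [], []
    for text in reviews:
        label, confidence = lexicon_predict(text)
        if label is not None and confidence >= threshold:
            labeled.append((text, label))
        else:
            remaining.append(text)
    return labeled, remaining

def train_naive_bayes(labeled):
    """Collect the word and document counts needed by multinomial Naive Bayes."""
    word_counts = {"pos": Counter(), "neg": Counter()}
    doc_counts = Counter()
    for text, label in labeled:
        doc_counts[label] += 1
        word_counts[label].update(text.lower().split())
    vocab = set(word_counts["pos"]) | set(word_counts["neg"])
    return word_counts, doc_counts, vocab

def nb_predict(model, text):
    """Predict a label with Laplace-smoothed log probabilities."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, -math.inf
    for label in ("pos", "neg"):
        score = math.log((doc_counts[label] + 1) / (total_docs + 2))
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in text.lower().split():
            if word in vocab:
                score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

reviews = [
    "great fresh food and good friendly staff",
    "awful cold food and bad rude staff",
    "love the great fresh pizza",
    "hate the awful cold pizza",
    "fresh pizza and friendly staff",  # no lexicon word: left to the classifier
    "cold pizza and rude staff",       # no lexicon word: left to the classifier
]

labeled, remaining = pseudo_label(reviews)
model = train_naive_bayes(labeled)
predictions = {text: nb_predict(model, text) for text in remaining}
```

In this toy run the last two reviews contain no lexicon words, so the lexicon step cannot label them. The classifier can, because words such as "fresh" and "cold" co-occur with confidently labeled reviews in the pseudo-labeled set.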

Analyzing Neutral Review

Summary: This project focuses on understanding the characteristics of neutral/mixed review texts. It reveals challenges associated with them, such as rating inconsistencies for neutral reviews. Additionally, it examines the limitations of lexicon-based methods for accurately determining neutral reviews. These challenges arise from user preferences for specific aspects, sentiment lexicon coverage, irregularities in aggregation rules, and the context-sensitive nature of word-level polarity. Moreover, the project aims to distinguish linguistic signals in both neutral and strongly opinionated texts, leveraging them to enhance the performance of lexicon-based methods.
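A small sketch shows how two of these limitations can arise in a lexicon-based method. The lexicon entries, the mean-based aggregation rule, the neutral band of 0.5, and the function names are all hypothetical and are not taken from the project.

```python
# Hypothetical word-level polarity scores.
LEXICON = {"good": 1.0, "excellent": 2.0, "tasty": 1.0,
           "bad": -1.0, "terrible": -2.0, "slow": -1.0}

def polarity_scores(text):
    """Look up the polarity of every word that the lexicon covers."""
    return [LEXICON[w] for w in text.lower().split() if w in LEXICON]

def classify(text, neutral_band=0.5):
    """Aggregate word scores by their mean and map the result to a class."""
    scores = polarity_scores(text)
    if not scores:
        # Coverage gap: with no matched words the method falls back to neutral,
        # even if the review is in fact opinionated.
        return "neutral"
    mean = sum(scores) / len(scores)
    if abs(mean) <= neutral_band:
        return "neutral"
    return "positive" if mean > 0 else "negative"

examples = [
    "tasty food but slow service",         # mixed aspects cancel out
    "excellent food",                      # clearly positive
    "terrible and slow",                   # clearly negative
    "the portions were tiny and pricey",   # opinionated, but no lexicon coverage
]
results = {text: classify(text) for text in examples}
```

The first example is labeled neutral because opposing aspect-level opinions cancel under mean aggregation. The last is labeled neutral only because none of its words appear in the lexicon. A different aggregation rule or neutral band would change both outcomes.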

Relevant Publications (Sentiment Analysis in Low-resource Language):

[J1] Sazzed, S., BengSentiLex and BengSwearLex: creating lexicons for sentiment analysis and profanity detection in low-resource Bengali language, In PeerJ Computer Science, 2022.

Impact Factor: 3.8 

[C4] Sazzed, S., Cross-lingual sentiment classification in low-resource Bengali language, In WNUT@Empirical Methods in Natural Language Processing (EMNLP), 2021.

[C3] Sazzed, S., Development of sentiment lexicon in Bengali utilizing corpus and cross-lingual resources, In Information Reuse and Integration for Data Science (IRI), 2020.

[C2] Sazzed, S., A sentiment classification in Bengali and machine translated English corpus, In Information Reuse and Integration for Data Science (IRI), 2021.

[C1] Sazzed, S., Improving sentiment classification in low-resource Bengali language utilizing cross-lingual self-supervised learning, In International Conference on Natural Language & Information Systems (NLDB), 2021.

Relevant Publications (Self-Supervised Sentiment Classification):

[J1] Sazzed, S. & Jayarathna, S., SSentiA: A Self-supervised Sentiment Analyzer for Classification from Unlabeled Data, In Machine Learning with Applications, 2022.

Impact factor: new journal

[C2] Sazzed, S., A Hybrid Approach of Opinion Mining and Comparative Linguistic Analysis of Restaurant Reviews, In Recent Advances in Natural Language Processing (RANLP), 2021.

Relevant Publications (Analyzing Neutral Review):

[C2] Sazzed, S., Identifying neutral reviews from unlabeled data: An exploratory study on user ratings and word-level polarity scores, In ACM Conference on Hypertext and Social Media (ACM HT), 2022.

[C1] Sazzed, S., Understanding Linguistic Variations in Neutral and Strongly Opinionated Reviews, In International Conference on Machine Learning and Applications (ICMLA), 2022.