Emoji Analysis:

Study on the Influence of Emoji Usages in Twitter Sentiment Analysis

Introduction

Existing Twitter sentiment analyses largely ignore the important role emojis play in expressing sentiment in natural language. In contemporary online communication, sentiment can easily be misinterpreted in the absence of emojis and/or emoticons.

For example, the following two sentences clearly express two different emotions.

“I’m having a good day. 😘 ”

“I’m having a good day. 😑 ”

In this research project, we aim to explore how much emojis influence the sentiment of the original text by comparing the predictions of different machine learning algorithms with and without emoji features.

Dataset and Labeling

  • Public Twitter dataset
  • Emojitracker: realtime emoji use on twitter <http://emojitracker.com/>
  • Emojipedia <https://emojipedia.org/>

Data Labeling

We have two sets of tweets to label. The first is a set of general tweets randomly selected from all tweets posted in June, and the second is a set of Trump-related tweets.

At first, we decided that all four members would label the same set of data based on our own interpretations of the sentiment of each tweet. We would give scores of "-1", "0", and "1", which stand for negative, neutral, and positive respectively. The final label would be the one with the most votes.
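The majority-vote rule can be sketched as follows (a minimal sketch; the function name is ours, and a tie returns None so the tweet can be discarded, matching how we handled disagreements):

```python
from collections import Counter

def majority_label(scores):
    """Consensus label from annotator scores (-1, 0, 1),
    or None when the top vote is tied (no consensus)."""
    counts = Counter(scores).most_common()
    (top, top_n), rest = counts[0], counts[1:]
    if rest and rest[0][1] == top_n:
        return None  # tie between the two most common labels
    return top

print(majority_label([1, 1, 0, -1]))   # consensus: 1
print(majority_label([1, 1, -1, -1]))  # tie: None
```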

For example, "Yhuuu a banking notification woke me up ❤️😊" is a randomly selected tweet. The "heart" and "smile face" indicate that the user is in a positive mood. Thus, I will give a score of "1". "OMG I NEED TO MEET IT😭😭" is another tweet that shows the negative sentiment, so I will label it as "-1".

However, this methodology has several problems. One problem is that there are many meaningless tweets, in the sense that they carry no strong sentiment polarity. "➖ Herbology➖🏰 St. Vladimir Academy📅 Friday⏰ Current Time👨 @Sir_AWhitmore / @xxnmj00🆔 M-26125" is an example where it is very hard for us to determine the sentiment polarity of the tweet.

The other problem is that our feelings are subjective, and we often ended up with two positive and two negative labels for a single tweet, which makes reaching consensus difficult. In "I've already met these two butt faces @Frrreyaa and @RedHorseRX and it was amazing ☹️💕 soon I'll hug @oKouhai and @AlluraWonders 😭❤️", whether the user is in a positive or a negative mood depends heavily on how individuals interpret the tweet. Two of the four members gave a positive label and the other two gave a negative label. We had no choice but to discard the tweet, which slowed down our labeling process significantly.

To solve these problems, we decided to label only Trump-related tweets, because he is a celebrity everyone talks about and people usually hold strong opinions about him.

Because labeling tweets manually is time-consuming work, we decided that each of us would label 300 different tweets for efficiency. As a future improvement, we would recruit more native speakers to increase the quality of the labeling.

Design Choices and Parameter Selection

Our team decided to use the classic bag-of-words representation to classify the sentiment of tweets. In a pilot study, we experimented with word lists used in prior studies as the feature space. However, since we decided to study tweets about President Trump, which is a very specific topic, it is more reasonable to focus on the words and emojis that actually occur in these Trump-related tweets.
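A minimal sketch of building the feature space directly from the Trump-related tweets, counting plain words and emojis separately — the tokenizer regex, function names, and size cutoffs below are our illustrative assumptions, not the exact implementation:

```python
import re
from collections import Counter

# Word-or-emoji tokenizer: \w+ picks up plain words; the second
# alternative grabs remaining single symbols (a rough stand-in
# for emoji detection).
TOKEN_RE = re.compile(r"\w+|[^\w\s,.!?@#:/]", re.UNICODE)

def tokenize(tweet):
    return TOKEN_RE.findall(tweet.lower())

def build_vocabulary(tweets, top_words=1000, top_emoji=100):
    """Build a bag-of-words feature space from the corpus itself,
    keeping the most frequent words and emojis separately."""
    words, emoji = Counter(), Counter()
    for t in tweets:
        for tok in tokenize(t):
            (words if tok.isalnum() else emoji)[tok] += 1
    vocab = [w for w, _ in words.most_common(top_words)]
    vocab += [e for e, _ in emoji.most_common(top_emoji)]
    return {tok: i for i, tok in enumerate(vocab)}

def vectorize(tweet, vocab):
    """Map one tweet to its count vector over the vocabulary."""
    vec = [0] * len(vocab)
    for tok in tokenize(tweet):
        if tok in vocab:
            vec[vocab[tok]] += 1
    return vec
```

Each labeled tweet then becomes one count vector, and the word/emoji cutoffs control the dictionary sizes compared in the results below.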

We implemented four different classifiers to learn and predict the sentiment label: a Naïve Bayes model, a decision tree, a multi-layer perceptron, and an SVM with an RBF kernel. Parameters were selected by choosing those with the highest cross-validation accuracy during training.
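The model and parameter selection can be sketched with scikit-learn's grid search; the toy count data and the parameter grids below are illustrative assumptions, not the exact settings we used:

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.naive_bayes import MultinomialNB
from sklearn.tree import DecisionTreeClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.integers(0, 3, size=(60, 20))  # stand-in bag-of-words counts
y = rng.integers(0, 2, size=60)        # stand-in sentiment labels

# The four classifiers, each with an illustrative parameter grid.
models = {
    "naive_bayes": (MultinomialNB(), {"alpha": [0.1, 1.0]}),
    "decision_tree": (DecisionTreeClassifier(random_state=0),
                      {"max_depth": [3, 10]}),
    "mlp": (MLPClassifier(max_iter=500, random_state=0),
            {"hidden_layer_sizes": [(20,), (50,)]}),
    "svm_rbf": (SVC(kernel="rbf"), {"C": [1, 10], "gamma": ["scale"]}),
}

results = {}
for name, (clf, grid) in models.items():
    search = GridSearchCV(clf, grid, cv=3)  # best params by CV accuracy
    search.fit(X, y)
    results[name] = (search.best_params_, search.best_score_)
    print(name, results[name])
```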

Accuracy Results

Here we provide the test results using different combinations of vocabulary sizes for plain words and emojis.

The testing accuracies of all four classifiers are scattered around 50%, and there is no clear sign of improvement when emojis are introduced into the dictionary. In addition, as we increased the number of plain words significantly, both the Naïve Bayes and decision tree models suffered from overfitting: their training accuracy reached up to 90% while their testing accuracy did not improve accordingly.
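The with/without-emoji comparison can be sketched as below; the synthetic count matrices stand in for our real bag-of-words features, and the decision tree is just one of the four classifiers:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
X_words = rng.integers(0, 3, size=(200, 50))  # plain-word counts
X_emoji = rng.integers(0, 2, size=(200, 10))  # emoji counts
y = rng.integers(0, 2, size=200)              # sentiment labels

def held_out_accuracy(X, y):
    """Train on one split, report accuracy on the held-out split."""
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
    clf = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr)
    return clf.score(X_te, y_te)

acc_words = held_out_accuracy(X_words, y)
acc_both = held_out_accuracy(np.hstack([X_words, X_emoji]), y)
print(f"words only: {acc_words:.2f}, words + emoji: {acc_both:.2f}")
```

With our real labeled tweets, the two accuracies came out roughly equal, which is the "no clear sign of improvement" noted above.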

Conclusion

So, what have we learned from these results? Broadly speaking, they suggest that the bag-of-words representation is ineffective here, for a couple of reasons. First, tweets are short, at most 140 characters, while our dictionary, used as the feature space, is much larger. The testing accuracies also appear to be invariant to our dictionary selection, another important sign that the bag-of-words model did not work on our problem, or at least not on our dataset. Last but not least, the multi-layer perceptron and SVM models we used are very computationally inefficient. Unfortunately, our study therefore did not provide enough evidence of a significant impact of emojis as input features.

Although the test results are somewhat disappointing, we still believe there are strong incentives to use emojis in sentiment analysis: classifying thousands of tweets with human labor is very expensive. As possible future improvements, we suggest using a larger dataset, meaning more labeled tweets and tweets with higher word counts. We also recommend experimenting with different topics, or with different categorization criteria when hand-labeling tweets.

Thanks for watching our video! We hope this provides some thoughts on utilizing emojis for Twitter sentiment analysis. Feel free to visit our GitHub page to experiment with our code, and stay tuned for future updates.


We used some open-source code that benefited our project. More details can be found at the link below:

https://github.com/fnielsen/afinn

Part of the code we used in this project was originally implemented for an ECE x554 Machine Learning homework under the instruction of Prof. Bert Huang.