Task A: "Multi-label emotion classification in Urdu"

EmoThreat: Emotions & Threat Detection in Urdu @FIRE 2022

Task Description

Multi-label emotion detection in text is significant in both research and industry, with many applications in artificial intelligence. A short piece of social media text can evoke several emotions at once, or it can be emotionless (neutral), which makes the problem challenging to tackle.

Urdu is spoken by more than 230 million people worldwide as a first or second language, including in India, Pakistan and Nepal. It is also widely used on social media in the right-to-left Nastalīq script. A multi-label emotion dataset for Urdu was therefore long overdue. It is needed for understanding public emotions, with natural language applications in disaster management, public policy, commerce, and public health.

We created a Nastalīq-script Urdu dataset for multi-label emotion classification, consisting of tweets labeled with Ekman's six basic emotions and neutrality.

The task requires you to classify each tweet as one or more of the six basic emotions (plus neutral), choosing the labels that best represent the emotion of the person tweeting.

  • Anger: also including annoyance and rage, can be categorized as a response to a deliberate attempt at anticipated danger, hurt or incitement.

  • Disgust: is an inherent response of dislike, loathing or rejection toward contagion.

  • Fear: also including anxiety, panic and horror, is an emotion triggered by a potentially cumbersome situation or danger.

  • Sadness: also including pensiveness and grief, is triggered by hardship, anguish, a feeling of loss, and helplessness.

  • Surprise: also including distraction and amazement, is an emotion prompted by an unexpected occurrence.

  • Happiness: also including contentment, pride, gratitude and joy, is an emotion seen as a response to well-being, a sense of achievement, satisfaction, and pleasure.

  • Neutral: a tweet that does not evoke any emotion.
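
Because a tweet may carry more than one of the labels above, a common way to represent its annotation is a multi-hot vector with one position per label. The following is a minimal sketch of that idea, not the official data format: the label order in LABELS and the helper names encode_labels and decode_labels are assumptions made for illustration.

```python
# Assumed label order, for illustration only; the released data may
# use different label names or a different column order.
LABELS = ["anger", "disgust", "fear", "sadness",
          "surprise", "happiness", "neutral"]


def encode_labels(emotions):
    """Turn a collection of label names into a 7-element 0/1 vector."""
    emotions = set(emotions)
    unknown = emotions - set(LABELS)
    if unknown:
        raise ValueError(f"unknown labels: {sorted(unknown)}")
    return [1 if label in emotions else 0 for label in LABELS]


def decode_labels(vector):
    """Turn a 0/1 vector back into label names, in LABELS order."""
    if len(vector) != len(LABELS):
        raise ValueError("vector length must match the number of labels")
    return [label for label, flag in zip(LABELS, vector) if flag]


if __name__ == "__main__":
    # A tweet annotated with both anger and sadness.
    vec = encode_labels({"anger", "sadness"})
    print(vec)
    print(decode_labels(vec))
```

With this representation, a classifier produces seven independent yes/no decisions per tweet rather than a single class.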

References

  1. Ashraf, N., Khan, L., Butt, S., Chang, H. T., Sidorov, G., & Gelbukh, A. (2022). Multi-label emotion classification of Urdu tweets. PeerJ Computer Science, 8, e896.


Related work

  1. Amjad, Maaz, et al. "Overview of Abusive and Threatening Language Detection in Urdu at FIRE 2021." CEUR Workshop Proceedings. 2021.

  2. Amjad, Maaz, et al. "UrduThreat@ FIRE2021: Shared Track on Abusive Threat Identification in Urdu." Forum for Information Retrieval Evaluation. 2021.

  3. Khan, Lal, et al. "Urdu sentiment analysis with deep learning methods." IEEE Access 9 (2021): 97803-97812.

  4. Amjad, Maaz, et al. "Threatening Language Detection and Target Identification in Urdu Tweets." IEEE Access 9 (2021): 128302-128313.

  5. Ashraf, Noman, et al. "YouTube based religious hate speech and extremism detection dataset with machine learning baselines." Journal of Intelligent & Fuzzy Systems Preprint: 1-9.

  6. Ameer, Iqra, et al. "Multi-label emotion classification using content-based features in Twitter." Computación y Sistemas 24.3 (2020): 1159-1164.

  7. Ameer, Iqra, et al. "Multi-Label Emotion Classification on Code-Mixed Text: Data and Methods." IEEE Access 10 (2022): 8779-8789.

  8. Khan, Lal, et al. "Multi-class sentiment analysis of urdu text using multilingual BERT." Scientific Reports 12.1 (2022): 1-17.

  9. Ameer, Iqra, et al. "Multi-label emotion classification in texts using transfer learning." Expert Systems with Applications (2022): 118534.