About Natural Language Processing for Social Media (SocialNLP)

With the rising popularity of social media services, including general-purpose social networking and microblogging services such as Facebook, Plurk and Twitter, goal-oriented services such as LinkedIn (for professional networking), Del.icio.us (a social bookmarking service) and Foursquare (a check-in service for mobile devices), and Web 2.0-based large-scale knowledge bases such as Wikipedia and common-sense corpora, researchers can now access heterogeneous information about a target person or object that includes not only text content but also metadata and even the social relationships among people.

Furthermore, content on social media and Web 2.0 platforms differs from content elsewhere in style, tone, purpose, and so on. For instance, posts on Twitter are limited in length and therefore often contain jargon, emoticons, or abbreviations that do not follow formal grammar. Existing natural language processing techniques are not tailored to such content, so applying them directly is often unsuitable. For example, standard summarization techniques might not work well for Plurk posts, which are relatively short and contain responses from multiple friends, and sentiment dictionaries learned from news corpora might not be suitable for sentiment detection on microblogs.

Social media is generally believed to have become one of the major means of communication and content production, and this trend is unlikely to fade. Being able to process content from social media platforms therefore brings considerable value to real-world applications. Moreover, because the style of the content has changed and heterogeneous resources (e.g., social relationships among people) are now available, there is high demand for novel NLP techniques that are designed specifically for such platforms and that can integrate or learn from information from different sources. Below we highlight some important (non-exclusive) themes in this direction.


Topics of Interest


Content analysis on Social Media

  • Summarization of posts/replies on social media

  • Named entity recognition on social media

  • Relationship extraction on social media

  • Entity resolution for social media

  • Search, indexing, and evaluation on the Social Web

  • Improving speech recognition using social media content

  • Multilingual and language-specific information retrieval on the Social Web


Natural language processing on Web 2.0

  • Folksonomy and Social Tagging

  • Trend analysis on Wikipedia

  • Trustworthiness analysis on Wikipedia

  • Human computation for social media corpus generation

  • Social structure and position analysis using Microblog content

  • Trust and Privacy analysis in social contexts

  • Community detection using blog or Microblog content


Sentiment and Opinion Analysis on Social Media

  • Lexical semantic resources, corpora and annotations of social media for sentiment analysis

  • Opinion retrieval, extraction, classification, tracking and summarization

  • Domain specific sentiment analysis and model adaptation

  • Emotion detection

  • Sentiment analysis for automatic public opinion polls and surveys of user satisfaction

  • Improvement of NLP tasks using subjectivity and/or sentiment analysis on social platforms

  • Sentiment analysis and human-computer interfaces on social platforms

  • Real-world sentiment applications and systems on social platforms


Models and Tools Development for SocialNLP

  • Social-network-motivated methods or tools for natural language processing

  • Advanced topic models for social media

  • Learning to rank for social media

  • Clustering and Classification tools for Social Media

  • Content-based and social-based Recommendation

  • Multilingual machine translation for microblogs