With the widespread adoption of social media and online forums, individual users can actively participate in generating online content in different languages and dialects. As a result, user-generated content (UGC) has grown enormously in recent years. By its nature, UGC can be generated at any time and in non-standard language or formats. Compared to professionally edited text, it is often noisier and more likely to take liberties with established grammar, punctuation, and spelling norms. All this can make UGC difficult to translate, yet it can also be incredibly valuable. This workshop will explore the many aspects of effective machine translation (MT) of data extracted from social media.
The workshop aims to provide a research platform dedicated to new methods and techniques for translating user-generated content and to exploring the use of such translation in social media analytics. The workshop solicits original research contributions related to this theme, including (but not limited to):
- Models and Tools Development for Social MT
- Machine translation on Microblogs
- Multilingual social analytics
- Neural MT for UGC translation
- Multilingual crowdsourcing
- Building resources for UGC translation
- Sentiment translation of UGC
- Analyzing the diffusion of multilingual information
- Using MT for monitoring emergency responses among social crowds
- Multilingual social-based web platforms for disaster management
- Multilingual and language-specific Information Retrieval on Social Web
- Crosslingual document alignment using UGC data
- Named entity transliteration on social media content
- Code-mixed UGC translation
- MT for big social data analysis
Submissions may include work in progress as well as finished work. Submissions must have a clear focus on specific issues pertaining to UGC and its translation. Descriptions of commercial systems are welcome, but authors should be willing to discuss the details of their work.
Additionally, the workshop will aim to bring together researchers from diverse fields, such as Machine Translation, Big Data and Machine Learning, Natural Language Processing, and Computational Social Sciences, who can potentially contribute to improving the quality of UGC translation and its utilisation in research and industrial data analytics tasks.