In recent years, exploiting the potential of big data has led to significant advances in a wide range of Computer Vision and Natural Language Processing applications. However, most of the tasks addressed so far have been primarily visual in nature, owing to the unbalanced availability of labelled samples across modalities (e.g., there are numerous large labelled datasets for images but few for audio- or IMU-based classification), which leaves a large performance gap when algorithms are trained separately for each modality. With its origins in audio-visual speech recognition and, more recently, in language-and-vision projects such as image and video captioning, multimodal machine learning is a thriving multidisciplinary research field that addresses several of artificial intelligence's (AI) original goals by integrating and modelling multiple communicative modalities, including linguistic, acoustic, and visual messages. Owing to the heterogeneity of the data and the dependencies frequently observed between modalities, this research area poses particular challenges for machine learning researchers.

At the same time, hateful content is proliferating on social media. Because the majority of this content is in regional languages, it easily slips past online surveillance algorithms designed to target posts written in resource-rich languages such as English. As a result, low-resource regional languages in Asia, Africa, Europe, and South America face a shortage of tools, benchmark datasets, and machine learning approaches.
This workshop aims to bring together members of the machine learning and multimodal data fusion communities working on regional languages. We anticipate contributions on hate speech detection and emotion analysis in multimodal data, including video, audio, text, drawings, and synthetic material in regional languages. The workshop's objective is to advance scientific research in the broad field of multimodal interaction, techniques, and systems, emphasising important trends and challenges in regional languages, with the goal of developing a roadmap for future research and commercial success.
We invite submissions on topics that include, but are not limited to, the following:
Multimodal sentiment analysis in regional languages
Hateful video content detection in regional languages
Trolling and offensive post detection in memes
Multimodal data fusion and data representation for hate speech detection in regional languages
Multimodal hate speech benchmark datasets and evaluations in regional languages
Multimodal fake news detection in regional languages
Data collection and annotation methodologies for safer social media in low-resourced languages
Content moderation strategies in regional languages
Cybersecurity and social media in regional languages
Keynote Title: Ideals and Compromises in Multilingual Multimodal Learning
Abstract: I will discuss the importance of working with high-quality datasets in multilingual multimodal learning. I will start by describing MaRVL, a dataset that requires models to understand diverse visually grounded concepts that reflect the cultural backgrounds of five diverse languages. Then, I will show how we repurposed existing multimodal datasets to create the IGLUE benchmark, which places less emphasis on diverse concepts but allows researchers to evaluate their systems on four multimodal tasks across twenty languages. Finally, I will show that we can substantially improve zero-shot cross-lingual transfer by pretraining with machine-translated text, which raises the question: how will we know if we are making meaningful progress?
Bio: Desmond is an Assistant Professor at the University of Copenhagen, where he builds and attempts to understand multimodal and multilingual models. His work received the Best Long Paper Award at EMNLP 2021 and an Area Chair Favourite paper award at COLING 2018. He co-organised the Multimodal Machine Translation Shared Task from 2016 to 2018, the 2018 Frederick Jelinek Memorial Workshop on Grounded Sequence-to-Sequence Learning, the How2 Challenge Workshop at ICML 2019, and the Workshop on Multilingual Multimodal Learning at ACL 2022.
Shyamala Doraisamy is an Associate Professor at the Faculty of Computer Science and Information Technology, Universiti Putra Malaysia (UPM). She received her PhD from Imperial College London in 2004, specializing in Music Information Retrieval. In 2007, she won an award for her work in the interdisciplinary fields of computing and music at the Invention and New Product Exposition (INPEX), Pittsburgh, USA. She currently leads the Digital Information Computation and Retrieval (DICR) research group at UPM and has completed several projects on music and health applications. Her research interests include Multimedia Information Processing, Multimodal Machine Learning, and Machine Listening. She is a member of the Malaysian Society of Information Retrieval and Knowledge Management (PECAMP) and was the General Chair of the IEEE 2018 Fourth International Conference on Information Retrieval and Knowledge Management (CAMP’18). She is currently an Honorary Research Fellow at the School of Computer Science, University of Lincoln (UoL), United Kingdom, serving as UPM’s partner lead for large-scale research collaborations under the European Union’s Horizon 2020 Marie Skłodowska-Curie Actions (MSCA) RISE (Research and Innovation Staff Exchange) programme, with UoL as the consortium coordinator.
Keynote Title: Multimodal Interactions for Speech and Language Applications
Abstract:
I will discuss multimodal interactions in speech processing applications, from linking tunes for prosody to viseme charts for lip-sync. I will show some of my early projects on developing voice-activated mobile applications and animated newsreaders with speech-to-text synchronization and lip-sync. I will then discuss related technologies for developing these multimodal interactions. Finally, I will provide a brief overview of several speech and language project collaborations I have been involved in for the Malay language.
Organizers' Profiles
Dr. Bharathi Raja Chakravarthi (Lead workshop organiser), University of Galway, Ireland
Assistant Professor at the School of Computer Science, University of Galway, Ireland. Dr. Chakravarthi has created resources for under-resourced languages, which have been published at events associated with the Machine Translation Summit and LREC 2020. He organized multiple workshops at the European Chapter of the Association for Computational Linguistics (EACL 2021) and Association for Computational Linguistics (ACL 2022) conferences, as well as many shared tasks for regional languages to create more multimodal resources and language technologies. He has published papers on multimodal machine learning in international journals and has edited special issues of Springer and Elsevier journals. He serves as a reviewer for various SCI journals and is a member of IEEE, ACM, and the Association for Computational Linguistics (ACL).
Email: bharathiraja.asokachakravarthi@universityofgalway.ie and bharathi.raja@insight-centre.org
Google Scholar: https://scholar.google.com/citations?user=irCl028AAAAJ&hl=en
Dr. Abirami Murugappan, Anna University, Chennai
Assistant Professor at the Department of Information Science and Technology, Anna University, Chennai. Her areas of expertise include Image Processing, Natural Language Processing, Artificial Intelligence, Video Analytics, and Data Analytics.
Email: abirami@auist.net
Google Scholar: https://scholar.google.co.in/citations?user=ojFgyLAAAAAJ
Dr. Dhivya Chinnappa, Thomson Reuters, USA
Dhivya Chinnappa is a Research Scientist at Thomson Reuters. Her research focuses on Natural Language Processing and its applications. Prior to joining Thomson Reuters, she was a PhD candidate at the University of North Texas, supervised by Dr. Eduardo Blanco, and a member of the Human Intelligence and Language Technologies laboratory at the Department of Computer Science and Computer Engineering. She has published papers in multiple ACL venues and has served as a guest editor for the special issue on Speech and Language Technologies for Dravidian Languages in Elsevier’s Computer Speech & Language. She was also a co-organizer of one of the FIRE 2021 shared tasks.
Email: dhivya.infant@gmail.com
Google Scholar: https://scholar.google.com/citations?user=rTO6XDkAAAAJ&hl=en
Adeep Hande, Indiana University Bloomington, IN, USA
Adeep is a Master’s student in Data Science at Indiana University Bloomington. He has been working on low-resourced NLP for over two years, mainly on overcoming the lack of annotated data in these languages. He has assisted Dr. Chakravarthi in creating resources for low-resourced languages of the Dravidian language family, which have been published at events associated with COLING 2020 and ACL 2022. He was also part of the program committee for workshops organized at EACL 2021 and ACL 2022. He has published several papers on multimodal learning at ACL-affiliated workshops and in international journals, and has served as an academic reviewer for Springer SN Computer Science and Elsevier’s Computer Speech & Language. He is a member of the Association for Computational Linguistics.
Email: ahande@iu.edu
Google Scholar: https://scholar.google.com/citations?user=XvfdrGsAAAAJ&hl=en
Prasanna Kumar Kumaresan, Indian Institute of Information Technology and Management-Kerala
Google Scholar: https://scholar.google.com/citations?user=6ZlifigAAAAJ&hl=en
Rahul Ponnusamy, Data Scientist at Techvantage Analytics
Google Scholar: https://scholar.google.com/citations?user=m6bYtawAAAAJ&hl=en