The PIs are John Pavlopoulos, Ion Androutsopoulos, Theos Evgeniou and Xenia Miscouridou.
John and Ion are co-directors of AUEB’s NLP Group, the leading academic NLP research group in Greece. Since 2017, their research has included work on improving online discussions, initially by developing deep learning-based toxicity classifiers and user embeddings in the context of a Google DNI-funded project [8,12]. Their more recent work has shown that highlighting toxic spans can be important for moderation [8]; it led to the development of an important dataset for the field and to an international academic challenge (Toxic Spans Detection at SemEval 2021) [6]. In other recent work, they studied the importance of the conversational context of posts when moderating for toxicity, an issue that had been neglected in previous research, and they developed context-aware toxicity datasets, benchmarks, and models [3,5,7]. Their commitment to improving online discussions extends beyond toxicity detection to toxicity mitigation (e.g., by suggesting non-toxic paraphrases of toxic posts) [4,9]. Both PIs and their group have substantial experience with LLMs, having released, e.g., GreekBERT [10], LegalBERT [11], and more recently a parameter-efficiently tuned PaLM, which ranked first in the SemEval 2023 misogyny detection challenge [2]. The proposed work towards moderation LLMs for online discussions is, therefore, a natural extension of their previous work, which has already considered LLMs for context-aware moderation and toxicity mitigation.
Theos has been working on Machine Learning and AI for more than 25 years, in areas ranging from machine learning theory and new machine learning methods to recommender systems and marketing, and more recently on AI risks and regulations. His work has appeared in journals such as Science Magazine, Machine Learning, Nature Machine Intelligence, Journal of Machine Learning Research, Nature Digital Medicine, Lancet Digital Health, Management Science, Marketing Science, and others. He holds four degrees from MIT: two BS degrees earned simultaneously (Computer Science and Mathematics), a Master's degree, and a PhD in Computer Science.
Xenia works on statistical machine learning, developing methodologies that combine the strengths of statistics and AI. She focuses on (graph) network modelling, stochastic processes and explainable AI. She was named to the Forbes 30 Under 30 list in Science and Healthcare for helping to characterize a new Covid variant of concern and its impact in Brazil; this work was published in Science and was used by policymakers such as the WHO. Besides serving as a Lecturer at the Department of Mathematics and Statistics at the University of Cyprus, she is an Honorary/Visiting Lecturer at Imperial College London, within the Department of Mathematics and Imperial X, a new center for driving innovation in digital technologies, machine learning and artificial intelligence. In 2022-2023 she was a Lecturer at Imperial College London. She is also part of the Machine Learning & Global Health Network, a multi-organisation network across the UK, Denmark, Singapore and Germany.
[1] Korre, K., Pavlopoulos, J., Sorensen, J., Laugier, L., Androutsopoulos, I., Dixon, L., and Barrón-Cedeño, A., "Harmful Language Datasets: An Assessment of Robustness". In Proceedings of WOAH, pp. 221-230. 2023.
[2] Sorensen, J., Korre, K., Pavlopoulos, J., Tomanek, K., Thain, N., Dixon, L. and Laugier, L., "JUAGE at SemEval-2023 Task 10: Parameter Efficient Classification". In Proceedings of SemEval, pp. 1195-1203. 2023.
[3] Xenos, A., Pavlopoulos, J., Androutsopoulos, I., Dixon, L., Sorensen, J., and Laugier, L., "Toxicity detection sensitive to conversational context". First Monday (2022).
[4] Pavlopoulos, J., Laugier, L., Xenos, A., Sorensen, J., and Androutsopoulos, I., "From the detection of toxic spans in online discussions to the analysis of toxic-to-civil transfer". In Proceedings of ACL, pp. 3721-3734. 2022.
[5] Xenos, A., Pavlopoulos, J., and Androutsopoulos, I., "Context sensitivity estimation in toxicity detection". In Proceedings of WOAH, pp. 140-145. 2021.
[6] Pavlopoulos, J., Sorensen, J., Laugier, L., and Androutsopoulos, I., "SemEval-2021 task 5: Toxic spans detection". In Proceedings of SemEval, pp. 59-69. 2021.
[7] Pavlopoulos, J., Sorensen, J., Dixon, L., Thain, N., and Androutsopoulos, I., "Toxicity Detection: Does Context Really Matter?". In Proceedings of ACL, pp. 4296-4305. 2020.
[8] Pavlopoulos, J., Malakasiotis, P., and Androutsopoulos, I., "Deeper attention to abusive user content moderation". In Proceedings of EMNLP, pp. 1125-1135. 2017.
[9] Laugier, L., Pavlopoulos, J., Sorensen, J., and Dixon, L., "Civil Rephrases Of Toxic Texts With Self-Supervised Transformers". In Proceedings of EACL, pp. 1442-1461. 2021.
[10] Koutsikakis, J., Chalkidis, I., Malakasiotis, P., and Androutsopoulos, I., "GREEK-BERT: The Greeks visiting Sesame Street". In the 11th Hellenic Conference on Artificial Intelligence, pp. 110-117. 2020.
[11] Chalkidis, I., Fergadiotis, M., Malakasiotis, P., Aletras, N., and Androutsopoulos, I., "LEGAL-BERT: The muppets straight out of law school". arXiv preprint arXiv:2010.02559. 2020.
[12] Pavlopoulos, J., Malakasiotis, P., Bakagianni, J., and Androutsopoulos, I., "Improved Abusive Comment Moderation with User Embeddings". In Proceedings of the EMNLP Workshop: Natural Language Processing meets Journalism, pp. 51-55, Copenhagen, Denmark. ACL. 2017.