TRAC - 2022
Third (Virtual) Workshop on
Threat, Aggression and Cyberbullying
October 17, 2022
@ the 29th International Conference on Computational Linguistics (COLING 2022)
Gyeongju, the Republic of Korea
Deadline for regular submissions extended until July 31, 2022!
The Workshop will be fully virtual this year!
Call for Papers
As the number of users and the volume of their web-based interactions have increased, incidents of verbal threat, aggression and related behavior such as trolling, cyberbullying and hate speech have also increased manifold across the globe. The reach and extent of the Internet have given such incidents unprecedented power and influence over the lives of billions of people. Such incidents of online abuse have not only resulted in mental health and psychological issues for users, but have also manifested in other ways, ranging from the deactivation of social media accounts to instances of self-harm and suicide.
To mitigate these issues, researchers have begun to explore the use of computational methods for identifying such toxic interactions online. In particular, Natural Language Processing (NLP) and ML-based methods have shown great promise in dealing with such abusive behavior through early detection of inflammatory content.
In fact, we have observed an explosion of NLP-based research on offensive content in the last few years. This growth has been accompanied by the creation of new venues such as the WOAH and TRAC workshop series. Community-based competitions, such as Tasks 5 and 6 at SemEval-2019, Task 12 at SemEval-2020, and Tasks 5 and 7 at SemEval-2021, have also proven to be extremely popular. Indeed, because of the huge community interest, multiple workshops have been held on the topic in a single year: in 2018, both the Abusive Language Online workshop (at EMNLP) and TRAC-1 (at COLING) took place, and both venues achieved healthy participation, with 21 and 24 papers respectively. Interest in the topic has continued to grow since then, and given its immense popularity, we are organising a new edition of the workshop to support the community and further research in this area.
As in the earlier editions, TRAC will focus on applications of NLP, ML and pragmatic studies of aggression and impoliteness to tackle these issues. We invite long papers (8 pages) and short papers (4 pages), as well as position papers and opinion pieces (5-20 pages), from academic researchers, industry practitioners and any other group or team working in the area, based on, but not limited to, any of the following themes:
Theories and models of aggression and conflict in language.
Cyberbullying, threatening, hateful, aggressive and abusive language on the web.
Multilingualism and aggression.
Resource Development - Corpora, Annotation Guidelines and Best Practices for threat and aggression detection.
Computational Models and Methods for aggression, hate speech and offensive language detection in text and speech.
Detection of threats and bullying on the web.
Automatic censorship and moderation: ethical, legal and technological issues and challenges.
Shared Tasks
TRAC-2022 will include two novel shared tasks:
Task 1: Bias, Threat and Aggression Identification in Context
The first shared task will be a structured prediction task for recognising (a) Aggression, Gender Bias, Racial Bias, Religious Intolerance and Bias, and Casteist Bias on social media, and (b) the "discursive role" of a given comment in the context of the previous comment(s). The participants will be given a "thread" of comments with information about the presence of different kinds of biases and threats (e.g. gender bias, gendered threat, or none) and about each comment's discursive relationship to the previous comment as well as the original post (viz. attack, abet, defend, counter-speech and gaslighting). For each comment in a thread, participants will be required to predict the presence of aggression and bias, possibly making use of the context. For this task, a dataset of approximately 60k comments (approximately 180k annotation samples) in Meitei, Bangla and Hindi, compiled in the ComMA Project, will be made available for training and testing. We are making available a disaggregated dataset for all the languages, with each unit annotated by at least three annotators.
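For illustration only, a thread unit as described above might be represented along the following lines in Python. This is a minimal sketch: every field name and label value here is our assumption based on the task description, not the official data format, which is defined by the released dataset.

from dataclasses import dataclass, field
from typing import Dict, List, Optional

@dataclass
class Comment:
    comment_id: str
    parent_id: Optional[str]  # None for the original post
    text: str
    # Disaggregated labels, one mapping per annotator, e.g.
    # {"annotator_1": {"aggression": "...", "gender_bias": "..."}}
    annotations: Dict[str, Dict[str, str]] = field(default_factory=dict)
    # Discursive role relative to the previous comment / original post,
    # e.g. "attack", "abet", "defend", "counter-speech" or "gaslighting"
    discursive_role: Optional[str] = None

@dataclass
class Thread:
    post_id: str
    comments: List[Comment] = field(default_factory=list)  # thread order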
Task 2: Generalising across domains - COVID-19
For this sub-task, the test set will be sampled from COVID-19-related conversations, annotated with levels of aggression, offensiveness and hate speech. Across the globe, during the pandemic, we have seen various kinds of novel aggressive and biased conversations on social media - in fact, in some cases there was a massive escalation of religious and other kinds of intolerance and polarisation. Participants in the TRAC-1 and TRAC-2 shared tasks are especially encouraged to submit the predictions of their earlier models on this test set. They may also train new models jointly on both datasets. Those who did not participate in the earlier tasks are also invited to submit predictions for this task by training models on the two datasets, and are encouraged to submit predictions on the respective test sets of the earlier tasks along with predictions on the current dataset (to enable comparison). New participants may also use the TRAC-1 or TRAC-2 dataset, or a combination of the two, for building their models. The aim of the task is to evaluate the generalisability of our systems in unexpected and novel situations, along with their robustness in the earlier situations.
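The following is a minimal, unofficial sketch of the cross-domain workflow described above: train on a combination of the earlier TRAC datasets and predict on the new COVID-19 test set. The file names, column names and the TF-IDF + logistic regression baseline are illustrative assumptions, not the official data format or a recommended model.

import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical file and column names for the earlier TRAC training data
trac1 = pd.read_csv("trac1_train.csv")
trac2 = pd.read_csv("trac2_train.csv")
train = pd.concat([trac1, trac2], ignore_index=True)

# A simple baseline trained jointly on both datasets
model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
model.fit(train["text"], train["aggression_label"])

# Predict on the COVID-19 test set and write a submission file
covid_test = pd.read_csv("covid19_test.csv")
covid_test["prediction"] = model.predict(covid_test["text"])
covid_test[["id", "prediction"]].to_csv("submission.csv", index=False)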
How to participate
Please use the CodaLab website to get the data and upload your submissions.
Rules of participation
Each team is allowed to submit up to three systems per task for evaluation.
We expect each team to submit a system description paper after the evaluation. The deadline, length and other instructions for the system description papers will be the same as those for the workshop papers. All system papers will be published in the proceedings, and the best systems will be given slots for demos and presentations at the workshop.
Participants may use additional data for training their systems, provided that the dataset is either already publicly available or is made available immediately after submission (and well before the submission of the system paper), and that its use is mentioned in the submission. Use of non-public additional data for training will disqualify the system.
Evaluation
For both Tasks 1 and 2, we will use the F1-score for evaluation. In sub-task 1, the participants' predictions will be evaluated against the annotation of each individual annotator, and a weighted average micro F1-score will be used for reporting and ranking. This weighted average micro F1-score will be reported for each of the seven levels, and the mean across the seven levels will be used for the final ranking in Task 1 (a sketch of the scoring procedure is given after this section). Task 1 features evaluation on two test sets -
One test set will contain data from the same language as the train set.
The other test set will include some data from a surprise language in addition to those from the first test set. This is to test the generalisability of the multilingual models in zero-shot situations.
In sub-task 2, the mean micro F1-score across the four test sets (TRAC-2018, TRAC-2020, TRAC-2022 and COVID-19) will be used for ranking the systems. The individual scores on each of the four test sets will also be reported. This tests which models work best across different domains.
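To make the ranking procedure concrete, here is an unofficial sketch of how the scores described above could be computed. The disaggregated data layout and the annotator weighting scheme (proportional to the number of items each annotator labelled) are our assumptions; the official CodaLab scorer is authoritative.

import numpy as np
from sklearn.metrics import f1_score

def level_score(gold_by_annotator, predictions):
    """Weighted average of per-annotator micro-F1 for one level.

    gold_by_annotator: dict mapping annotator id -> list of gold labels,
    each list aligned with the list of predicted labels.
    """
    scores, weights = [], []
    for gold in gold_by_annotator.values():
        scores.append(f1_score(gold, predictions, average="micro"))
        weights.append(len(gold))  # assumed weighting, may differ officially
    return np.average(scores, weights=weights)

def task1_final_score(per_level_scores):
    """Task 1 ranking score: mean of the seven per-level scores."""
    return float(np.mean(per_level_scores))

def task2_final_score(per_testset_scores):
    """Task 2 ranking score: mean micro-F1 across the four test sets."""
    return float(np.mean(per_testset_scores))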
Submission
We invite original and unpublished archival papers relevant to the theme of the workshop in one of the following categories -
Position Papers and Opinion Pieces (5-20 pages, excluding references)
Long Paper (completed work - 4-8 pages, excluding references)
Short Paper (work in progress - up to 4 pages, excluding references)
Demo of a working system / library / API (described in a maximum of 2 pages, with a link to the actual system, if available) - we strongly prefer open-source and free systems for demos at the workshop
Non-archival Extended Abstracts (previously published work, up to 2 pages). These will not be included in the archival proceedings but will be invited for presentation at the workshop.
Both long and short papers may optionally include a demo, and may be presented either as a talk (with slides) or as a poster, depending on which medium the Program Committee considers most appropriate for the paper. Please note that the workshop does not make any hierarchical distinction between talks and posters, and the decision does NOT reflect the quality of the paper - the choice between the two presentation formats will be made solely on the basis of their suitability for presenting the content of the paper. More details about the presentation format will be given in due course.
Submission Website
Papers are to be submitted by the end of the deadline day via one of the following modes -
Directly submit your paper to the TRAC START submission website (to be reviewed by the TRAC Committee)
Commit the paper submitted to ACL-ARR to TRAC 2022 (already reviewed by the ACL-ARR Committee) and then upload the following on the TRAC START submission website -
Unique identifier of the ARR submission: the URL of the OpenReview forum of the ARR submission (https://openreview.net/forum?id=XXXXXXXXXXX) in the Summary field.
Authors must also upload the PDF of their paper in this track.
NOTE 1: In order to be considered for the workshop, authors must make a submission via the START submission website and indicate the appropriate category of submission (regular TRAC submission or ACL-ARR submission). ONLY papers submitted on the START website (including ARR papers) will be considered for inclusion in the workshop.
NOTE 2: A paper may not be simultaneously under review through ARR and TRAC 2022. A paper that has or will receive reviews through ARR may not be submitted for review to TRAC 2022.
Style Files and Formatting
Submissions should be formatted according to the COLING-2022 template. As with the main conference, submissions will only be accepted in PDF format and deviations from the provided templates will result in rejections without review.
Each TRAC 2022 submission (following the COLING 2022 policy) can be accompanied by a single PDF appendix, one .tgz or .zip archive containing software, and one .tgz or .zip archive containing data. COLING 2022 encourages the submission of these supplementary materials to improve the reproducibility of results, and to enable authors to provide additional information that does not fit in the paper. For example, preprocessing decisions, model parameters, feature templates, lengthy proofs or derivations, pseudocode, sample system inputs/outputs, and other details that are necessary for the exact replication of the work described in the paper can be put into the appendix. However, the paper submissions need to remain fully self-contained, as these supplementary materials are completely optional, and reviewers are not even asked to review or download them. If the pseudo-code or derivations or model specifications are an important part of the contribution, or if they are important for the reviewers to assess the technical correctness of the work, they should be a part of the main paper, and not appear in the appendix. Supplementary materials need to be fully anonymized to preserve the double-blind reviewing policy.
Extra space for ethical considerations
Please note that extra space is allowed after the 8th page (4th page for short papers) for an ethics/broader impact statement. At submission time, this means that if you need extra space for the ethical considerations section, it should be placed after the conclusion so that it is possible to quickly check that the rest of the paper still fits within 8 pages (4 pages for short papers). For camera-ready versions, 9 pages of content will be allowed for long papers (5 for short papers). In both cases, ethical considerations sections, acknowledgements and references do not count against these limits.
Important Dates (Workshop Papers)
Papers Due (directly submitted to TRAC): July 31, 2022 (Sunday) (extended from July 11, 2022)
Papers Due (committed via ACL-ARR): July 31, 2022 (Sunday)
Notification of Acceptance: August 22, 2022 (Monday)
Camera-ready papers due: September 5, 2022 (Monday)
Conference date: October 12-17, 2022
Timezone: Anywhere on Earth (UTC - 12)
Important Dates (Shared Task)
Training set release (Task 1): May 15, 2022
Test set release (for all Tasks): July 2, 2022 (extended from June 25, 2022)
Submissions due: July 7, 2022 (extended from June 30, 2022)
Results announcement: July 12, 2022
System description papers due: July 31, 2022
Reviews for papers: August 22, 2022
Camera-ready versions due: September 12, 2022 (extended from September 5, 2022)
Conference date: October 12-17, 2022
Timezone: Anywhere on Earth (UTC - 12)
Contact
For any queries, send an email to coling.aggression[at]gmail[dot]com
Organising Chairs
Ritesh Kumar, Dr. Bhimrao Ambedkar University, India
Atul Kr. Ojha, University of Galway & Panlingua Language Processing LLP, India
Marcos Zampieri, George Mason University, USA
Shervin Malmasi, Amazon Inc., USA
Daniel Kadar, Research Institute for Linguistics, Hungarian Academy of Sciences, Hungary
Assistant Organisers
Siddharth Singh, Dr. Bhimrao Ambedkar University, India
Shyam Ratan, Dr. Bhimrao Ambedkar University, India
Program Committee
Atul Kr. Ojha, University of Galway & Panlingua Language Processing LLP, India
Bharathi Raja Chakravarthi, University of Galway
Bornini Lahiri, Indian Institute of Technology-Kharagpur, India
Bruno Emanuel Martins, IST and INESC-ID
Cheng-Te Li, National Cheng Kung University, Taiwan
Chuan-Jie Lin, National Taiwan Ocean University, Taiwan
David Jurgens, University of Michigan
Denis Gordeev, The Russian Presidential Academy of National Economy and Public Administration under the President of the Russian Federation
Dennis Tenen, Columbia University, USA
Dhairya Dalal, University of Galway
Els Lefever, LT3, Ghent University, Belgium
Faneva Ramiandrisoa, IRIT
Han Liu, Cardiff University
Hugo Jair Escalante, INAOE, Mexico
Koustava Goswami, University of Galway
Liang-Chih Yu, Yuan Ze University, Taiwan
Lun-Wei Ku, Academia Sinica, Taiwan
Lütfiye Seda Mut Altın, Pompeu Fabra University
Mainack Mondal, University of Chicago, USA
Manuel Montes-y-Gómez, INAOE, Mexico
Marco Guerini, Fondazione Bruno Kessler, Trento
Ming-Feng Tsai, National Chengchi University, Taiwan
Monojit Choudhury, Microsoft Turing
Nemanja Djuric, Aurora Innovation
Parth Patwa, Indian Institute of Information Technology, Sri City
Preslav Nakov, Qatar Computing Research Institute, Qatar
Priya Rani, University of Galway
Ritesh Kumar, Dr. Bhimrao Ambedkar University, India
Roman Klinger, University of Stuttgart, Germany
Ruifeng Xu, Harbin Institute of Technology, China
Saja Tawalbeh, University of Antwerp
Sara E. Garza, Universidad Autónoma de Nuevo León (UANL), Mexico
Shardul Suryawanshi, University of Galway
Shubhanshu Mishra, Twitter Inc.
Valerio Basile, University of Turin
Veronique Hoste, LT3, Ghent University, Belgium
Xavier Tannier, Université Paris-Sud, LIMSI, CNRS, France
Zeerak Waseem, University of Sheffield, UK
Shared Task Organisers
Shervin Malmasi, Amazon Inc., USA
Siddharth Singh, Dr. Bhimrao Ambedkar University, India
Shyam Ratan, Dr. Bhimrao Ambedkar University, India
Ritesh Kumar, Dr. Bhimrao Ambedkar University, India
Atul Kr. Ojha, University of Galway & Panlingua Language Processing LLP, India
Bharathi Raja Chakravarthi, University of Galway