TRAC - 2022

Third (Virtual) Workshop on

Threat, Aggression and Cyberbullying

October 17, 2022

Deadline for regular submissions extended to July 31, 2022!

The Workshop will be fully virtual this year!

Call for Papers

As the number of web users and their online interactions has grown, incidents of verbal threat, aggression and related behaviour such as trolling, cyberbullying and hate speech have increased manifold across the globe. The reach and extent of the Internet has given such incidents unprecedented power to affect the lives of billions of people. Online abuse has not only caused mental health and psychological problems for users, but has also manifested in other ways, ranging from the deactivation of social media accounts to instances of self-harm and suicide.

To mitigate these issues, researchers have begun to explore computational methods for identifying toxic interactions online. In particular, Natural Language Processing (NLP) and machine learning (ML) based methods have shown great promise in dealing with abusive behaviour through early detection of inflammatory content.

In fact, NLP-based research on offensive content has exploded in the last few years. This growth has been accompanied by the creation of new venues such as the WOAH and TRAC workshop series. Community competitions, such as Tasks 5 and 6 at SemEval-2019, Task 12 at SemEval-2020, and Tasks 5 and 7 at SemEval-2021, have also proven extremely popular. Indeed, because of the strong community interest, multiple workshops on the topic have been held in a single year: in 2018, both the Abusive Language Online workshop (at EMNLP) and TRAC-1 (at COLING) took place, and both venues achieved healthy participation with 21 and 24 papers, respectively. Interest in the topic has continued to grow since then, and given its immense popularity, we are proposing a new edition of the workshop to support the community and further research in this area.

As in the earlier editions, TRAC will focus on applications of NLP, ML and pragmatic studies of aggression and impoliteness to tackle these issues. We invite long papers (8 pages), short papers (4 pages), and position papers and opinion pieces (5 - 20 pages) from academic researchers, industry, and any other group or team working in the area, based on, but not limited to, any of the following themes -

  • Theories and models of aggression and conflict in language.

  • Cyberbullying, threatening, hateful, aggressive and abusive language on the web.

  • Multilingualism and aggression.

  • Resource Development - Corpora, Annotation Guidelines and Best Practices for threat and aggression detection.

  • Computational Models and Methods for aggression, hate speech and offensive language detection in text and speech.

  • Detection of threats and bullying on the web.

  • Automatic censorship and moderation: ethical, legal and technological issues and challenges.

Shared Tasks

TRAC-2022 will include two novel shared tasks:

Task 1: Bias, Threat and Aggression Identification in Context

The first shared task is a structured prediction task for recognising (a) Aggression, Gender Bias, Racial Bias, Religious Intolerance and Bias, and Casteist Bias on social media and (b) the "discursive role" of a given comment in the context of the previous comment(s). Participants will be given a "thread" of comments, annotated for the presence of different kinds of bias and threat (viz. gender bias, gendered threat, none, etc.) and for each comment's discursive relationship to the previous comment as well as the original post (viz. attack, abet, defend, counter-speech and gaslighting). For each comment in a thread, participants will be required to predict the presence of aggression and bias, possibly making use of the context. For this task, a dataset of approximately 60k comments (approximately 180k annotation samples) in Meitei, Bangla and Hindi, compiled in the ComMA Project, will be made available for training and testing. We are making available a disaggregated dataset for all the languages, with each unit annotated by at least three annotators.
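To make the data layout concrete, here is a minimal, hypothetical sketch in Python of how one annotated thread might be represented. The field names, label values and structure below are illustrative assumptions, not the official release format, which will be documented with the data.

    from dataclasses import dataclass, field
    from typing import Dict, List, Optional

    @dataclass
    class Comment:
        """One comment in a thread, with disaggregated annotations."""
        comment_id: str
        parent_id: Optional[str]  # None for the original post
        text: str
        # One label dict per annotator; the release is disaggregated,
        # so at least three annotators' judgements are kept per comment.
        annotator_labels: List[Dict[str, str]] = field(default_factory=list)

    # A toy two-comment thread; the label names and values below are
    # placeholders, not the official tag set.
    thread = [
        Comment("c1", None, "original post ...", annotator_labels=[
            # The original post has no parent, so no discursive role.
            {"aggression": "aggressive", "gender_bias": "biased"},
            {"aggression": "aggressive", "gender_bias": "not-biased"},
            {"aggression": "not-aggressive", "gender_bias": "biased"},
        ]),
        Comment("c2", "c1", "a reply ...", annotator_labels=[
            {"aggression": "not-aggressive", "gender_bias": "not-biased",
             "discursive_role": "counter-speech"},
            {"aggression": "not-aggressive", "gender_bias": "not-biased",
             "discursive_role": "defend"},
            {"aggression": "aggressive", "gender_bias": "biased",
             "discursive_role": "abet"},
        ]),
    ]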

Task 2: Generalising across domains - COVID-19

For this sub-task, the test set will be sampled from COVID-19-related conversations, annotated with levels of aggression, offensiveness and hate speech. Across the globe, during the pandemic, we saw various kinds of novel aggressive and biased conversation on social media; in some cases there was a massive escalation of religious and other kinds of intolerance and polarisation. Participants of the TRAC-1 and TRAC-2 shared tasks are especially encouraged to submit the predictions of their earlier models on this test set. They may also train new models jointly on both datasets. Those who did not participate in the earlier tasks are also invited to submit predictions for this task by training models on the two datasets, and are encouraged to submit predictions on the respective test sets of the earlier tasks along with predictions on the current dataset (to enable comparison). New participants may also use the TRAC-1 dataset, the TRAC-2 dataset, or a combination of the two for building their models. The aim of the task is to evaluate the generalisability of our systems in unexpected and novel situations, along with their robustness in the earlier situations.
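As an illustration of the joint-training option, here is a hedged baseline sketch in Python. The file names, column names and label scheme are assumptions made for illustration only; adapt them to the actual TRAC-1 and TRAC-2 releases.

    import pandas as pd
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    # Hypothetical paths and column names; the real releases may differ.
    trac1 = pd.read_csv("trac1_train.csv")
    trac2 = pd.read_csv("trac2_train.csv")

    # Pool the two corpora on the columns they share (here: the comment
    # text and an aggression label) to form one joint training set.
    shared = ["text", "aggression_label"]
    train = pd.concat([trac1[shared], trac2[shared]], ignore_index=True)

    # A simple TF-IDF + logistic-regression baseline; any classifier works.
    model = make_pipeline(TfidfVectorizer(min_df=2),
                          LogisticRegression(max_iter=1000))
    model.fit(train["text"], train["aggression_label"])

    # Predict on the COVID-19 test set (and, for comparison, on the
    # earlier TRAC test sets).
    covid_test = pd.read_csv("covid19_test.csv")
    predictions = model.predict(covid_test["text"])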


How to participate

Please use the CodaLab website to obtain the data and upload your submissions.


Rules of participation

  • Each team is allowed to submit up to three systems per task for evaluation.

  • We expect each team to submit a system description paper after the evaluation. The deadline, length and other instructions for the system description papers will be the same as those for the workshop papers. All system papers will be published in the proceedings, and the best systems will be given slots for demos and presentations at the workshop.

  • Participants may use additional data for training their systems, provided that the dataset is either already publicly available or is made available immediately after submission (and well before the submission of the system paper), and that its use is mentioned in the submission. Use of non-public additional data for training will disqualify the system.


Evaluation

For both Task 1 and Task 2, we will use the F1-score for evaluation. In Task 1, the participants' predictions will be evaluated against the annotations of each individual annotator, and a weighted average micro F1-score will be used for reporting and ranking. This score will be computed for each of the seven annotation levels, and the mean across the seven levels will determine the final ranking (a scoring sketch for both tasks is given after the Task 2 description below). Task 1 features evaluation on two test sets -

  • One test set will contain data from the same language as the train set.

  • The other test set will include some data from a surprise language in addition to the data from the first test set. This is to test the generalisability of multilingual models in zero-shot situations.

In Task 2, the mean micro F1-score across the four test sets - TRAC-2018, TRAC-2020, TRAC-2022 and the COVID-19 test set - will be used for ranking the systems. The individual scores on each of the four test sets will also be reported. This tests which models work best across different domains.
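To make the ranking computation concrete, here is a minimal scoring sketch in Python under stated assumptions: per-annotator micro F1-scores are combined with a plain (unweighted) mean, since the exact weighting is not specified here. This is an illustration only, not the official scorer.

    from statistics import mean
    from sklearn.metrics import f1_score

    def level_score(pred, annotator_labels):
        """Task 1: score predictions for one annotation level against
        every annotator's labels and average the per-annotator micro F1.
        (The official weighting may differ; a plain mean is assumed.)"""
        return mean(f1_score(gold, pred, average="micro")
                    for gold in annotator_labels)

    def task1_score(preds_by_level, gold_by_level):
        """Task 1 ranking score: mean of the per-level scores
        across the seven annotation levels."""
        return mean(level_score(preds_by_level[lvl], gold_by_level[lvl])
                    for lvl in preds_by_level)

    def task2_score(per_testset_f1):
        """Task 2 ranking score: mean micro F1 across the four test
        sets (TRAC-2018, TRAC-2020, TRAC-2022, COVID-19)."""
        return mean(per_testset_f1)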

Submission

We invite original and unpublished archival papers relevant to the theme of the workshop in one of the following categories -

  • Position Papers and Opinion Pieces (5 - 20 pages, excluding references)

  • Long Paper (completed work - 4-8 pages, excluding references)

  • Short Paper (work in progress - up to 4 pages, excluding references)

  • Demo of a working system / library / API (described in a maximum of 2 pages, with a link to the actual system, if available) - we strongly prefer open-source and free systems for demos at the workshop

  • Non-archival Extended Abstracts (previously published work, up to 2 pages). These will not be included in the archival proceedings but will be invited for presentation at the workshop.

Both long and short papers may optionally include a demo and may be presented either as a talk (slides) or as a poster, depending on which medium the Program Committee considers most appropriate for the paper. Please note that the workshop makes no hierarchical distinction between talks and posters, and the decision does NOT reflect the quality of the paper - the choice between the two media of presentation will be made solely on the basis of their suitability for presenting the content of the paper. More details about the presentation format will be given in due course.


Submission Website

Papers are to be submitted by the end of the deadline day via one of two modes - as a regular TRAC submission or as an ACL ARR submission.

NOTE 1: To be considered for the workshop, authors must make a submission via the START submission website and indicate the appropriate category of submission (regular TRAC submission or ACL ARR submission). ONLY papers submitted on the START website (including ARR papers) will be considered for inclusion in the workshop.

NOTE 2: A paper may not be simultaneously under review through ARR and TRAC 2022. A paper that has or will receive reviews through ARR may not be submitted for review to TRAC 2022.


Style Files and Formatting

Submissions should be formatted according to the COLING-2022 template. As with the main conference, submissions will only be accepted in PDF format, and deviations from the provided templates will result in rejection without review.

Each TRAC 2022 submission (following the COLING 2022 policy) can be accompanied by a single PDF appendix, one .tgz or .zip archive containing software, and one .tgz or .zip archive containing data. COLING 2022 encourages the submission of these supplementary materials to improve the reproducibility of results, and to enable authors to provide additional information that does not fit in the paper. For example, preprocessing decisions, model parameters, feature templates, lengthy proofs or derivations, pseudocode, sample system inputs/outputs, and other details that are necessary for the exact replication of the work described in the paper can be put into the appendix. However, the paper submissions need to remain fully self-contained, as these supplementary materials are completely optional, and reviewers are not even asked to review or download them. If the pseudo-code or derivations or model specifications are an important part of the contribution, or if they are important for the reviewers to assess the technical correctness of the work, they should be a part of the main paper, and not appear in the appendix. Supplementary materials need to be fully anonymized to preserve the double-blind reviewing policy.


Extra space for ethical considerations

Please note that extra space is allowed after the 8th page (4th page for short papers) for an ethics/broader impact statement. At submission time, this means that if you need extra space for the ethical considerations section, it should be placed after the conclusion so that it is possible to quickly check that the rest of the paper still fits in 8 pages (4 pages for short papers). For camera-ready versions, 9 pages of content will be allowed for long papers (5 for short papers). Ethical considerations sections, acknowledgements and references do not count against these limits.

Important Dates (Workshop Papers)

  • Papers Due (directly submitted to TRAC): July 31, 2022 (Sunday; extended from July 11, 2022)

  • Papers Due (committed via ACL-ARR): July 31, 2022 (Sunday)

  • Notification of Acceptance: August 22, 2022 (Monday)

  • Camera-ready papers due: September 5, 2022 (Monday)

  • Conference dates: October 12-17, 2022

Timezone: Anywhere on Earth (UTC-12)

Important Dates (Shared Task)

  • Training set release (Task 1): May 15, 2022

  • Test set release (for all tasks): July 2, 2022 (extended from June 25, 2022)

  • Submissions due: July 7, 2022 (extended from June 30, 2022)

  • Results announcement: July 12, 2022

  • System description papers due: July 31, 2022

  • Reviews for papers: August 22, 2022

  • Camera-ready versions due: September 12, 2022 (extended from September 5, 2022)

  • Conference dates: October 12-17, 2022

Timezone: Anywhere on Earth (UTC-12)

Contact

For any queries, send an email to coling.aggression[at]gmail[dot]com

Organising Chairs

  • Ritesh Kumar, Dr. Bhimrao Ambedkar University, India

  • Atul Kr. Ojha, University of Galway & Panlingua Language Processing LLP, India

  • Marcos Zampieri, George Mason University, USA

  • Shervin Malmasi, Amazon Inc., USA

  • Daniel Kadar, Research Institute for Linguistics, Hungarian Academy of Sciences, Hungary

Assistant Organisers

  • Siddharth Singh, Dr. Bhimrao Ambedkar University, India

  • Shyam Ratan, Dr. Bhimrao Ambedkar University, India

Program Committee

  • Atul Kr. Ojha, University of Galway & Panlingua Language Processing LLP, India

  • Bharathi Raja Chakravarthi, University of Galway

  • Bornini Lahiri, Indian Institute of Technology-Kharagpur, India

  • Bruno Emanuel Martins, IST and INESC-ID

  • Cheng-Te Li, National Cheng Kung University, Taiwan

  • Chuan-Jie Lin, National Taiwan Ocean University, Taiwan

  • David Jurgens, University of Michigan

  • Denis Gordeev, The Russian Presidential Academy of National Economy and Public Administration under the President of the Russian Federation

  • Dennis Tenen, Columbia University, USA

  • Dhairya Dalal, University of Galway

  • Els Lefever, LT3, Ghent University, Belgium

  • Faneva Ramiandrisoa, IRIT

  • Han Liu, Cardiff University

  • Hugo Jair Escalante, INAOE, Mexico

  • Koustava Goswami, University of Galway

  • Liang-Chih Yu, Yuan Ze University, Taiwan

  • Lun-Wei Ku, Academia Sinica, Taiwan

  • Lütfiye Seda Mut Altın, Pompeu Fabra University

  • Mainack Mondal, University of Chicago, USA

  • Manuel Montes-y-Gómez, INAOE, Mexico

  • Marco Guerini, Fondazione Bruno Kessler, Trento

  • Ming-Feng Tsai, National Chengchi University, Taiwan

  • Monojit Choudhury, Microsoft Turing

  • Nemanja Djuric, Aurora Innovation

  • Parth Patwa, Indian Institute of Information Technology, Sri City

  • Preslav Nakov, Qatar Computing Research Institute, Qatar

  • Priya Rani, University of Galway

  • Ritesh Kumar, Dr. Bhimrao Ambedkar University, India

  • Roman Klinger, University of Stuttgart, Germany

  • Ruifeng Xu, Harbin Institute of Technology, China

  • Saja Tawalbeh, University of Antwerp

  • Sara E. Garza, Universidad Autónoma de Nuevo León (UANL), Mexico

  • Shardul Suryawanshi, University of Galway

  • Shubhanshu Mishra, Twitter Inc.

  • Valerio Basile, University of Turin

  • Veronique Hoste, LT3, Ghent University, Belgium

  • Xavier Tannier, Université Paris-Sud, LIMSI, CNRS, France

  • Zeerak Waseem, University of Sheffield, UK

Shared Task Organisers

  • Shervin Malmasi, Amazon Inc., USA

  • Siddharth Singh, Dr. Bhimrao Ambedkar University, India

  • Shyam Ratan, Dr. Bhimrao Ambedkar University, India

  • Ritesh Kumar, Dr. Bhimrao Ambedkar University, India

  • Atul Kr. Ojha, University of Galway & Panlingua Language Processing LLP, India

  • Bharathi Raja Chakravarthi, University of Galway