LREC 2022 Tutorial

on

Building Reliable Datasets for Aggressive and Hateful Language Identification: Theory, Taxonomies and Approaches

May 20, 2022

Marseille, France

Tutorial Outline

The tutorial is divided into two parts of 90 minutes each, and each part contains three modules of roughly 30 minutes. An outline of the topics to be covered is given below.

PART 1 [90 minutes]

1. Introduction - An overview of (im)politeness and its definition
2. Sociopragmatic Models of (im)politeness
3. Pragmalinguistic research involving (im)politeness and aggression: Ritual Frame Indicating Expressions Theory
  - Comparative Analysis of English and German
  - Comparative Analysis of English and Chinese
  - Comparative Analysis of English, Hindi and Bangla

PART 2 [90 minutes]

1. Major annotation taxonomies in NLP
  - Offensive Language
  - Abusive Language
  - Hate Speech
  - Aggressive Language
2. Annotating datasets for abusive language identification
  - Building datasets from scratch
  - Analysis to datasets and datasets to analysis
3. Sociopragmatic models and Mapping Multiple Datasets
  - Mapping Aggressive and Offensive Language Datasets
  - Mapping Hate Speech and Abusive Language Datasets

In the second part, participants will be given a small dataset to analyse, and they are also encouraged to bring their own small dataset(s) for analysis.
