Tasks

Task 1: Propaganda Identification

The first subtask is a binary classification problem: systems must decide whether or not a given tweet contains propaganda techniques. Some examples:

Task 2: Propaganda Characterization

Once a message has been classified as propagandistic, the second task categorizes it according to the type of propaganda. The proposed categorization covers multiple techniques identified in the literature, clustered according to their rhetorical features. We propose a multiclass, multilabel classification task in which systems must decide, for each tweet, which of the available categories it fits. The proposed typology can be found here
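To illustrate the multilabel setting, a tweet's annotation can be pictured as a binary indicator vector with one position per category, where several positions may be active at once. The following is a minimal sketch only: the category names are hypothetical placeholders (the actual typology is defined by the organizers), the count of four is an assumption chosen for illustration, and the function names are not part of any official task tooling.

```python
# Hypothetical placeholder names; the real categories come from the task typology.
CATEGORIES = ["category_a", "category_b", "category_c", "category_d"]


def encode_labels(labels):
    """Turn a collection of category names into a binary indicator vector."""
    labels = set(labels)
    unknown = labels - set(CATEGORIES)
    if unknown:
        raise ValueError(f"Unknown categories: {sorted(unknown)}")
    return [1 if category in labels else 0 for category in CATEGORIES]


def decode_labels(vector):
    """Turn a binary indicator vector back into a list of category names."""
    if len(vector) != len(CATEGORIES):
        raise ValueError("Vector length must match the number of categories")
    return [category for category, flag in zip(CATEGORIES, vector) if flag]


def is_negative(vector):
    """A tweet with no active category falls into the negative class."""
    return not any(vector)
```

Under this representation, a tweet that fits two categories has two active positions, and a tweet with none active corresponds to the negative class.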


Evaluation will consider: 


a) a coarse-grained categorization with four classes of propaganda (plus a negative class):


Baselines

Baselines for Task 1

For the first task, we will provide an English baseline: roberta-large (Liu et al., 2019) trained for binary classification, that is, given a tweet, identify whether it is propaganda or not. For Spanish, we will use the equivalent roberta-large model, named MarIA (Gutiérrez-Fandiño et al., 2021).


Baselines for Task 2

For the second task, we will provide two different baselines: 

References



Gutiérrez-Fandiño, A., Armengol-Estapé, J., Pàmies, M., Llop-Palao, J., Silveira-Ocampo, J., Carrino, C. P., Gonzalez-Agirre, A., Armentano-Oller, C., Rodriguez-Penagos, C., & Villegas, M. (2021). Spanish Language Models. arXiv:2107.07253 [cs]. http://arxiv.org/abs/2107.07253 


Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., & Stoyanov, V. (2019). RoBERTa: A Robustly Optimized BERT Pretraining Approach. arXiv:1907.11692 [cs]. http://arxiv.org/abs/1907.11692