Tasks
Task 1: Propaganda Identification
The first subtask is a binary classification problem. The systems must decide whether or not a given tweet contains propaganda techniques. Some examples:
Propaganda:
In West, Islam is considered alien to its western value. In China,that divisive distinction is not made as Islam been part of Chinese culture & civilisation for 1000yrs plus. Unfortunately, fundamental forces from West & some conservative strand of Islam are trying to sow divisions.
Not propaganda:
On the Eastern Mediterranean and Turkey, we are clear and determined in defending EU's interests and solidarity with Greece and Cyprus. We want to find paths towards a healthier relationship. It is in the mutual interest of both the EU and Turkey
Task 2: Propaganda Characterization
Once a message has been classified as propagandistic, the second task categorizes it according to the type of propaganda. The proposed categorization considers multiple techniques identified in the literature, clustered according to their rhetorical features. We propose a multiclass, multilabel classification task in which systems must decide, for each tweet, which of the available categories it fits. The proposed typology can be found here.
Evaluation will consider:
a) a coarse-grained categorization with four classes of propaganda (plus a negative class):
Group 0: Not propagandistic
Group 1: Appeal to Commonality
China adheres to the path of peaceful development & will never seek hegemony or engage in expansion, a pledge that has never been made by #US. China is a force for good in the world, a force for global peace & prosperity. China does not want to threaten, challenge or replace anyone.
Group 2: Discrediting the Opponent
US's political suppression on Chinese journalists and media organizations exposes its Cold War mentality and hypocrisy of so-called freedom of the press
Group 3: Loaded Language
The Radical Left, Do Nothing Democrats keep chanting when they put on the most unfair Witch Hunt in the history of the U.S. Congress. They had 17 Witnesses, we were allowed ZERO, and no lawyers. They didn't do their job, had no case. The Dems are scamming America!
Group 4: Appeal to Authority
The U.S. commends Italy for setting an example for its neighbors by again repatriating and prosecuting an Italian citizen who allegedly traveled to Syria to support ISIS. We urge other Western European countries to follow suit and take responsibility for their citizens.
b) a fine-grained categorization with 15 subclasses (plus a negative class): Flag Waving, Ad Populum / Ad antiquitatem, Name Calling, Undiplomatic Assertiveness / Whataboutism, Scapegoating, Propaganda Slinging, Appeal to Fear, Demonization, Personal Attacks, Doubt, Reductio Ad Hitlerum, Loaded Language, Appeal to False Authority and Bandwagoning.
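Because the task is multilabel, a tweet can belong to more than one group at once. One common way to represent such targets is a multi-hot vector. The sketch below illustrates this for the coarse-grained groups listed above. The function names and the choice to encode the negative class as an all-zero vector are assumptions made for illustration, not the official data format of the task.

```python
from typing import Iterable, List

# Coarse-grained groups as listed above; index 0 is the negative class.
COARSE_GROUPS = [
    "Not propagandistic",
    "Appeal to Commonality",
    "Discrediting the Opponent",
    "Loaded Language",
    "Appeal to Authority",
]


def encode_coarse(groups: Iterable[int]) -> List[int]:
    """Turn a collection of group ids into a multi-hot vector over groups 1-4.

    Assumption: the negative class (group 0) is encoded as the all-zero
    vector, so a 0 in the input is simply skipped.
    """
    vec = [0] * (len(COARSE_GROUPS) - 1)
    for g in groups:
        if g == 0:
            continue
        if not 1 <= g < len(COARSE_GROUPS):
            raise ValueError(f"unknown group id: {g}")
        vec[g - 1] = 1
    return vec


def decode_coarse(vec: List[int]) -> List[str]:
    """Map a multi-hot vector back to group names (negative class if empty)."""
    names = [COARSE_GROUPS[i + 1] for i, v in enumerate(vec) if v]
    return names or [COARSE_GROUPS[0]]
```

For example, a tweet annotated with groups 2 and 3 would be encoded as `encode_coarse([2, 3])`, and a non-propagandistic tweet as `encode_coarse([])`.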
Baselines
Baselines for Task 1
For the first task, we will provide an English baseline based on roberta-large (Liu et al., 2019), trained for binary classification: given a tweet, identify whether it is propaganda or not. For Spanish, we will use the equivalent roberta-large model, MarIA (Gutiérrez-Fandiño et al., 2021).
Baselines for Task 2
For the second task, we will provide two different baselines:
A roberta-large/MarIA model trained on all the classes of propaganda, including the negative class.
A roberta-large/MarIA model trained exclusively on the positive classes of propaganda, excluding the negative class, and applied to the output of the Task 1 baseline.
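The second baseline is a two-stage cascade: only tweets that the Task 1 model flags as propaganda are passed on to the Task 2 model. The control flow can be sketched as follows. This is a minimal illustration, not the organizers' implementation: `cascade_predict`, `toy_detector`, and `toy_characterizer` are hypothetical names, and the keyword rules merely stand in for the fine-tuned roberta-large/MarIA models.

```python
from typing import Callable, List, Sequence

# Hypothetical interfaces: any callable with these signatures will do.
BinaryModel = Callable[[str], bool]            # Task 1: propaganda or not
MultiLabelModel = Callable[[str], List[str]]   # Task 2: propaganda categories

NEGATIVE = "Not propagandistic"


def cascade_predict(
    tweets: Sequence[str],
    detector: BinaryModel,
    characterizer: MultiLabelModel,
) -> List[List[str]]:
    """Run the Task 2 model only on tweets the Task 1 model flags."""
    predictions = []
    for tweet in tweets:
        if detector(tweet):
            # Positive tweets get their categories from the second model,
            # which never has to predict the negative class itself.
            predictions.append(characterizer(tweet))
        else:
            predictions.append([NEGATIVE])
    return predictions


# Toy stand-ins for the fine-tuned models (assumptions for illustration only).
def toy_detector(tweet: str) -> bool:
    return "witch hunt" in tweet.lower()


def toy_characterizer(tweet: str) -> List[str]:
    return ["Loaded Language"]
```

One consequence of this design is that errors from the first stage propagate: a propagandistic tweet missed by the Task 1 model never reaches the Task 2 model.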
References
Gutiérrez-Fandiño, A., Armengol-Estapé, J., Pàmies, M., Llop-Palao, J., Silveira-Ocampo, J., Carrino, C. P., Gonzalez-Agirre, A., Armentano-Oller, C., Rodriguez-Penagos, C., & Villegas, M. (2021). Spanish Language Models. arXiv:2107.07253 [cs]. http://arxiv.org/abs/2107.07253
Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., & Stoyanov, V. (2019). RoBERTa: A Robustly Optimized BERT Pretraining Approach. arXiv:1907.11692 [cs]. http://arxiv.org/abs/1907.11692