Task Definition: Given the text of a news article, determine whether the main claim made in the article is true, partially true, false, or other (e.g., claims in dispute), and detect the topical domain of the article. This task is offered in English.
Subtask 3A: Multi-class fake news detection of news articles (English): Subtask 3A is the detection of fake news, framed as a four-class classification problem. The training data will be released in batches and will comprise roughly 1,000 articles with their respective labels. Given the text of a news article, determine whether the main claim made in the article is true, partially true, false, or other. Our definitions for the categories are as follows:
False - The main claim made in an article is untrue.
Partially False - The main claim of an article is a mixture of true and false information. The article contains both true and false information and cannot be considered 100% true. This category covers articles that different fact-checking services label as partially false, partially true, mostly true, miscaptioned, misleading, etc.
True - This rating indicates that the primary elements of the main claim are demonstrably true.
Other - An article that cannot be categorised as true, false, or partially false due to a lack of evidence about its claims. This category includes disputed and unproven articles.
Subtask 3B: Topical domain detection of news articles (English): Fact-checkers require background expertise to identify the truthfulness of an article, and categorisation by topic helps to automate the sampling process from a stream of data. Given the text of a news article, determine its topical domain. This is a classification problem: the task is to categorise fake news articles into five or more topical categories such as health, elections, conspiracy theories, etc. This subtask will be offered for a subset of the data of Subtask 3A.
File Format
These files contain an id, the title of the article, and the text of the article. The training data contains a fourth column with the label. Participants must submit a TSV file containing the id and the label.
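A submission file can be produced with the standard library, for example as sketched below. The column names `public_id` and `predicted_rating` are assumptions for illustration; check the official submission guidelines for the exact header.

```python
import csv

# Minimal sketch of writing a submission TSV, assuming the required
# columns are named "public_id" and "predicted_rating" (hypothetical names).
predictions = [
    ("1001", "false"),
    ("1002", "partially false"),
    ("1003", "true"),
]

with open("submission.tsv", "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f, delimiter="\t")
    writer.writerow(["public_id", "predicted_rating"])
    writer.writerows(predictions)
```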
Evaluation
This task is evaluated as a classification task. We will use the macro-averaged F1 score (F1-macro) to rank teams.
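F1-macro computes the F1 score per class and takes the unweighted mean, so minority classes such as "other" count as much as frequent ones. A pure-Python sketch (libraries such as scikit-learn provide an equivalent via `f1_score(..., average="macro")`):

```python
def f1_macro(y_true, y_pred):
    """Macro-averaged F1: per-class F1 scores, then the unweighted mean."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        # Harmonic mean of precision and recall; 0.0 when both are 0.
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        scores.append(f1)
    return sum(scores) / len(scores)

# Toy example with the four task labels.
gold = ["true", "false", "false", "partially false", "other"]
pred = ["true", "false", "true", "partially false", "other"]
print(round(f1_macro(gold, pred), 3))  # -> 0.833
```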
Datasets
We share the dataset only in the context of our task; it must not be used for any commercial purpose.
Dataset creation
For the task, several fact-checking sites were analysed. For each checked claim, the original article was identified manually. The text of these published fake news documents was then added to the collection together with the decision label from the fact-checking site. Because different sites use different labels, the labels were aggregated (see the following figure).
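The aggregation step can be pictured as a mapping from site-specific verdicts to the four task categories. The source labels below are illustrative, not the organisers' actual mapping table:

```python
# Hypothetical mapping from heterogeneous fact-checking verdicts to the
# four task categories; the keys shown here are examples only.
LABEL_MAP = {
    "false": "false",
    "partially false": "partially false",
    "partially true": "partially false",
    "mostly true": "partially false",
    "miscaptioned": "partially false",
    "misleading": "partially false",
    "true": "true",
    "in dispute": "other",
    "unproven": "other",
}

def aggregate(raw_label):
    # Unknown verdicts fall back to "other", mirroring the catch-all category.
    return LABEL_MAP.get(raw_label.strip().lower(), "other")
```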