Task 1: Identifying Relevant Claims in Tweets

Don't forget to register through the CLEF 2022 Lab Registration before 22 April 2022; otherwise, your submission will NOT be considered!


Definition

Subtask 1A: Check-worthiness of tweets: Given a tweet, predict whether it is worth fact-checking. This is a binary classification task with two labels: Yes and No. This subtask runs in six languages:

  • Arabic

  • Bulgarian

  • Dutch

  • English

  • Spanish

  • Turkish

Subtask 1B: Verifiable factual claims detection: Given a tweet, predict whether it contains a verifiable factual claim. This is a binary classification task with two labels: Yes and No. This subtask runs in five languages:

  • Arabic

  • Bulgarian

  • Dutch

  • English

  • Turkish

Subtask 1C: Harmful tweet detection: Given a tweet, predict whether it is harmful to society. This is a binary classification task with two labels: Yes and No. This subtask runs in five languages:

  • Arabic

  • Bulgarian

  • Dutch

  • English

  • Turkish

Subtask 1D: Attention-worthy tweet detection: Given a tweet, predict whether it should get the attention of policy makers and why. This is a multi-class classification task with nine labels: (i) No, not interesting, (ii) Yes, asks question, (iii) Yes, blame authorities, (iv) Yes, calls for action, (v) Yes, classified as in harmful task, (vi) Yes, contains advice, (vii) Yes, discusses action taken, (viii) Yes, discusses cure, and (ix) Yes, other. This subtask runs in five languages:

  • Arabic

  • Bulgarian

  • Dutch

  • English

  • Turkish

Evaluation

This task is evaluated as a classification task. For subtasks 1A and 1C, we use the F1 measure with respect to the positive class; for subtask 1B, we use accuracy; and for subtask 1D, we use weighted F1.
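For orientation, the three metrics can be sketched in plain Python as below. This is an illustrative implementation only, not the official scorer (the official scoring scripts are in the task repository); the label strings "Yes"/"No" follow the task definitions above.

```python
from collections import Counter

def f1_positive(y_true, y_pred, positive="Yes"):
    """F1 with respect to the positive class (used for subtasks 1A and 1C)."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

def accuracy(y_true, y_pred):
    """Fraction of exactly matching labels (used for subtask 1B)."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def weighted_f1(y_true, y_pred):
    """Per-class F1 averaged with weights proportional to each class's
    support in the gold labels (used for subtask 1D)."""
    support = Counter(y_true)
    return sum(f1_positive(y_true, y_pred, positive=c) * n
               for c, n in support.items()) / len(y_true)

gold = ["Yes", "Yes", "No", "No"]
pred = ["Yes", "No", "No", "Yes"]
print(f1_positive(gold, pred))  # 0.5
print(accuracy(gold, pred))     # 0.5
print(weighted_f1(gold, pred))  # 0.5
```

Note that positive-class F1 ignores performance on the "No" class entirely, whereas weighted F1 rewards getting every class right in proportion to how often it occurs.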

Datasets

Scorers, Format Checkers, and Baseline Scripts

All scripts can be found on the main repo for the lab, CheckThat! Lab Task 1: https://gitlab.com/checkthat_lab/clef2022-checkthat-lab/clef2022-checkthat-lab/-/tree/main/task1

Submission Guidelines

  • Create only one account per team, and submit all runs through that account only.

  • We will keep the leaderboard private until the end of the submission period; hence, results will not be shown upon submission. All results will be released after the evaluation period.

  • You are allowed at most 200 submissions per day for each subtask.

  • The last file submitted to the leaderboard will be considered the final submission.

  • The name of the output file must be "subtask1[A/B/C/D]_SHORT-NAME-OF-THE-SUBTASK_LANG.tsv" with a ".tsv" extension (e.g., subtask1B_claim_arabic.tsv); otherwise, you will get an error on the leaderboard. The subtasks are 1A, 1B, 1C, and 1D, and the corresponding short names are checkworthy, claim, harmful, and attentionworthy. For Task 1, there are six languages (Arabic, Bulgarian, Dutch, English, Spanish, and Turkish).

  • Zip the TSV file (e.g., "zip subtask1B_claim_arabic.zip subtask1B_claim_arabic.tsv") and submit the zip through the Codalab page.
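The naming and zipping steps above can be scripted. The sketch below writes a predictions file and zips it with Python's standard zipfile module; the tweet IDs and the two-column (id, label) TSV layout are assumptions for illustration only, so check the format checkers in the task repository for the exact schema.

```python
import zipfile

# Hypothetical predictions (id, label); the real schema is defined by the
# format checkers in the task repository.
predictions = [("1235648554338791427", "Yes"), ("1235287380292235264", "No")]

# File name must follow subtask1[A/B/C/D]_SHORT-NAME-OF-THE-SUBTASK_LANG.tsv
filename = "subtask1B_claim_arabic.tsv"
with open(filename, "w", encoding="utf-8") as out:
    for tweet_id, label in predictions:
        out.write(f"{tweet_id}\t{label}\n")

# The leaderboard expects the TSV zipped, equivalent to:
#   zip subtask1B_claim_arabic.zip subtask1B_claim_arabic.tsv
with zipfile.ZipFile(filename.replace(".tsv", ".zip"), "w") as zf:
    zf.write(filename)
```

Running the format checker on the TSV before zipping will catch naming and schema errors ahead of the leaderboard.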

Leaderboard and Submission Site

Please submit your results on test data here: https://codalab.lisn.upsaclay.fr/competitions/4230