Datasets & Tools
Arabic Datasets
All Arabic datasets will be added gradually to the repository here.
Task 1:
CT20-AR-Train-T1: dataset includes 3 training topics, 1,500 tweets and corresponding check-worthiness labels.
CT20-AR-Test-T1: dataset includes 12 testing topics, 6,000 tweets and corresponding check-worthiness labels.
Task 3:
Training can be done using the train dataset and test dataset for sub-task C of the 2019 edition of the lab.
Task 4:
Training can be done using the train dataset and test dataset for sub-task D of the 2019 edition of the lab.
English Datasets
Task 1:
CT20-EN-Train-T1: dataset includes 1 training topic, 488 tweets and corresponding check-worthiness labels.
CT20-EN-Test-T1: 140 tweets on the same topic as the training data have been released as Test Data.
Task 2:
CT20-EN-Train-T2: dataset includes 1,003 tweets and corresponding 10,373 verified claims.
CT20-EN-Test-T2: 200 tweets, to be matched against the 10,373 already verified claims, have been released as Test Data.
Task 5:
CT20-EN-Train-T5: dataset includes 50 fact-checked documents (debates, speeches, press conferences, etc.).
CT20-EN-Test-T5: 20 debates have been released as Test Data.