Datasets & Tools

Arabic Datasets

All Arabic datasets will be added gradually to the repository here.

Task 1:

    • CT20-AR-Train-T1: dataset includes 3 training topics, 1,500 tweets and corresponding check-worthiness labels.

    • CT20-AR-Test-T1: dataset includes 12 testing topics, 6,000 tweets and corresponding check-worthiness labels.

Task 3:

Task 4:

English Datasets

Task 1:

    • CT20-EN-Train-T1: dataset includes 1 training topic, 488 tweets and corresponding check-worthiness labels.

    • CT20-EN-Test-T1: 140 tweets on the same topic as the training data have been released as Test Data.

Task 2:

    • CT20-EN-Train-T2: dataset includes 1,003 tweets and the corresponding 10,373 verified claims.

    • CT20-EN-Test-T2: 200 tweets, to be matched against the 10,373 already verified claims, have been released as Test Data.

Task 5:

    • CT20-EN-Train-T5: dataset includes 50 fact-checked documents (debates, speeches, press conferences, etc.).

    • CT20-EN-Test-T5: 20 debates have been released as Test Data.