CLEF2021 - CheckThat! Lab

Detecting Check-Worthy Claims, Previously Fact-Checked Claims, and Fake News

Welcome to the CheckThat! Lab at CLEF 2021, the 4th edition of the lab!

Proceedings

Main Proceedings: Working Notes of CLEF 2021 - Conference and Labs of the Evaluation Forum
- Task 1 paper: http://ceur-ws.org/Vol-2936/paper-28.pdf
- Task 2 paper: http://ceur-ws.org/Vol-2936/paper-29.pdf
- Task 3 paper: http://ceur-ws.org/Vol-2936/paper-30.pdf

Lab Program

This year, lab activities will run on Tuesday and Wednesday, 21 & 22 September. All times follow Bucharest timezone (GMT+3).

[21/9] CLEF Plenary Session

[21/9] Session 1: 11:30 - 13:00 (Chair: Giovanni Da San Martino)

[21/9] Session 2: 15:30 - 17:00 (Chair: Julia Maria Struß)

[21/9] Session 3: 17:30 - 19:00 (Chair: Alberto Barrón-Cedeño)

[24/9] Closing Session: 17:30 - 19:00

CLEF-2022 CheckThat! Lab

Invited talks

Invited talk: Fact-checking as a conversation

Abstract: Misinformation is considered one of the major challenges of our times resulting in numerous efforts against it. Fact-checking, the task of assessing whether a claim is true or false, is considered a key weapon in reducing its impact. In the first part of this talk I will present our recent and ongoing work on automating this task using natural language processing, moving beyond simply classifying claims as true or false in the following aspects: returning evidence for the predictions, factually correcting the claims and adversarial evaluation. In the second part of this talk, I will present an alternative approach to combatting misinformation via dialogue agents, and present results on how internet users engage in constructive disagreements and problem-solving deliberation.

Speaker: Andreas Vlachos
Bio: I am a senior lecturer at the Natural Language and Information Processing group at the Department of Computer Science and Technology at the University of Cambridge. Current projects include dialogue modelling, automated fact checking and imitation learning. I have also worked on semantic parsing, natural language generation and summarization, language modelling, information extraction, active learning, clustering and biomedical text mining. My research team is supported by grants from ERC, EPSRC, ESRC, Facebook, Amazon, Google, Huawei and the Isaac Newton Trust.

Invited talk: Research on Mitigating Misinformation at the IDIR Lab

Abstract: Our society is struggling with an unprecedented amount of falsehood. Human fact-checkers cannot keep up with the volume of online misinformation and the speed at which it spreads. This challenge creates an opportunity for automated fact-checking systems. We have been building ClaimBuster, an end-to-end system for data-driven fact-checking. While the development of the full-fledged system is still ongoing, several components of ClaimBuster are integrated and deployed at http://idir.uta.edu/claimbuster/. The ClaimBuster API is being used by the Duke Reporters’ Lab to create daily newsletters that recommend the most check-worthy claims to The Washington Post, PolitiFact, and other professional fact-checkers. The project is part of the IDIR Lab's inter-disciplinary research program in computational journalism. Under this program, we have also worked extensively on several other projects related to mitigating online misinformation. This talk will present a high-level overview of these projects, followed by a brief discussion of several directions of ongoing efforts.

Speaker: Chengkai Li
Bio: Dr. Chengkai Li is a Professor in the Department of Computer Science and Engineering at the University of Texas at Arlington. His research interests span several areas related to big data intelligence and data science, including databases, data mining and applied machine learning, natural language processing, and their applications in computational journalism. Particularly, his ongoing research projects include data-driven fact-checking, exceptional fact finding, and usability challenges in querying and exploring knowledge graphs. His publications received several awards at SIGMOD, VLDB, and CIDR. Dr. Chengkai Li received his Ph.D. degree in Computer Science from the University of Illinois at Urbana-Champaign. He graduated from Nanjing University with an M.Eng. degree and a B.S. degree in Computer Science.

Invited talk: Fact-checking on encrypted chat applications

Abstract: In this talk, I will present our work on fact-checking on WhatsApp. First, I will talk about methods we developed to collect data from WhatsApp at scale. Next, I will talk about research done using such data, and how it differs from other open social media platforms like Twitter, Facebook or Reddit. Finally, I will talk about our approach to building pipelines for fact-checking content in an end-to-end encrypted setting. I will end with a vision of challenges and opportunities in this space.

Speaker: Kiran Garimella
Bio: Kiran Garimella’s research deals with using large-scale data to tackle societal issues such as misinformation, political polarization, or hate speech. Prior to joining Rutgers, Dr. Garimella was the Michael Hammer postdoc at the Institute for Data, Systems and Society at MIT. Before joining MIT, he was a postdoc at EPFL, Switzerland. His work on studying and mitigating polarization on social media won the best student paper awards at top computer science conferences. Kiran received his Ph.D. in computer science at Aalto University, Finland, and Masters & Bachelors from IIIT Hyderabad, India. Prior to his Ph.D., he worked as a Research Engineer at Yahoo Research, Barcelona, and QCRI, Doha.

Recent Updates

09 May 2021: Deadline for Task 3 is extended until Monday, 10th of May.
01 May 2021: Leaderboard for Task1A-Spanish is released.
01 May 2021: Test input data for all subtasks of Task 1 and Task 2 are released.
01 May 2021: Task-3 leaderboard and submission site released.
01 May 2021: Training data for subtask-3b is released.
30 April 2021: Second batch of training data for subtask-3a is released.
26 April 2021: Task-2 leaderboard and submission site released.
26 April 2021: Task-1 leaderboard and submission site released.
21 April 2021: First batch of English training data for subtask-3a is released (please don't forget to send the data sharing agreement).
07 April 2021: English appetizer data for subtask-3a is released (training data will follow next week).
06 April 2021: English training data for subtask-2a is released.
17 March 2021: Arabic training data for subtask-2a is released.
13 March 2021: Arabic training data for subtask-1a is released.
03 March 2021: Turkish training data for subtask-1a is released.
23 February 2021: Bulgarian training data for subtask-1a is released.
22 February 2021: English training data for subtask-2b is released.
07 February 2021: English training data for subtask-1b is released.
07 February 2021: English training data for subtask-1a is released.
19 January 2021: Spanish training data for subtask-1a is released.
08 October 2020: Website is up!

Lab Registration

To register for the CheckThat! Lab, please visit: http://clef2021-labs-registration.dei.unipd.it/registrationForm.php

Leaderboard and Submission sites

To submit and view your results on the test and dev data, kindly use the following leaderboard:

Tasks

Check the overview of the tasks here or on the following task-specific pages.

Task 1 - Check-Worthiness Estimation: Given a claim, detect whether it is worth fact-checking.

Datasets:

Task 2 - Verified Claim Retrieval: Given a check-worthy claim, and a set of previously fact-checked claims, determine whether the claim has been previously fact-checked.

Datasets:

Task 3 - Fake News Detection: Given the text of a news article, determine whether the claims made in the article are true, partially true, false or other (e.g., claims in dispute) and also detect the topical domain of the article.

Datasets:

You can also check out the main repo containing all the scripts and data, CheckThat! Lab-2021.

Important Dates

All deadlines are Anywhere on Earth (AoE).

16 November 2020: Registration opens
30 April 2021: Registration closes
~~7 May 2021~~ 8 May 2021: End of Evaluation Cycle
28 May 2021: Submission of Participant Papers [CEUR-WS]

Discussion Group

Please join our discussion group clef-factcheck@googlegroups.com to receive announcements and participate in discussions.

Organizers

Preslav Nakov, Qatar Computing Research Institute, HBKU
Giovanni Da San Martino, Qatar Computing Research Institute, HBKU
Tamer Elsayed, Qatar University
Alberto Barrón-Cedeño, Università di Bologna
Rubén Míguez, Newtral Media Audiovisual, Spain
Firoj Alam, Qatar Computing Research Institute, HBKU
Shaden Shaar, Qatar Computing Research Institute, HBKU
Maram Hasanain, Qatar University
Fatima Haouari, Qatar University
Nikolay Babulkov, Sofia University
Alex Nikolov, Sofia University
Thomas Mandl, University of Hildesheim
Julia Maria Struß, University of Applied Sciences Potsdam
Gautam Kishore Shahi, University of Duisburg-Essen
Sandip Modha, LDRP Institute of Technology and Research
Mucahid Kutlu, TOBB Economy and Technology University
Yavuz Selim Kartal, TOBB Economy and Technology University

Citation

The overview papers on the CLEF-2021 CheckThat! Lab and on the individual tasks are listed below:

@InProceedings{CheckThat:ECIR2021,
  author = {Preslav Nakov and
            Da San Martino, Giovanni and
            Tamer Elsayed and
            Alberto Barr{\'{o}}n{-}Cede{\~{n}}o and
            Rub\'{e}n M\'{i}guez and
            Shaden Shaar and
            Firoj Alam and
            Fatima Haouari and
            Maram Hasanain and
            Nikolay Babulkov and
            Alex Nikolov and
            Shahi, Gautam Kishore and
            Struß, Julia Maria and
            Thomas Mandl},
  title = {The {CLEF}-2021 {CheckThat}! Lab on Detecting Check-Worthy Claims, Previously Fact-Checked Claims, and Fake News},
  booktitle = {Proceedings of the 43rd European Conference on Information Retrieval},
  series = {ECIR~'21},
  pages = {639--649},
  address = {Lucca, Italy},
  month = {March},
  year = {2021},
  url = {https://link.springer.com/chapter/10.1007/978-3-030-72240-1_75},
}

@InProceedings{clef-checkthat:2021:LNCS,
  author = {Preslav Nakov and
            Da San Martino, Giovanni and
            Tamer Elsayed and
            Alberto Barr{\'{o}}n{-}Cede{\~{n}}o and
            Rub\'{e}n M\'{i}guez and
            Shaden Shaar and
            Firoj Alam and
            Fatima Haouari and
            Maram Hasanain and
            Watheq Mansour and
            Bayan Hamdan and
            Zien Sheikh Ali and
            Nikolay Babulkov and
            Alex Nikolov and
            Shahi, Gautam Kishore and
            Struß, Julia Maria and
            Thomas Mandl and
            Mucahid Kutlu and
            Yavuz Selim Kartal},
  title = "Overview of the {CLEF}-2021 {CheckThat}! Lab on Detecting Check-Worthy Claims, Previously Fact-Checked Claims, and Fake News",
  year = {2021},
  booktitle = "Proceedings of the 12th International Conference of the CLEF Association: Information Access Evaluation Meets Multilinguality, Multimodality, and Visualization",
  series = {CLEF~'2021},
  address = {Bucharest, Romania (online)},
  url = {https://link.springer.com/chapter/10.1007/978-3-030-85251-1_19}
}

@InProceedings{clef-checkthat:2021:task1,
  author = {Shaden Shaar and
            Maram Hasanain and
            Bayan Hamdan and
            Zien Sheikh Ali and
            Fatima Haouari and
            Alex Nikolov and
            Mucahid Kutlu and
            Yavuz Selim Kartal and
            Firoj Alam and
            Da San Martino, Giovanni and
            Alberto Barr{\'{o}}n{-}Cede{\~{n}}o and
            Rub\'{e}n M\'{i}guez and
            Tamer Elsayed and
            Preslav Nakov},
  title = "Overview of the {CLEF}-2021 {CheckThat}! Lab Task 1 on Check-Worthiness Estimation in Tweets and Political Debates",
  year = {2021},
  booktitle = "Working Notes of CLEF 2021---Conference and Labs of the Evaluation Forum",
  series = {CLEF~'2021},
  address = {Bucharest, Romania (online)},
  url = {http://ceur-ws.org/Vol-2936/paper-28.pdf}
}

@InProceedings{clef-checkthat:2021:task2,
  author = {Shaden Shaar and
            Fatima Haouari and
            Watheq Mansour and
            Maram Hasanain and
            Nikolay Babulkov and
            Firoj Alam and
            Da San Martino, Giovanni and
            Tamer Elsayed and
            Preslav Nakov},
  title = "Overview of the {CLEF}-2021 {CheckThat}! Lab Task 2 on Detecting Previously Fact-Checked Claims in Tweets and Political Debates",
  year = {2021},
  booktitle = "Working Notes of CLEF 2021---Conference and Labs of the Evaluation Forum",
  series = {CLEF~'2021},
  address = {Bucharest, Romania (online)},
  url = {http://ceur-ws.org/Vol-2936/paper-29.pdf}
}

@InProceedings{clef-checkthat:2021:task3,
  author = {Shahi, Gautam Kishore and
            Struß, Julia Maria and
            Thomas Mandl},
  title = "Overview of the {CLEF}-2021 {CheckThat}! Lab Task 3 on Fake News Detection",
  year = {2021},
  booktitle = "Working Notes of CLEF 2021---Conference and Labs of the Evaluation Forum",
  series = {CLEF~'2021},
  address = {Bucharest, Romania (online)},
  url = {http://ceur-ws.org/Vol-2936/paper-30.pdf}
}

Previous Editions of CheckThat!

CheckThat! 2020

CheckThat! 2019

CheckThat! 2018