CLEF2021 - CheckThat! Lab
Detecting Check-Worthy Claims, Previously Fact-Checked Claims, and Fake News
Welcome to the CheckThat! Lab at CLEF 2021, the 4th edition of the lab!
Proceedings
Main Proceedings: Working Notes of CLEF 2021 - Conference and Labs of the Evaluation Forum
Task 1 paper: http://ceur-ws.org/Vol-2936/paper-28.pdf
Task 2 paper: http://ceur-ws.org/Vol-2936/paper-29.pdf
Task 3 paper: http://ceur-ws.org/Vol-2936/paper-30.pdf
Lab Program
This year, lab activities will run on Tuesday and Wednesday, 21 & 22 September. All times are in the Bucharest time zone (GMT+3).
[21/9] CLEF Plenary Session
[21/9] Session 1: 11:30 - 13:00 (Chair: Giovanni Da San Martino)
[21/9] Session 2: 15:30 - 17:00 (Chair: Julia Maria Struß)
[21/9] Session 3: 17:30 - 19:00 (Chair: Alberto Barrón-Cedeño)
[24/9] Closing Session: 17:30 - 19:00
CLEF-2022 CheckThat! Lab
Invited talks
Invited talk: Fact-checking as a conversation
Abstract: Misinformation is considered one of the major challenges of our times resulting in numerous efforts against it. Fact-checking, the task of assessing whether a claim is true or false, is considered a key weapon in reducing its impact. In the first part of this talk I will present our recent and ongoing work on automating this task using natural language processing, moving beyond simply classifying claims as true or false in the following aspects: returning evidence for the predictions, factually correcting the claims and adversarial evaluation. In the second part of this talk, I will present an alternative approach to combatting misinformation via dialogue agents, and present results on how internet users engage in constructive disagreements and problem-solving deliberation.
Speaker: Andreas Vlachos
Bio: I am a senior lecturer at the Natural Language and Information Processing group at the Department of Computer Science and Technology at the University of Cambridge. Current projects include dialogue modelling, automated fact checking and imitation learning. I have also worked on semantic parsing, natural language generation and summarization, language modelling, information extraction, active learning, clustering and biomedical text mining. My research team is supported by grants from ERC, EPSRC, ESRC, Facebook, Amazon, Google, Huawei and the Isaac Newton Trust.
Invited talk: Research on Mitigating Misinformation at the IDIR Lab
Abstract: Our society is struggling with an unprecedented amount of falsehood. Human fact-checkers cannot keep up with the volume of online misinformation and the speed at which it spreads. This challenge creates an opportunity for automated fact-checking systems. We have been building ClaimBuster, an end-to-end system for data-driven fact-checking. While the development of the full-fledged system is still ongoing, several components of ClaimBuster are integrated and deployed at http://idir.uta.edu/claimbuster/. The ClaimBuster API is being used by the Duke Reporters’ Lab to create daily newsletters that recommend the most check-worthy claims to The Washington Post, PolitiFact, and other professional fact-checkers. The project is part of the IDIR Lab's inter-disciplinary research program in computational journalism. Under this program, we have also extensively worked on several other projects related to mitigating online misinformation. This talk will present a high-level overview of these projects, followed by a brief discussion of several directions of ongoing efforts.
Speaker: Chengkai Li
Bio: Dr. Chengkai Li is a Professor in the Department of Computer Science and Engineering at the University of Texas at Arlington. His research interests span several areas related to big data intelligence and data science, including databases, data mining and applied machine learning, natural language processing, and their applications in computational journalism. Particularly, his ongoing research projects include data-driven fact-checking, exceptional fact finding, and usability challenges in querying and exploring knowledge graphs. His publications received several awards at SIGMOD, VLDB, and CIDR. Dr. Chengkai Li received his Ph.D. degree in Computer Science from the University of Illinois at Urbana-Champaign. He graduated from Nanjing University with an M.Eng. degree and a B.S. degree in Computer Science.
Invited talk: Fact-checking on encrypted chat applications
Abstract: In this talk, I will present our work on fact-checking on WhatsApp. First, I will talk about methods we developed to collect data from WhatsApp at scale. Next, I will talk about research done using such data, and how it differs from other open social media platforms like Twitter, Facebook or Reddit. Finally, I will talk about our approach to building pipelines for fact-checking content in an end-to-end encrypted setting. I will end with a vision of challenges and opportunities in this space.
Speaker: Kiran Garimella
Bio: Kiran Garimella’s research deals with using large-scale data to tackle societal issues such as misinformation, political polarization, or hate speech. Prior to joining Rutgers, Dr. Garimella was the Michael Hammer postdoc at the Institute for Data, Systems and Society at MIT. Before joining MIT, he was a postdoc at EPFL, Switzerland. His work on studying and mitigating polarization on social media won the best student paper awards at top computer science conferences. Kiran received his Ph.D. in computer science at Aalto University, Finland, and Masters & Bachelors from IIIT Hyderabad, India. Prior to his Ph.D., he worked as a Research Engineer at Yahoo Research, Barcelona, and QCRI, Doha.
Recent Updates
09 May 2021: Deadline for Task 3 is extended until Monday, 10th of May.
01 May 2021: Leaderboard for Task1A-Spanish is released.
01 May 2021: Test input data for all subtasks of Task 1 and Task 2 are released.
01 May 2021: Task-3 leaderboard and submission site released.
01 May 2021: Training data for subtask-3b is released.
30 April 2021: Second batch of training data for subtask-3a is released.
26 April 2021: Task-2 leaderboard and submission site released.
26 April 2021: Task-1 leaderboard and submission site released.
21 April 2021: English first batch of training data for subtask-3a is released (please don't forget to send the data sharing agreement)
07 April 2021: English appetizer data for subtask-3a is released (training data will follow next week).
06 April 2021: English training data for subtask-2a is released.
17 March 2021: Arabic training data for subtask-2a is released.
13 March 2021: Arabic training data for subtask-1a is released.
03 March 2021: Turkish training data for subtask-1a is released.
23 February 2021: Bulgarian training data for subtask-1a is released.
22 February 2021: English training data for subtask-2b is released.
07 February 2021: English training data for subtask-1b is released.
07 February 2021: English training data for subtask-1a is released.
19 January 2021: Spanish training data for subtask-1a is released.
08 October 2020: Website is up!
Lab Registration
To register in the CheckThat! lab, please visit: http://clef2021-labs-registration.dei.unipd.it/registrationForm.php
Leaderboard and Submission sites
To submit and view your results on the test and dev data, kindly use the following leaderboard:
Tasks
See the overview of the tasks here, or visit the task-specific pages below.
Task 1 - Check-Worthiness Estimation : Given a claim, detect whether it is worth fact-checking.
Datasets
Task 2 - Verified Claim Retrieval: Given a check-worthy claim, and a set of previously fact-checked claims, determine whether the claim has been previously fact-checked.
Datasets:
Task 3 - Fake News Detection: Given the text of a news article, determine whether the claims made in the article are true, partially true, false or other (e.g., claims in dispute) and also detect the topical domain of the article.
Datasets:
You can also check out the main repo containing all the scripts and data, CheckThat! Lab-2021.
Important Dates
All times are Anywhere on Earth (AoE).
16 November 2020: Registration opens
30 April 2021: Registration closes
7 May 2021 8 May 2021: End of Evaluation Cycle
28 May 2021: Submission of Participant Papers [CEUR-WS]
Discussion Group
Please join our discussion group clef-factcheck@googlegroups.com to receive announcements and participate in discussions.
Organizers
Preslav Nakov, Qatar Computing Research Institute, HBKU
Giovanni Da San Martino, Qatar Computing Research Institute, HBKU
Tamer Elsayed, Qatar University
Alberto Barrón-Cedeño, Università di Bologna
Rubén Míguez, Newtral Media Audiovisual, Spain
Firoj Alam, Qatar Computing Research Institute, HBKU
Shaden Shaar, Qatar Computing Research Institute, HBKU
Maram Hasanain, Qatar University
Fatima Haouari, Qatar University
Nikolay Babulkov, Sofia University
Alex Nikolov, Sofia University
Thomas Mandl, University of Hildesheim
Julia Maria Struß, University of Applied Sciences Potsdam
Gautam Kishore Shahi, University of Duisburg-Essen
Sandip Modha, LDRP Institute of Technology and Research
Mucahid Kutlu, TOBB Economy and Technology University
Yavuz Selim Kartal, TOBB Economy and Technology University
Citation
You can find the overview papers on the CLEF2021-CheckThat! Lab as well as of the individual tasks below:
@InProceedings{CheckThat:ECIR2021,
author = {Preslav Nakov and
Da San Martino, Giovanni and
Tamer Elsayed and
Alberto Barr{\'{o}}n{-}Cede{\~{n}}o and
Rub\'{e}n M\'{i}guez and
Shaden Shaar and
Firoj Alam and
Fatima Haouari and
Maram Hasanain and
Nikolay Babulkov and
Alex Nikolov and
Shahi, Gautam Kishore and
Struß, Julia Maria and
Thomas Mandl},
title = {The {CLEF}-2021 {CheckThat}! Lab on Detecting Check-Worthy Claims, Previously Fact-Checked Claims, and Fake News},
booktitle = {Proceedings of the 43rd European Conference on Information Retrieval},
series = {ECIR~'21},
pages = {639--649},
address = {Lucca, Italy},
month = {March},
year = {2021},
url = {https://link.springer.com/chapter/10.1007/978-3-030-72240-1_75},
}
@InProceedings{clef-checkthat:2021:LNCS,
author = {Preslav Nakov and
Da San Martino, Giovanni and
Tamer Elsayed and
Alberto Barr{\'{o}}n{-}Cede{\~{n}}o and
Rub\'{e}n M\'{i}guez and
Shaden Shaar and
Firoj Alam and
Fatima Haouari and
Maram Hasanain and
Watheq Mansour and
Bayan Hamdan and
Zien Sheikh Ali and
Nikolay Babulkov and
Alex Nikolov and
Shahi, Gautam Kishore and
Struß, Julia Maria and
Thomas Mandl and
Mucahid Kutlu and
Yavuz Selim Kartal},
title = "Overview of the {CLEF}-2021 {CheckThat}! Lab on Detecting Check-Worthy Claims, Previously Fact-Checked Claims, and Fake News",
year = {2021},
booktitle = "Proceedings of the 12th International Conference of the CLEF Association: Information Access Evaluation Meets Multilinguality, Multimodality, and Visualization",
series = {CLEF~'2021},
address = {Bucharest, Romania (online)},
url ={https://link.springer.com/chapter/10.1007/978-3-030-85251-1_19}
}
@InProceedings{clef-checkthat:2021:task1,
author = {Shaden Shaar and
Maram Hasanain and
Bayan Hamdan and
Zien Sheikh Ali and
Fatima Haouari and
Alex Nikolov and
Mucahid Kutlu and
Yavuz Selim Kartal and
Firoj Alam and
Da San Martino, Giovanni and
Alberto Barr{\'{o}}n{-}Cede{\~{n}}o and
Rub\'{e}n M\'{i}guez and
Tamer Elsayed and
Preslav Nakov},
title = "Overview of the {CLEF}-2021 {CheckThat}! Lab Task 1 on Check-Worthiness Estimation in Tweets and Political Debates",
year = {2021},
booktitle = "Working Notes of CLEF 2021---Conference and Labs of the Evaluation Forum",
series = {CLEF~'2021},
address = {Bucharest, Romania (online)},
url={http://ceur-ws.org/Vol-2936/paper-28.pdf}
}
@InProceedings{clef-checkthat:2021:task2,
author = {Shaden Shaar and
Fatima Haouari and
Watheq Mansour and
Maram Hasanain and
Nikolay Babulkov and
Firoj Alam and
Da San Martino, Giovanni and
Tamer Elsayed and
Preslav Nakov},
title = "Overview of the {CLEF}-2021 {CheckThat}! Lab Task 2 on Detecting Previously Fact-Checked Claims in Tweets and Political Debates",
year = {2021},
booktitle = "Working Notes of CLEF 2021---Conference and Labs of the Evaluation Forum",
series = {CLEF~'2021},
address = {Bucharest, Romania (online)},
url={http://ceur-ws.org/Vol-2936/paper-29.pdf}
}
@InProceedings{clef-checkthat:2021:task3,
author = {Shahi, Gautam Kishore and
Struß, Julia Maria and
Thomas Mandl},
title = "Overview of the {CLEF}-2021 {CheckThat}! Lab Task 3 on Fake News Detection",
year = {2021},
booktitle = "Working Notes of CLEF 2021---Conference and Labs of the Evaluation Forum",
series = {CLEF~'2021},
address = {Bucharest, Romania (online)},
url={http://ceur-ws.org/Vol-2936/paper-30.pdf}
}