ICWSM 2022 Data Challenge: Health-Related Discourse

June 6, 2022 (Virtual)

ICWSM 2022 is hosting the third ICWSM data challenge with the goal of bringing together researchers to analyze and understand emerging societal issues. The data challenge is a space, grounded in open data, where researchers can exchange ideas, discuss ongoing work, and foster collaboration. This year’s data challenge theme is Health-Related Discourse on the Web.

Due to the ongoing COVID-19 pandemic, we are experiencing a worldwide transition to internet spaces for everyday needs and activities. This massive increase in online interactions is accompanied by growing volumes of antisocial behavior like harassment, hate speech, and disinformation, especially related to health issues, the origin and spread of the virus, and countermeasures that tackle the pandemic (e.g., the effectiveness of vaccines). These behaviors can cause substantial real-world harm, so it is imperative for our research community to foster advances on this problem. Toward this goal, this year’s data challenge encourages submissions that offer solutions to these challenges and help make the Web a safer place for online discourse, especially on health-related topics.

Program (all times EDT)

[8:30 - 8:45] Introduction

[8:45 - 9:30] Keynote Talk - Meeyoung Cha (Associate Professor at KAIST & IBS, Korea)

Title: Facts Before Rumors: Reducing the Impact of Misinformation

Abstract: During the early days of the pandemic, we launched an online campaign to debunk COVID-19 rumors, disseminating accurate coronavirus-related information to over 50,000 individuals in 151 countries. The campaign aimed to collect fact-checked information from regions that had already suffered from the infodemic and spread it to other regions where the infodemic was in its infancy. Alongside our campaign, we conducted a series of research projects to understand what kind of coronavirus-related information was being shared online. Focusing on misinformation, we quantified the spread of COVID-19 misinformation through survey studies. This talk will introduce our Facts Before Rumors campaign (https://ibs.re.kr/fbr/) as well as other ongoing efforts related to fact-checking health misinformation.

[9:30 - 9:40] Break + Breakout Rooms

[9:40 - 11:00] Papers Session - News, Rumors, and Misinformation

  • Know It to Defeat It: Exploring Health Rumor Characteristics and Debunking Efforts on Chinese Social Media during COVID-19 Crisis

Wenjie Yang (HKUST), Sitong Wang (Columbia University), Zhenhui Peng, Chuhan Shi, Xiaojuan Ma (HKUST), and Diyi Yang (Georgia Institute of Technology)

  • FaCov: COVID-19 Viral News and Rumors Fact-Check Articles Dataset

Shakshi Sharma (University of Tartu), Ekanshi Agrawal (BITS Pilani), Rajesh Sharma (University of Tartu), and Anwitaman Datta (Nanyang Technological University)

  • The impact of online misinformation on the COVID-19 vaccination campaign in the United States

Francesco Pierri (Politecnico di Milano), Brea Perry (Indiana University Bloomington), Matthew DeVerna (Observatory on Social Media / Indiana University), Kai-Cheng Yang (Indiana University), Alessandro Flammini, Filippo Menczer (Indiana University Bloomington), and John Bryden (Indiana University)

  • Local News Online and COVID in the U.S.: Relationships among Coverage, Cases, Deaths, and Audience

Benjamin Horne (University of Tennessee, Knoxville), Kenneth Joseph (University at Buffalo), Jon Green, and John Wihbey (Northeastern University)

[11:00 - 11:10] Break + Breakout Rooms

[11:10 - 11:45] Papers Session - Stance and Bias

  • VaxxStance: A Dataset for Cross-Lingual Stance Detection on Vaccines

Rodrigo Agerri (University of the Basque Country), Roberto Centeno, Maria Espinosa (UNED), Joseba Fernandez de Landa (University of the Basque Country), and Alvaro Rodrigo (UNED)

  • Suffering from Vaccines or from Government? : Partisan Bias in COVID-19 Vaccine Adverse Events Coverage

Taeyoung Kang (KAIST) and Hanbin Lee (Seoul National University)

[11:45 - 11:50] Break + Breakout Rooms

[11:50 - 12:25] Papers Session - Misc. Effects

  • "This Candle Has No Smell": Detecting the Effect of COVID Anosmia on Amazon Reviews Using Bayesian Vector Autoregression

Nick Beauchamp (Northeastern University)

  • Linguistic Analysis of User Disclosures about Smoking Addiction during COVID-19 via Reddit

Nay Zaw Aung Win and Lydia Manikonda (Rensselaer Polytechnic Institute)

[12:25 - 12:30] Closing Remarks

Important Dates

Data challenge announcement: February 15, 2022

Paper submission deadline: April 8, 2022 (AoE), extended from April 1, 2022

Paper notifications: May 1, 2022

ICWSM 2022 Data Challenge Workshop day: June 6, 2022 (Virtual)


We invite papers that model, analyze, and contribute towards understanding and mitigating challenges in facilitating online discourse during the COVID-19 pandemic. Topics of interest include, but are not limited to:

  • Detecting health-related abuse and misinformation on the Web (e.g., detection of health-related misinformation or targeted hate speech during the COVID-19 pandemic).

  • Understanding and quantifying the spread of health-related abusive/misinformative content on the Web.

  • Exploring patterns (and differences) in online discourse and narratives related to the COVID-19 pandemic across different communities, demographics, and countries.

  • Identifying demographic groups that are the targets of health-related abusive/misinformative behavior (e.g., how often are people from specific countries or backgrounds targeted online with hate speech because of COVID-19 variants?).

  • Approaches to promote desirable behavior like prosociality, norm compliance, and social support to counter health-related abusive/misinformative discourse.

  • Evaluating the effectiveness of commonly used (positive) interventions, in addition to developing new interventions, in mitigating the spread of health-related abusive/misinformative discourse on the Web.


We invite participants to work on the following open datasets. The datasets were selected because they are relevant to this year’s data challenge theme, relatively new, and broad, thus enabling participants to answer multiple interesting questions. You are welcome to use other open datasets, provided that they are of high quality, publicly available, and relevant to this year’s data challenge theme.

  • "COVID-19 Coverage By Cable and Broadcast Networks": Paper, Dataset

  • "Fighting the COVID-19 Infodemic in Social Media: A Holistic Perspective and a Call to Arms": Paper, Dataset

  • "CoVaxxy: A Collection of English-Language Twitter Posts About COVID-19 Vaccines": Paper, Dataset

  • "Tracking Social Media Discourse about the COVID-19 pandemic: Development of a Public Coronavirus Twitter Data Set": Paper, Dataset

  • COVID-19 Pandemic Wikipedia Readership: Dataset

  • "The Pushshift Reddit Dataset": Paper, Dataset

  • "Media Cloud: Massive Open Source Collection of Global News on the Open Web": Paper, Dataset

  • Data about COVID-19 related articles across Wikipedia projects: Dataset

Submission Instructions

We invite short papers (4 pages including references) as well as poster papers (2 pages including references). Submissions should follow the formatting guidelines for ICWSM-2022. Submissions should be anonymous and conform to AAAI standards for double-blind review.

Submissions are non-archival: we welcome already published work as well as work in progress (in publication or under submission elsewhere, or ongoing research). Authors of selected top papers will be given the option to have their paper included in the ICWSM Workshop proceedings.

Submission site: https://easychair.org/conferences/?conf=icwsmdatachallenge22

Data Challenge Chairs

Eshwar Chandrasekharan, Assistant Professor, University of Illinois at Urbana-Champaign

Miriam Redi, Research Manager, Wikimedia Foundation

Savvas Zannettou, Assistant Professor, Delft University of Technology