To obtain the datasets, please send a scanned copy of the duly filled-in "organizational-access form" found here to the email address irmidis.fire2022@gmail.com. Please mention "IRMiDis FIRE 2022" somewhere in the form. In the email, please mention the name of the team, and the name, affiliation and email address of each participant.
Task 1: COVID-19 vaccine stance classification from tweets
Training data: We will provide around 4,390 tweets as training data from two sources:
Cotfas dataset: Cotfas et al. provided a dataset containing stances of tweets towards COVID-19 vaccines, crawled between November and December 2020. From this dataset, we will provide 2,792 crawled tweet-texts along with the tweet-IDs and the labels (pro-vax, neutral, anti-vax). This dataset will form the first part of the training dataset. The original dataset can be found at https://github.com/liviucotfas/covid-19-vaccination-stance-detection.
Our dataset: We (the organizers) crawled tweets between March and December 2020 using various vaccine-related keywords. Each tweet was annotated with one of the three labels (pro-vax, neutral, anti-vax) by three crowdworkers. For 1,600 tweets, there was at least majority agreement among the crowdworkers. We provide these 1,600 tweets (tweet IDs, tweet texts, classes) as the second part of the training set.
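The majority-agreement filter described above can be illustrated with a minimal sketch. The function name majority_label, the tweet IDs and the example annotations are hypothetical and serve only as illustration; this is not the organizers' actual annotation pipeline.

```python
from collections import Counter

def majority_label(annotations):
    """Return the label chosen by more than half of the annotators, else None."""
    label, votes = Counter(annotations).most_common(1)[0]
    return label if votes * 2 > len(annotations) else None

# Hypothetical annotations: three crowdworker labels per tweet ID.
annotated = {
    "t1": ["pro-vax", "pro-vax", "neutral"],      # 2 of 3 agree -> kept
    "t2": ["pro-vax", "neutral", "anti-vax"],     # no majority  -> dropped
    "t3": ["anti-vax", "anti-vax", "anti-vax"],   # unanimous    -> kept
}

# Keep only the tweets for which at least two of the three workers agree.
kept = {}
for tweet_id, labels in annotated.items():
    label = majority_label(labels)
    if label is not None:
        kept[tweet_id] = label
```

With three annotators and three labels, a tweet is dropped only when all three workers choose different labels.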
Test data: We shall release a test set of 500 tweets, providing only the tweet IDs and the tweet-texts for them.
Task 2: Detection of COVID-19 symptom-reporting in tweets
We crawled English tweets from February 2020 to June 2021 using keywords related to COVID-19 symptoms (e.g., ‘fever’, ‘cough’). We took a random sample from our collected set of tweets and had about 2K tweets annotated into the four classes by human workers. We shall split this annotated set of tweets in the ratio 80%-20% and release the two parts as the train and test sets, respectively.
Note that, for this task, every tweet in both the train and test datasets mentions some common COVID-19 symptoms; hence, simply checking for the presence of such symptom words will not be sufficient.
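As an illustration of the 80%-20% split mentioned for Task 2, here is a minimal sketch. It assumes a plain random split; the text does not say whether the organizers' split is random or stratified by class. The function name split_train_test, the seed and the placeholder tweet list are hypothetical.

```python
import random

def split_train_test(items, test_fraction=0.2, seed=0):
    """Shuffle a copy of items and split it into (train, test) lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # fixed seed, so the split is reproducible
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

# Placeholder stand-ins for about 2K annotated tweets (assumed size: 2000).
tweets = [f"tweet-{i}" for i in range(2000)]
train, test = split_train_test(tweets)
```

Participants who want a held-out validation set could apply the same function to the released training data.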