Topic modeling was performed to discover the abstract topics underlying the tweet data. Latent Dirichlet Allocation (LDA) was employed to identify key themes in the dataset. Given computational constraints and the recognized limitations of perplexity as the sole criterion for choosing the number of topics, we selected 20 topics and labeled them manually for further exploration. The 20 topics included, for example: (1) COVID-19 healthcare responses; (2) personal experiences with the virus; (3) criticism of government policies; and (4) vaccination efforts.
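As a rough illustration of the LDA step, the sketch below fits a toy model with a collapsed Gibbs sampler written in plain Python. The study does not state which library or inference method was used, so everything here is an assumption made for illustration: the function names `fit_lda_gibbs` and `top_words`, the hyperparameters `alpha` and `beta`, the iteration count, and the four sample documents are all hypothetical. The example uses 2 topics only to stay small, whereas the analysis used 20. In practice a library implementation would normally be preferred.

```python
import random


def fit_lda_gibbs(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Toy collapsed Gibbs sampler for LDA (illustrative, not the study's code).

    docs: list of tokenized documents (lists of strings).
    Returns (theta, phi, vocab): per-document topic proportions,
    per-topic word probabilities, and the sorted vocabulary.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    n_words = len(vocab)
    topics = list(range(n_topics))

    doc_topic = [[0] * n_topics for _ in docs]             # tokens per (doc, topic)
    topic_word = [[0] * n_words for _ in range(n_topics)]  # tokens per (topic, word)
    topic_total = [0] * n_topics                           # tokens per topic
    assignments = []

    # Give every token a random initial topic.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            k = rng.randrange(n_topics)
            z_doc.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(z_doc)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                k = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                # Unnormalized conditional probability of each topic.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta)
                    / (topic_total[t] + n_words * beta)
                    for t in topics
                ]
                k = rng.choices(topics, weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1

    theta = [
        [(c + alpha) / (len(doc) + n_topics * alpha) for c in row]
        for row, doc in zip(doc_topic, docs)
    ]
    phi = [
        [(c + beta) / (topic_total[t] + n_words * beta) for c in topic_word[t]]
        for t in topics
    ]
    return theta, phi, vocab


def top_words(phi, vocab, n=3):
    """Return the n most probable words per topic, used for manual labeling."""
    return [
        [vocab[i] for i in sorted(range(len(vocab)), key=lambda i: -row[i])[:n]]
        for row in phi
    ]


# Hypothetical, already-tokenized sample tweets.
docs = [
    "vaccine dose appointment vaccine hospital".split(),
    "vaccine hospital dose nurse".split(),
    "lockdown government decree protest".split(),
    "government decree lockdown criticism government".split(),
]
theta, phi, vocab = fit_lda_gibbs(docs, n_topics=2)
# Dominant topic per tweet: the index with the largest proportion.
dominant = [max(range(len(row)), key=row.__getitem__) for row in theta]
labels_hint = top_words(phi, vocab)
```

The top words per topic in `labels_hint` are the kind of output a person would read when assigning manual labels, and `dominant` gives one topic per tweet, which is a common starting point for the regional and monthly comparisons.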
A geographical analysis revealed a common pattern across all regions (North, South, Center, and Islands): discussions surrounding vaccinations and treatments peaked as vaccines became widely available.
In the Islands region, topics such as "COVID-19 and public events/gatherings" (light blue) and "COVID-19 outbreaks in specific locations" (orange) account for a larger share of tweets than in the other regions, as shown in the bar graph in Figure 6. Discourse on these topics was therefore notably more pronounced there. This likely reflects the islands' reliance on tourism and their tighter community networks, which may amplify concerns about superspreader events.
The discourse in the Center region, particularly in urban areas like Rome and Florence, showed a higher volume of tweets on the topic "COVID-19 social and economic impacts" than in other regions. This indicates a distinct emphasis on the socioeconomic consequences of the pandemic, as evidenced by the topic distribution in Figure 7. The graph also shows that socioeconomic discussions in the Center often coincided with spikes in "general updates," such as during national lockdown announcements. Additionally, tweets on topics like "testing/positive cases" and "skepticism/conspiracy theories" remained consistently high, suggesting that Twitter users in these urban areas were actively engaged in discussions aimed at explaining both the consequences of the pandemic and the theories surrounding it.
The bar graph reveals several notable patterns, including a major spike in general COVID-19 updates around December 2020, reaching nearly 30%, and the growing prominence of skepticism and conspiracy theories over time, peaking at approximately 25% by June 2021. Vaccine and treatment discussions became more prevalent in late 2020 and early 2021, during the vaccination campaigns. All topics show monthly fluctuations that reflect shifting public interests. This region has the highest population in the dataset, which may help explain the diverse range of COVID-19 topics discussed and their varying intensity throughout the pandemic period.
The dominant topic throughout the period is "COVID-19 statistics and data," which consistently holds the highest percentage of tweets each month. Other notable topics, such as "COVID-19 skepticism and conspiracy theories" and "COVID-19 outbreaks and case numbers," fluctuate in prominence but generally rank below statistics and data. Topics like "COVID-19 vaccines and treatments" and "COVID-19 restrictions and quarantine rules" show varying degrees of activity, with spikes in specific months that likely reflect lockdown policies and vaccination campaigns, and possibly changing public concerns or pandemic developments. Overall, the focus shifts slightly among secondary topics, but the primary emphasis remains on data and statistics.
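The monthly percentages discussed above can be derived from per-tweet topic assignments by counting tweets per topic within each region and month and dividing by that group's total. The following is a minimal sketch under stated assumptions: each tweet is assumed to carry a single dominant topic label, and the record layout, the function names `topic_shares` and `dominant_topic`, and the sample records are hypothetical rather than taken from the study's data.

```python
from collections import Counter, defaultdict


def topic_shares(records):
    """Percentage of tweets per topic within each (region, month) group.

    records: iterable of (region, month, topic_label) tuples, one per tweet.
    Returns {(region, month): {topic_label: percent}}.
    """
    counts = defaultdict(Counter)
    for region, month, topic in records:
        counts[(region, month)][topic] += 1

    shares = {}
    for key, counter in counts.items():
        total = sum(counter.values())
        shares[key] = {topic: 100.0 * n / total for topic, n in counter.items()}
    return shares


def dominant_topic(shares):
    """Topic with the largest share in each (region, month) group."""
    return {key: max(dist, key=dist.get) for key, dist in shares.items()}


# Hypothetical sample records, for illustration only.
records = [
    ("North", "2020-12", "general updates"),
    ("North", "2020-12", "general updates"),
    ("North", "2020-12", "vaccines and treatments"),
    ("North", "2020-12", "statistics and data"),
    ("Islands", "2020-12", "public events/gatherings"),
    ("Islands", "2020-12", "public events/gatherings"),
    ("Islands", "2020-12", "statistics and data"),
]
shares = topic_shares(records)
leaders = dominant_topic(shares)
```

Normalizing within each region and month keeps regions with very different tweet volumes comparable, which matters when one region contributes far more tweets than the others.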