Although the data we obtained and presented in visualizations 1 and 2 were central to our findings and questions, several factors could have influenced the data when the datasets were assembled. One is the absence of context on social and economic factors and on healthcare access, which could have shaped how people approached vaccination (public perception). Another is the way the data was collected. Because this dataset covers the entire state of California, and each county records its data differently, the recording methods likely varied, which could have affected the accuracy of the collected data. In addition, the categories for each group appear so generalized that they oversimplify the data, seemingly merging the subgroups within each ethnicity.
The dataset used was taken from the study by Na, Ling et al., “Racial and ethnic disparities in COVID-19 vaccine uptake: A mediation framework.” The bar chart of first-dose vaccination rates for each race is based on this study. The data was collected through an online survey across all waves of the pandemic, with around 6000 respondents surveyed every two weeks (until wave 24). The chart shows the proportion of people of each race who had received at least one dose of a vaccine by April 2021. A potential source of bias is that the number of White participants was significantly larger than that of any other race. Furthermore, it is interesting that while there are large disparities in first-dose vaccination early on, they even out between races over time. The numbers shown on the bar chart are consistent with first-dose vaccination levels from April 2021; by November of that year, however, first-dose vaccine uptake had evened out to around 78%-81% among White, Black, and Hispanic Americans.
These datasets include the number of vaccinations administered and the number of deaths attributed to COVID-19, along with the demographic information of each person who died or was vaccinated.
These two visualizations expose the disproportionate effectiveness of vaccines across ethnicities by showing the interethnic disparities in COVID-19 death rates and vaccination rates. Data for this research was sourced from the National Center for Health Statistics (NCHS) and covers the time frame from January 2021 to May 2023. This period was selected because its start marks the point when vaccinations began to be administered nationwide. To compare mortality and vaccination rates effectively, and to see the effect of vaccination rates on mortality rates, the same time frame was used for both visualizations.
To create these visualizations, two separate datasets created by NCHS were loaded into Jupyter Notebook (a programming platform) as CSV files. One dataset included demographic information about vaccinations in the United States, and the other included demographic information about COVID-19 mortalities in the United States. Because we focused on ethnicity in relation to COVID-19 in the United States, we deleted all entries missing the ethnicity of the person who died or was vaccinated. Using the programming language Python, we then created a visualization with respect to time for each dataset.
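The cleaning steps described above can be sketched roughly as follows. This is a minimal illustration and not the notebook code used for the project: the column names (`month`, `ethnicity`, `count`), the inline sample rows, and the choice to treat the label "Unknown" as a missing ethnicity are all assumptions made for the example, and it uses only the Python standard library rather than whatever tools the actual analysis relied on.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample standing in for one NCHS CSV export. The column
# names and the values are invented for illustration only.
SAMPLE_CSV = """month,ethnicity,count
2020-12,White,5
2021-01,White,120
2021-01,,40
2021-01,Hispanic,80
2021-02,White,150
2021-02,Unknown,30
2021-02,Hispanic,95
2023-06,White,10
"""

START = (2021, 1)  # January 2021, start of the study window
END = (2023, 5)    # May 2023, end of the study window
MISSING = {"", "unknown"}  # assumed markers for a missing ethnicity


def monthly_counts_by_ethnicity(csv_text):
    """Return {ethnicity: {(year, month): total}} for rows inside the window."""
    totals = defaultdict(lambda: defaultdict(int))
    for row in csv.DictReader(io.StringIO(csv_text)):
        ethnicity = row["ethnicity"].strip()
        # Drop entries that do not report an ethnicity.
        if ethnicity.lower() in MISSING:
            continue
        year, month = (int(part) for part in row["month"].split("-"))
        # Keep only rows inside the shared time frame.
        if not (START <= (year, month) <= END):
            continue
        totals[ethnicity][(year, month)] += int(row["count"])
    # Sort each series by date so it can be plotted against time.
    return {eth: dict(sorted(series.items())) for eth, series in totals.items()}


series = monthly_counts_by_ethnicity(SAMPLE_CSV)
for ethnicity, points in series.items():
    print(ethnicity, points)
```

Each resulting per-ethnicity series is ordered by month, so it could be passed to a plotting library to draw one line per group. Applying the same function to both the vaccination file and the mortality file would keep the two visualizations on the same time frame.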
While the dataset provides valuable insights, it also has several limitations. First, it only includes data from individuals who reported their ethnicity, which may lead to an underrepresentation of certain groups. Additionally, the dataset does not account for other factors that could influence health outcomes, such as pre-existing health conditions, socioeconomic status, or geographic location. These limitations may obscure the full extent of health disparities and their underlying causes. The forces that lead individuals not to disclose their ethnicity may also correlate with discriminatory forces against those individuals in healthcare, so a more robust study could focus on individuals who have not disclosed their ethnicity and look for markers correlated with the discrimination experienced by other groups.
Moreover, the data does not capture the nuances of vaccine distribution and access. For instance, it does not include information on the availability of vaccines in different regions, the effectiveness of public health campaigns targeting minority communities, or the logistical challenges faced by these communities in accessing vaccination sites. These factors are crucial for understanding the variability in vaccination rates and the persistent health disparities. Without such data, we cannot make strong claims about the factors behind disparate COVID-19 vaccination and death rates.
Our project aims to reveal these forms of discrimination, so it would run against that goal not to clarify the discrepancies in the data themselves across different races. In particular, most of our later analyses refer to only four racial groups: White, Asian, African-American, and Hispanic. These categories notably leave out populations native to the American continent, as data on Native Americans and Pacific Islanders are underreported (Supica et al. 1). The data that do exist at times run counter to testimonial evidence and on-the-ground experiences; for example, Pacific Islanders appear to have an extremely low number of deaths throughout the pandemic. For the robustness of the analysis, we therefore decided to focus on the four racial categories mentioned above to highlight discrepancies. This limitation in our dataset may itself be viewed through a structuralist perspective: the same forces that have led to Native Americans’ and Pacific Islanders’ land being removed from the map, including negligence and profit-motivation, have led to these same groups not being counted in our data, and effectively not counting at all in the eyes of society. Future studies may therefore look into how these populations’ unique experiences with COVID-19, and with healthcare as a whole, can be better represented through data, or into the reliability of existing COVID-19 data.