The WHO dataset is a thorough collection of health data from different countries and income levels. This dataset contains important data on healthcare access, levels of medical staff, rates of obesity and underweight, vaccination rates, and factors related to environmental health. The dataset is categorized based on income groups outlined by the World Bank, including high-income, upper-middle-income, lower-middle-income, and low-income groups. It also includes demographic information and various health indicators crucial for studying worldwide health disparities.
The dataset contains a wide range of health indicators and demographic variables. First, it reports healthcare staffing levels as the ratio of medical staff to population. This information is crucial for understanding how healthcare workers are distributed across income brackets and for determining whether areas have enough staff to meet their needs. Second, it contains obesity and underweight rates, categorized by income bracket. This data illustrates variations in nutrition and lifestyle across populations, reflecting larger socio-economic patterns that affect health.
The dataset also provides detailed vaccination rates for different illnesses, segmented by income bracket. This information helps in understanding how accessible preventive healthcare services are across income brackets and in evaluating the success of public health initiatives. Environmental health factors such as water sanitation levels and pollution are included as well, though they are not always broken down by income group. Understanding the broader environmental impacts on health, and how they vary across socio-economic levels, depends heavily on these factors.
The initial exploratory data analysis (EDA) of this dataset offers important insights into differences in healthcare access by income level. It shows that wealthier areas have more medical personnel overall, suggesting an unequal distribution of resources that benefits affluent areas. The analysis also shows a positive correlation between income and obesity rates. Obesity rates have increased at all income levels, but the rate of increase has slowed in affluent areas. This slowdown might indicate that individuals with higher incomes have better access to healthcare and dietary resources for controlling obesity.
Reduced rates of underweight have been observed in all income brackets over time, showing general enhancements in nutritional health. The information indicates that wealthier groups tend to have higher vaccination rates for certain diseases compared to lower-income groups, highlighting inequalities in access to preventive healthcare services.
Although the WHO dataset has many advantages, it also has several limitations that need to be recognized. Condensing the data into broad income groups can lead to a significant loss of context. This grouping overlooks key details such as differences in healthcare systems (e.g. universal healthcare vs. insurance-based systems) and socio-political influences on healthcare access and quality. Nations have diverse healthcare systems and policies that greatly influence health outcomes, and these differences are not reflected in the broad income classifications used.
Furthermore, the World Bank's income classifications do not account for income inequality within countries. For example, labeling the United States as a high-income nation does not capture the large income gaps, and the resulting challenges in healthcare access, that exist within the country. This lack of detail limits the dataset's usefulness for showing disparities in healthcare accurately and for designing targeted interventions for low-income populations and women.
Another important constraint is the insufficient information on healthcare facilities. The dataset does not have specific details about the ratio of hospitals per 10,000 individuals and the sufficiency of healthcare resources among different income brackets. This data is essential for evaluating healthcare facilities and capabilities, especially in disadvantaged regions with limited resources. It is difficult to assess the quality of healthcare services and pinpoint areas needing improvement without this information.
Moreover, the dataset lacks adequate information on illnesses resulting from environmental factors like water sanitation and pollution, categorized by income group. Environmental factors have a notable impact on health results, particularly for individuals with low socioeconomic status who tend to reside in regions with inadequate sanitation and elevated pollution levels. The absence of information on these variables obstructs the ability to pinpoint environmental health inequities and create tailored interventions.
Finally, the dataset does not include gender-specific information, which is essential for understanding the healthcare inequities experienced by women. Women, especially those from low socioeconomic backgrounds, frequently encounter distinct healthcare obstacles such as reproductive health concerns, maternal mortality, and gender-based violence. It is difficult to identify and address these specific needs without data broken down by gender.
The second part of our analysis included analyzing the rates of Covid vaccinations across different socio-economic groups, which we defined through the median incomes associated with different Zip Codes. For this analysis, we used publicly accessible data to analyze the relationship between vaccinations and socio-economic status in California. The datasets used for analyzing Covid vaccinations were sourced from government websites; the income data for Zip Codes was sourced from the official Census website, and the vaccination data was sourced from the California Department of Public Health.
The Covid Vaccination dataset from the California Department of Public Health includes information based on California Zip Codes about Covid Vaccination rates. The data includes the percentage of people who have been vaccinated at least one time and those who have been fully vaccinated in each California Zip Code.
The Zip Code information sourced from the U.S. Census includes the median and mean incomes for each California Zip Code, as well as more detailed statistics such as median incomes for family and non-family units.
The primary finding from the dataset was that the median income of a Zip Code is correlated with the percentage of its residents who have been vaccinated at least once. The correlation coefficient between these two quantitative variables was calculated as 0.478, which indicates a moderate positive correlation.
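As an illustration of how a coefficient of this kind can be computed, the following is a minimal sketch in Python, not the analysis code actually used. It assumes the two sources have already been reduced to mappings from Zip Code to a single value, and that the coefficient is the Pearson correlation; the function names, variable names, and sample values are all hypothetical placeholders rather than figures from the Census or California Department of Public Health files.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations (covariance numerator).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def correlate_by_zip(income_by_zip, vaccinated_by_zip):
    """Join two {zip: value} mappings on Zip Code and correlate the pairs.

    Zip Codes present in only one mapping are dropped (an inner join).
    """
    shared = sorted(set(income_by_zip) & set(vaccinated_by_zip))
    incomes = [income_by_zip[z] for z in shared]
    rates = [vaccinated_by_zip[z] for z in shared]
    return pearson_r(incomes, rates), len(shared)


# Made-up placeholder values for illustration only (hypothetical Zip Codes).
income_by_zip = {"00001": 40000, "00002": 55000, "00003": 70000, "00004": 90000}
vaccinated_by_zip = {"00001": 60.0, "00002": 72.0, "00003": 68.0, "00005": 80.0}

r, n = correlate_by_zip(income_by_zip, vaccinated_by_zip)
print(f"Pearson r over {n} shared Zip Codes: {r:.3f}")
```

The inner join matters here: a Zip Code that appears in only one source contributes nothing to the coefficient, so the number of matched Zip Codes is worth reporting alongside the result.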
Overall, these two data sources combined provide a useful way to visualize the relationship between socio-economic status and vaccination accessibility. The primary limitation of this data is that we have to define socio-economic status by the median income of a Zip Code, which does not account for the possibility that some Zip Codes are economically diverse, containing both high- and low-earning groups. The metric we used to study people's access to healthcare also has limitations. The percentage of people vaccinated for Covid in each area may be influenced by factors other than access to healthcare, such as political affiliation, personal preference, or belief systems.