Births per Woman vs. Child Mortality
Joseph Wheeler, Will Hauser, Courtney Cruickshank
PSYC 500 final project
Introduction
Our data relates the average number of children per woman of child-bearing age to child mortality rates. It covers more than 200 countries around the world from 1950 to 2020.
We can see trends in the graphs from the website, but we’ll need to do some in-depth analyses to see what’s really going on.
Child mortality and birth rates are two of the most important factors when exploring global health trends. An understanding of births and deaths is also critical for a reasonably accurate world population count. What we seek to find is whether a correlation exists between child mortality rates and births per woman. We think that exploring the nature of this correlation could reveal whether we, as a global society, have improved in our ability to reduce childhood deaths and whether birth rates among women could be a predictive pattern of this trend. This is not to say we expect a causative relationship, but that the two variables are strongly correlated, and the strength of that correlation may change over time.
Previous studies on this topic:
The United Nations analyzed birth rates globally and found that the continents of Africa and South America have historically higher birth rates than the United States. They also found that global birth rates are declining overall (world-fertility-patterns-2015.pdf (un.org)).
WHO has found that child mortality rates have been decreasing over time, but that the absolute number of preventable child deaths remains relatively high (Child mortality (who.int)).
Data Curation and Ethics
Data Curation:
The data used in this project was curated from the organization Our World in Data. This organization provides empirical evidence related to specific topics in global development. On their website, they explain that the reason they conduct this work is to show global transformation via trends over time. The media's tendency to cover specific events is valuable, but neglects the progress the world has made over time.
Data range:
Our variables of interest begin in 1950 and extend until 2020.
We've included 2020 in our range of years for consideration, though we note that the data from this year may not have been complete at the time of collection.
Ethical Implications:
Public good outweighs individual privacy.
This data was freely available online. As stated on the website: "Our World in Data is created as a public good. All data is available for download."
Regarding collection practices, it is necessary to ensure that subjects don't feel any coercion to share info they aren't comfortable with sharing.
Research Questions:
Firstly, does a linear relationship exist between child mortality rates and number of births per woman? (Establishing correlational relationship, legitimizing hypothesis).
If so, how does this relationship change over time, and what trend, if any, is present? (Analyses for hypothesis test).
Are there any countries that are outliers from this trend, and what real world factors might impact this?
If we contrast two countries with different access to medical care throughout the years, can we elucidate how real world factors impact this relationship? (Post-hoc analyses, beyond scope of hypothesis).
Hypotheses:
Research hypothesis: The relationship between child mortality and births per woman will strengthen over time, as indicated by increasing R-squared values (the goodness of fit of the linear regression). Ha: [ R^2f > R^2i ]
Null hypothesis: The relationship between child mortality and births per woman will weaken or remain unchanged over time, as indicated by decreasing or unchanged R-squared values (the goodness of fit of the linear regression). Ho: [ R^2f <= R^2i ]
Data Preparation and Exploration
Variables of Interest
Country: The geographical region of interest defined by a sovereign-identifying government or legislative body
Year: The calendar year that the data was recorded (We focused on every decade starting with 1950)
Total Population: The number of people living in each particular country at the time
Child Mortality: The number of deaths of children from age 0-5 during the particular year per 1000 children
Births per Woman: Follows the total fertility rate for the given year (the average number of children a woman would have assuming that current age-specific birth rates remain constant throughout her childbearing years)
The data frame was reshaped so that it only included the data by decade starting with 1950. After subsetting, a combined data frame was created that showed only the births per woman, mortality rate, and year (see code below)
Shape: For the subsetted data frame that shows data by decade, there were 2225 data points measured (around 8 data points per country)
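The subsetting step described above can be sketched as follows. This is a minimal illustration using pandas, not the project's original code: the column names (country, year, births_per_woman, child_mortality) and the tiny data frame are assumptions made for the example.

```python
import pandas as pd

# Hypothetical stand-in for the Our World in Data export.
# Column names and values are assumptions for illustration only.
raw = pd.DataFrame({
    "country": ["A", "A", "A", "B", "B", "B"],
    "year": [1950, 1955, 1960, 1950, 1955, 1960],
    "births_per_woman": [6.1, 5.9, 5.5, 3.0, 2.9, None],
    "child_mortality": [250.0, 230.0, 200.0, 40.0, 35.0, 30.0],
})

def subset_by_decade(df, start=1950, end=2020):
    """Keep complete rows whose year is the top of a decade within [start, end]."""
    in_range = df["year"].between(start, end)
    on_decade = df["year"] % 10 == 0
    cols = ["country", "year", "births_per_woman", "child_mortality"]
    out = df.loc[in_range & on_decade, cols]
    # Drop rows that are missing either variable of interest.
    out = out.dropna(subset=["births_per_woman", "child_mortality"])
    return out.reset_index(drop=True)

decades = subset_by_decade(raw)
print(decades.shape)
```

Filtering on the year modulo 10 keeps one row per country per decade, which matches the "around 8 data points per country" shape reported above for 1950 through 2020.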
Model Building and Validation
Analyses of Normality of Probability Distribution
We found that neither of our continuous variables (births per woman and child mortality) was normally distributed, as shown in the plots below. This is probably due to differences in norms and culture across individual countries. The data is expected to be skewed by extreme values from certain countries where having many births may be more common.
Here is the code we used to assess normality:
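A minimal sketch of such a check is given here as an illustration, not as the project's original code. It assumes SciPy's Shapiro-Wilk test and sample skewness as the normality measures, and it uses synthetic right-skewed values in place of the real child mortality column.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Synthetic right-skewed values standing in for child mortality (not project data).
sample = rng.lognormal(mean=4.0, sigma=0.8, size=500)

def normality_summary(values):
    """Return skewness plus the Shapiro-Wilk statistic and p-value for a sample."""
    values = np.asarray(values, dtype=float)
    w_stat, p_value = stats.shapiro(values)
    return {
        "skewness": float(stats.skew(values)),
        "shapiro_w": float(w_stat),
        "p_value": float(p_value),
    }

summary = normality_summary(sample)
print(summary)
```

A small p-value indicates a departure from normality, and a positive skewness indicates a long right tail of the kind expected when a few countries have extreme values.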
Bootstrap hypothesis test of global mean child mortality by year
In this bootstrap analysis, we separated the data by year and sampled global child mortality at the top of each decade. Following the usual bootstrap method, the data was resampled with replacement and 10,000 samples were taken per year of interest. The resulting distributions were graphed, with each year color coded.
The result shows that child mortality rates not only decreased as time moved forward but also became more concentrated in their spread. This indicates less variability in the number of childhood deaths per year, which suggested that our hypothesis about condensing correlations was on the right track.
Bootstrap hypothesis test of global mean births per woman per year
The same bootstrapping technique was applied to births per woman across the same years of interest, and a similar pattern emerged. While the trend was not as strong as for child mortality, the plots showed that birth rates were declining overall and that the distribution of births per woman was condensing in a similar fashion to child mortality rates.
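The bootstrap of the mean used in both analyses can be sketched as below. This is an illustrative version under stated assumptions: the helper name bootstrap_means is hypothetical, NumPy is assumed, and the per-year values are synthetic gamma draws rather than the project's data.

```python
import numpy as np

def bootstrap_means(values, n_resamples=10_000, seed=0):
    """Resample with replacement and return the mean of each resample."""
    rng = np.random.default_rng(seed)
    values = np.asarray(values, dtype=float)
    # Each row holds the indices of one resample of the same size as the data.
    idx = rng.integers(0, len(values), size=(n_resamples, len(values)))
    return values[idx].mean(axis=1)

# Synthetic per-country values for two years (not project data).
rng = np.random.default_rng(1)
by_year = {
    1950: rng.gamma(shape=4.0, scale=60.0, size=150),
    2020: rng.gamma(shape=2.0, scale=15.0, size=150),
}

for year, values in by_year.items():
    means = bootstrap_means(values)
    low, high = np.percentile(means, [2.5, 97.5])
    print(year, round(float(means.mean()), 1), (round(float(low), 1), round(float(high), 1)))
```

Comparing the center and the width of the 2.5 to 97.5 percentile interval across years gives a numeric counterpart to the visual observation that the distributions shift lower and narrow over time.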
Linear Regression Analyses
We conducted linear regression analyses with our continuous variables (births per woman and child mortality) for the top of each decade included in our data (1950, 1960, 1970, etc.). See plots below.
It appears that the relationship between births per woman and child mortality is strengthening over time. To examine this empirically, we plot the R^2 value for each regression in the chart below.
Indeed, the positive slope of the R^2 values over time indicates that the relationship between child mortality and births per woman is strengthening over time.
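The per-decade regressions and the trend in R^2 can be sketched as follows. This is a minimal example assuming scipy.stats.linregress; the helper name r_squared_by_year is hypothetical, and the data is synthetic, built so that noise shrinks each decade.

```python
import numpy as np
from scipy import stats

def r_squared_by_year(years, births, mortality):
    """Fit mortality ~ births separately for each year and return {year: R^2}."""
    years = np.asarray(years)
    births = np.asarray(births, dtype=float)
    mortality = np.asarray(mortality, dtype=float)
    out = {}
    for year in np.unique(years):
        mask = years == year
        fit = stats.linregress(births[mask], mortality[mask])
        out[int(year)] = float(fit.rvalue ** 2)
    return out

# Synthetic data (not project data): noise shrinks each decade, so R^2 should rise.
rng = np.random.default_rng(0)
decade_years = np.arange(1950, 2021, 10)
noise_levels = np.linspace(120.0, 20.0, len(decade_years))
years, births, mortality = [], [], []
for year, noise in zip(decade_years, noise_levels):
    x = rng.uniform(1.0, 8.0, size=100)
    y = 30.0 * x + rng.normal(0.0, noise, size=100)
    years.extend([year] * 100)
    births.extend(x)
    mortality.extend(y)

r2 = r_squared_by_year(years, births, mortality)
# Regress R^2 on year; a positive slope means the fit tightens over time.
trend = stats.linregress(list(r2.keys()), list(r2.values()))
print(r2)
print("slope of R^2 per year:", trend.slope)
```

For a simple linear regression with one predictor, R^2 equals the squared Pearson correlation, which is why the sketch squares rvalue rather than computing residual sums directly.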
Permutation hypothesis test of difference of means between United States and Africa in births per woman
We see that there is a significant difference between the means of births per woman in the United States compared to Africa, indicating that these are distinct populations with regard to births per woman.
Permutation hypothesis test of difference of means between United States and Africa in child mortality
There is also a significant difference between the means of child mortality in the United States compared to Africa, indicating that these are distinct populations with regard to child mortality.
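A permutation test of the difference of means, as used in both comparisons, can be sketched as below. The function name, the two-sided p-value convention, and the synthetic group values are assumptions for illustration; they are not the project's data or code.

```python
import numpy as np

def permutation_test_mean_diff(a, b, n_permutations=10_000, seed=0):
    """Two-sided permutation test for the difference of means between a and b."""
    rng = np.random.default_rng(seed)
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    observed = a.mean() - b.mean()
    pooled = np.concatenate([a, b])
    extreme = 0
    for _ in range(n_permutations):
        # Shuffle the group labels by permuting the pooled values.
        shuffled = rng.permutation(pooled)
        diff = shuffled[:len(a)].mean() - shuffled[len(a):].mean()
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    p_value = (extreme + 1) / (n_permutations + 1)
    return float(observed), p_value

# Synthetic births-per-woman values for two groups (not project data).
rng = np.random.default_rng(2)
group_a = rng.normal(2.0, 0.3, size=8)
group_b = rng.normal(6.0, 0.8, size=50)

observed, p_value = permutation_test_mean_diff(group_a, group_b)
print("observed difference:", round(observed, 2), "p-value:", p_value)
```

The p-value estimates how often a difference at least as large as the observed one appears when group labels are assigned at random, which is the sense in which the observed means would be very unlikely under a single shared population.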
Discussion
Summary:
The objective of this project was to determine whether a correlation existed between child mortality and birth rates, and whether that relationship strengthened over time. The data was curated from the organization 'Our World in Data', whose data is open and available to the public. To prepare this data, it was cleaned by removing rows for countries that did not report values for both child mortality and births per woman in a particular year. Then, the data was restricted to the years 1950 to 2020 and sorted into decades, then separately sorted by geographic region. Early in the exploration, linear regression analysis confirmed that a relationship did exist between child mortality and births per woman. By calculating the R-squared value for the linear regression fitted to the first year of each decade, we found that the R-squared value increased over time, indicating a strengthening correlation between our two continuous variables. Bootstrap samples of the means of child mortality and births per woman were taken for the years measured, which revealed a coalescing trend towards lower birth and child mortality rates, giving credence to our earlier findings. To confirm that the data came from statistically different populations, two geographic areas of interest were analyzed via permutation hypothesis testing. The regions of Africa and the United States showed statistically significant differences, and their empirically calculated means would be extremely unlikely to arise if the samples came from the same population.
Informed Insights:
Relationship between child mortality and births per woman is strengthening over observed time frame (1950-2020).
We can reject our null hypothesis: the R-squared values increased over the observed time frame rather than decreasing or remaining unchanged.
Child mortality rates and births per woman are both decreasing over time frame.
When global populations are simulated within this time frame, a trend emerges that supports our empirical findings.
The two geographical areas of interest (Africa and the United States) are not identical in their distribution of child mortality and birth rates.
Future Directions:
Causative factors of child mortality should be explored by future studies. Based on our late-stage investigations of the differences between child mortality rates in Africa and the United States, there is significant evidence that geopolitical factors play a role in a child's welfare. According to The Lancet, 49% of all child deaths in 2008 occurred in just 5 countries, 2 of which (Nigeria and the DRC) are in Africa (Global, regional, and national causes of child mortality in 2008: a systematic analysis - ScienceDirect). This suggests that access to medical care in these countries may be a determining factor in a child's ability to survive to adulthood.
Factors that affect birth rate should also be explored. Factors like freedom of access to contraceptive health care and changing cultural attitudes towards the rights of women are likely variables that could impact birth rates among populations.
Correlations between birth rates and environmental pollutants could also be explored, as freedom of choice might not be the only factor impacting birth rates.
Implications:
Our analyses show that both births per woman and child mortality are decreasing over time and that the relationship between these two variables is strengthening over time. We are unable to make any causal claims about the nature of this relationship; however, the correlation suggests that if public policy works to decrease birth rates, child mortality rates may decrease as well. Public policies regarding factors such as increased access to contraceptive health care and cultural attitudes towards female reproductive rights may help to decrease child mortality. If public policy fails to decrease birth rates, increased community support for mothers with multiple children could still help prevent child mortality. The fact that the relationship between births per woman and child mortality is strengthening over time is unfortunate, and further community efforts could help get rid of the stigma that more births will lead to increased mortality.