Can faith predict early pregnancy behaviour?
The Amazon Case (Part 1)
Well, if you believe that religiosity stimulates early fertility, it is reasonable to assume that your hypothesis describes a positive linear relationship: the higher the level of religiosity, the higher the odds of early childbearing. Conversely, some people might think that the opposite is true: higher levels of religiosity lead to lower odds of early childbearing. In this case, the relationship would be a negative linear one. You might also think that this whole conversation is nonsense. I might be tripping and the two variables are as related as cats and space travel - we have a place for skeptical kids here as well, I like you guys. In your case, a flat line would be your piece of art. What is your guess?
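If it helps to see the three guesses side by side, here is a minimal Python sketch (separate from the R workflow behind this post). It computes a plain least-squares slope and labels its sign. The numbers are made up purely for illustration, the function names are hypothetical, and the tolerance for "flat" is an arbitrary assumption: in a real analysis, flatness would be judged with a statistical test, not a cutoff.

```python
def ols_slope(x, y):
    """Least-squares slope of y on x (simple linear regression)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    num = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    den = sum((xi - mean_x) ** 2 for xi in x)
    return num / den


def describe_slope(slope, tol=1e-9):
    """Label the direction of a slope; `tol` is an arbitrary illustrative cutoff."""
    if slope > tol:
        return "positive"
    if slope < -tol:
        return "negative"
    return "flat"


# Hypothetical toy values, NOT the DATASUS or religiosity data.
religiosity = [0.2, 0.4, 0.6, 0.8]
early_rate = [0.28, 0.30, 0.33, 0.35]

print(describe_slope(ols_slope(religiosity, early_rate)))
```

With these invented numbers the label comes out as the "religiosity stimulates early fertility" guess, but that says nothing about what the real data will show.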
To start our investigation, I would like to introduce you to our dependent variable: the early pregnancy rate among girls aged 10 to 19 in Amazon municipalities. Yes, that's correct, you read that right: 10 years old. You are about to see, in numbers, how devastating the world can be. This is a trigger warning. A scientist's mental health is generally quite deteriorated - unfortunately, investigating humanity's problems has side effects.
Table 1: descriptive statistics of Early Pregnancy Behaviour
Raw data are from DATASUS; the early pregnancy rate was independently calculated following Benevides et al. (2024).
When looking at the table, the first thing you should notice is: yes, the sample size is quite small (62 observations). The data is from one year only (2010), as this is the only year for which we have information about religious behavior at the municipal level for the Amazon state. Therefore, every conclusion we reach together must be treated with caution and suspicion. All the results are suggestive and do not represent the absolute truth by any means. Even if the relationship really exists, we cannot say, for instance, that religiosity is causing early fertility. We can only suggest a correlation between the variables in a given direction.
Another valuable insight from this table is that the early pregnancy rate variable is fairly symmetrical and shows low variability. How can I know that? Well, first, the mean and the median are pretty close, indicating symmetry - and, on average across municipalities, 32% of newborns in 2010 were born to girls between 10 and 19 years old. Quite high, isn't it? Second, the standard deviation is equal to 0.05, which means that most municipalities have an early pregnancy rate within one standard deviation of the mean, that is, between 27% and 37%, with possibly no important extreme values.
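The 27% to 37% range is just the mean plus or minus one standard deviation. A minimal Python sketch of that arithmetic, using only the two values quoted from Table 1 (the function name is hypothetical, and nothing is recomputed from DATASUS):

```python
# Values quoted in the text from Table 1.
MEAN_RATE = 0.32
SD_RATE = 0.05


def one_sd_band(mean, sd):
    """Return the (lower, upper) bounds of mean +/- one standard deviation."""
    return mean - sd, mean + sd


lower, upper = one_sd_band(MEAN_RATE, SD_RATE)
print(f"{lower:.0%} to {upper:.0%}")  # prints "27% to 37%"
```

As a rule of thumb, if a variable is roughly bell-shaped, about two thirds of the observations fall inside this band. Whether that holds here depends on the actual shape of the distribution, which is an assumption until we look at it.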
Let's go a little bit deeper into the topic of extreme values. In statistics, they are called outliers, and what we do with them usually depends on whom you are talking to. Some people remove them from the sample and move on with their lives. Others think they are valuable information for understanding the phenomenon, so they keep them in the sample and refine their methodology. I usually belong to the second group - not a huge fan of information loss. However, this is a controversial topic in the social sciences and, as usual, it depends on the context.
In our case, outliers do not seem to be a problem. The distribution is slightly skewed to the right, meaning the right tail is longer and a bit heavier. In practical terms, this means that we can have municipalities with very high pregnancy rates, quite distant from the mean, but they do not appear to be distorting our variable. I know this can be a little harder to understand, so how about taking a look at the face of the variable?
Figure 1: histogram of early pregnancy behaviour
Raw data are from DATASUS; the early pregnancy rate was independently calculated following Benevides et al. (2024).
Now I think it is easier to understand: see how we have more bars between 0.35 and 0.40 than we have between 0.20 and 0.25? That's precisely what the "skew" column in the table is telling us. Furthermore, the descriptive statistics are also telling us that the lowest early pregnancy rate in the sample is 19%, while the highest is the shocking number of 43%. In 2010, in the Amazonian context, 43% of the newborns in at least one municipality were born to girls between 10 and 19 years old. Now you can see just how powerful numbers can be in revealing social problems.
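For the curious, here is a minimal Python sketch of one common way to put a number on skew: the moment coefficient of skewness, which is positive when the right tail is longer. This is an illustration only. The toy rates are invented, and the "skew" column in Table 1 may come from a slightly different variant of the formula, since several exist.

```python
def skewness(values):
    """Moment coefficient of skewness: m3 / m2**1.5, using population moments."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5


# Hypothetical toy rates with one high value stretching the right tail.
toy_rates = [0.26, 0.28, 0.29, 0.30, 0.31, 0.32, 0.34, 0.42]

print(skewness(toy_rates) > 0)  # True means the data is right-skewed
```

A perfectly symmetric sample gives a value of zero, and a longer left tail gives a negative value.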
Finally, what are the Q0.25 and Q0.75 columns telling us? It's very simple: 25% of the municipalities in the sample have early pregnancy rates equal to or lower than 29%, while 75% of the municipalities have early pregnancy rates equal to or lower than 34%. With a simple and not-so-innocent table, you have enough information to understand why we should investigate this behavior – and so many questions can arise from these numbers.
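If you want to check this reading of the quartiles yourself, a minimal Python sketch could look like the one below. It uses the standard library's `statistics.quantiles`; the helper names and the toy rates are hypothetical, and the interpolation method is an assumption, so the cut points may differ slightly from the ones in Table 1.

```python
import statistics


def quartiles(values):
    """Return (Q0.25, median, Q0.75) using linear interpolation."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q1, median, q3


def share_at_or_below(values, threshold):
    """Fraction of observations that are less than or equal to threshold."""
    return sum(v <= threshold for v in values) / len(values)


# Hypothetical toy rates, NOT the 62 municipalities from DATASUS.
toy_rates = [0.26, 0.28, 0.29, 0.30, 0.31, 0.32, 0.34, 0.42]

q1, median, q3 = quartiles(toy_rates)
print(share_at_or_below(toy_rates, q1), share_at_or_below(toy_rates, q3))
```

The two printed shares should land at or near 25% and 75%, which is exactly what the Q0.25 and Q0.75 columns promise.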
Well, I have so many questions. Don’t you?
References
Benevides, A. de A., Sousa, A. O., de Sousa, D. T., & Mariano, F. Z. (2024). Does extending school time reduce the juvenile pregnancy rate? A longitudinal analysis of Ceará State (Brazil). EconomiA, ahead-of-print. https://doi.org/10.1108/ECON-11-2023-0192
I did the data cleaning and made the graphs in R.