Recent Entries
-
HMDget, version 1.26 posted
This version update fixed a couple bugs, but includes no structural overhaul. Most notable bug fix, credit to the most notable bug finder: Felix Rößger! The drop.tadj argument was only working for single population counts, but not 5-year age groups. Fixed and fixed! It's not a solid or exemplary piece of code, but the more bugs people tell me about, the better it gets! Also, just a reminder, the most up to date downloads are always on the downloads page, not the attachments at the bottom of the R Code page, which include nearly all earlier versions of like everything, and not necessarily the newest stuff.
If you see this and will be at the 2012 PAA or live in the Bay Area, shoot me a line!
Posted Mar 7, 2012 9:19 AM by Tim Riffe
-
The relativistic speed of demography, Lexis surfaces revisited, again
------------------------------- // Another note in the name of weird demography \\ ------------------------------
Recall that there has been 1 generation recorded whose cohort life expectancy was sqrt(2) times life expectancy at birth, the "Da Vinci"-like 1866 cohort of Danish males that I was so excited to locate in the HMD and mentioned here. In the name of sanity, I ought to point out that that allusion was not all that accurate, since calendar time does not get worked in at all. The cohort/age relation works, but somewhere time itself gets lost in the mix. Well, there is one condition under which all three relations would hold: 1 calendar year = 1 year of age = sqrt(2) years lived, and that is the following:
Imagine a large spaceship, with a large population of humans on it producing babies and some demographers that want to study these humans and their offspring. IFF the spaceship is moving at a constant speed of sqrt(1/2)*c (.707 times the speed of light [this is theoretically feasible]), and successive birth cohorts are dropped off on static space stations with automatic data measurement time-synchronized perfectly with the demographers' clocks at the time of drop-off and with little gravity (e.g. Deep Space 9; Death Star too big) along the way, but the demographers remain on the ship in calendar/age reference space, then standard Lexis proportions would roughly represent the 1-1-sqrt(2) relationship that we see on paper. Namely demographers would experience perfect AP relationships on the spaceship frame of reference, but when returning to space stations to collect their automatically generated demographic data (say after the cohorts are extinct, or in periodic data pick-ups), they would notice that the cohorts had lived sqrt(2) years for each year that had passed on the demographers' clocks. This is known as the twin paradox. These temporal relationships they would render on paper just like the Lexis diagrams that we know and love.
* this is all assuming that acceleration and deceleration do not matter; i.e. when the demographers turn around to pick up the data, they decelerate instantly and are instantly moving back towards the space stations at the same speed, something that is impossible (unless of course the universe is cylindrical)- so technically this is all only true in the limit, and even then the Lexis surfaces they derive would still have a small Tuftian lie factor.
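The sqrt(2) figure is just the Lorentz factor at that speed. A minimal sketch in R, ignoring acceleration as the footnote does; the function name and the 50-year interval are made up for illustration:

```r
# Lorentz factor: station-frame years elapsed per year on the ship's clocks
lorentz_gamma <- function(beta) {
  # beta is speed as a fraction of c; must satisfy 0 <= beta < 1
  stopifnot(beta >= 0, beta < 1)
  1 / sqrt(1 - beta^2)
}

beta_ship  <- sqrt(1 / 2)                # ship speed from the post, about 0.707 c
gamma_ship <- lorentz_gamma(beta_ship)   # equals sqrt(2) up to rounding

ship_years    <- 50                      # hypothetical interval on the demographers' clocks
station_years <- gamma_ship * ship_years # years lived by the dropped-off cohorts
```

At any other constant speed the same function gives the stretch factor, so only sqrt(1/2)*c reproduces the 1-1-sqrt(2) Lexis proportions.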
Posted Feb 25, 2012 10:42 AM by Tim Riffe
-
Tool repositories!
I've mentioned the Applied Demography Toolbox before, just wanted to reiterate that it has grown since my last mention, and that it is continually undergoing expansion. Check it out if you haven't already.
The methods included there are mostly indirect and semidirect (i.e. exciting) for dealing with deficient and/or defective data, which is par for the course in many contemporary countries, such as DHS countries, as well as in historical demography, such as I've been fiddling around with of late. It's really user-friendly too- they go into the assumptions, the checks, everything, a proper manual. I dig it and will likely look to it in the future for guidance.
I'd like to give a heads up to other R-loving demographers that Frans Willekens will be coming out (no date announcement yet) with a book in the Springer Use R! series for his Biograph package, which includes tools for estimating transition rates from surveys, and other neat life course tools. For now there is the manual, which includes plenty of examples. Awesome.
Posted Feb 8, 2012 6:04 AM by Tim Riffe
-
Years beget years
At the Centre d'Estudis Demogràfics, students learn a bit about the concepts behind Louis Henry's idea of 'reproduction of years lived' (here's the main reference, including commentary from Jean Bourgeois-Pichat). There is a lineage to this idea, as the center director, Anna Cabré, is of the French school, and students of hers, such as Julio Pérez, have carried on the idea. And to this I add Juan Antonio Fernández Cordón. Now me! The neat point behind this concept is that if fertility rates fall below the conventional wisdom level of replacement while longevity rises, a population may still be replacing itself- the replacement is just stretched over time. Eh? Say a birth cohort of 1000 people gives birth over its lifetime to 900 children. Conventionally, we'd say this cohort failed to reproduce itself. Boo. Say the e0 of the original cohort was 65. They then lived a total of 65000 years. Say their children's cohort life expectancies averaged out to 73. Then the children will have lived for 73 * 900 = 65700 years. MORE YEARS.
Is this not progress? That is to say, potentially less effort devoted to biological reproduction, but more years of human life on the planet? Such lifespan increases are not unprecedented: take the Vaupel average rate of e0 improvement of .25 (hours per hour, years per year). If the children of the first cohort were born on average, say 32 years after the birth of the 'parent' cohort, we'd expect 8 years more of life expectancy for the children. I don't know what's going to happen in the future, but this kind of number trick has plenty of precedent in the historical record.
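The toy numbers above can be checked in a few lines of R. This is only the arithmetic from the example; the variable names are made up:

```r
parent_size <- 1000
parent_e0   <- 65
child_size  <- 900

improvement_rate <- 0.25   # years of e0 gained per calendar year
generation_gap   <- 32     # mean years between parent and child cohorts

child_e0 <- parent_e0 + improvement_rate * generation_gap   # 65 + 8 = 73

parent_years <- parent_size * parent_e0     # 65000 years lived
child_years  <- child_size * child_e0       # 65700 years lived
years_ratio  <- child_years / parent_years  # just above 1: years are replaced
```

Bodies are replaced at a ratio of 0.9 while years are replaced at a ratio slightly above 1.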
It's a pain in the booty to calculate (directly) though. I did my best using the 1870 Swedish cohort as a reference, and these were my steps:
1) e0 for 1870 Sweden (males and females together) was 50.14 (taken straight from the HMD)
2) We want to compare the 1870 cohort's e0 with the average e0 of their offspring. This means we need to take the weighted mean of several cohort e0s, those 12-55 years afterward, which are available from the HMD up until 1919 as of this writing. To fill in for 1920-1925 I just did a linear extrapolation, since the rate of improvement over these cohorts was linear (good luck for me). To weight these, I decided to use NTFR, discussed in the prior post- makes sense to me. I thus needed 2 more items to do the weighting: 1870 cohort ASFR and 1870 cohort female Lx. The ASFR is available from the HFD starting in 1892. Again, boo, since I want these rates starting in 1882. I assumed a rate of 0 for age 12 and used a monotonic spline to interpolate (the Hyman spline, available from Rob J Hyndman, and which I just call into R using source("http://robjhyndman.com/Rfiles/interpcode.R")). This is an awesome tool for filling out the tails of rate distributions. With this, I had the full fertility curve for 1870 females, which when multiplied with the vector of lifetable exposures, female Lx (also HMD), gives the net age-specific fertility rates. This vector weights the children's cohort e0s. Final number: 60.35.
3) the initial size of the 1870 cohort (all together) was 119838.
4) the 1870 cohort gave birth to a total of 118504.6 children +/- a few (HFD cohort births, plus some data mungery for years 1882-1891).
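For the tail-filling in step 2, here is a minimal sketch of the idea using base R's splinefun() with method = "hyman". This is a stand-in for the sourced interpcode.R, not that code itself, and it assumes the two behave comparably on a rising tail. The rates are invented for illustration:

```r
# Hypothetical rates: an assumed zero at age 12 plus "observed" ages 22-26
age_known <- c(12, 22, 23, 24, 25, 26)
fx_known  <- c(0, 0.045, 0.070, 0.095, 0.115, 0.130)

# method = "hyman" needs monotone input, which holds on the rising tail
hyman_fit <- splinefun(age_known, fx_known, method = "hyman")

# Fill in every single age from 12 to 26, including the missing 13-21
age_full <- 12:26
fx_full  <- hyman_fit(age_full)
```

The monotone constraint is the point: an unconstrained cubic spline can dip below zero between age 12 and the first observed rate, which makes no sense for fertility.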
* So, 119838 people produced 118504.6 people, meaning only 98.887% of the job was completed?
* However, the children lived 60.35/50.14 = 1.203684 times as long, i.e. about 20% longer, on average.
* And so the children lived a total of .98887 * 1.203684 = 1.190291 times as many years as the 1870 cohort, i.e. about 19% more.
There you have it. The 1870 Swedish cohort passed on about 19% more years to Swedish humanity than it lived itself, despite not having replaced its own cohort size.
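The final step can be reproduced from the four numbers quoted in steps 1-4. A sketch with made-up variable names; because it starts from the rounded e0 values, the later decimals can differ slightly from those quoted above:

```r
cohort_size <- 119838     # initial size of the 1870 cohort (step 3)
births      <- 118504.6   # children born to the cohort (step 4)
e0_parents  <- 50.14      # cohort e0, 1870 (step 1)
e0_children <- 60.35      # NTFR-weighted mean e0 of the children (step 2)

size_ratio      <- births / cohort_size         # about 0.989: bodies not replaced
longevity_ratio <- e0_children / e0_parents     # about 1.20: children live longer
years_ratio     <- size_ratio * longevity_ratio # about 1.19: years more than replaced
```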
Posted Feb 10, 2012 1:11 AM by Tim Riffe
-
net total fertility
Today something classic. A neat little exercise. Usually to look at the reproductive efficiency of a population we would calculate the net reproduction rate, R0, sum(fxf*Lx), where Lx are the lifetable exposures of females calculated with a starting population (radix) of 1, and where fxf are fertility rates, but only taking into account female births. I have the impression that it's not so common to cite R0 when writing demographic reports for contemporary countries, but it really is interesting, and very informative when decomposed over time or between populations.
I've been thinking with a friend of mine about what kind of measure one might like to use to approximate the reproductive efficiency of a population. Total Fertility (TFR) for a given population is a measure of how many births a mortality-free cohort of females would have per capita, assuming the rates observed over all ages and cohorts in a particular year. Of course, it's fraught with tempo and quantum danger, but it's still a good point of departure.
Let's make TFR not mortality-free. Like, assume 1000 women (err. 1000 female person years) ages 20-24 had 150 births, for a fertility rate of 0.15. If 1100 females had to be born in order for those 1000 person years to happen, then 100 were lost along the way (this is an island and they don't have boats). Imagine that the females that died would have had the same fertility as the ones that lived, and now we make them not die: Then the rate of 0.15 gets multiplied by 1100 instead of by 1000, and we would have seen 165 births, 15 more than those actually observed. Hmmmm.
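The island example in numbers, as a small R sketch. The names are made up, and the last line is my own addition: it previews the ratio that the rest of the post calls efficiency.

```r
person_years    <- 1000   # female person-years lived at ages 20-24
births_observed <- 150
born_needed     <- 1100   # females who had to be born to yield those person-years

fx <- births_observed / person_years                   # fertility rate, 0.15
births_no_mortality <- fx * born_needed                # 165 births if nobody had died
births_lost <- births_no_mortality - births_observed   # 15 births lost to mortality

# Share of potential births actually realized
efficiency <- births_observed / births_no_mortality    # 1000 / 1100, about 0.91
```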
Now let's make our lives easier. TFR is the sum of observed age-specific fertility rates in a year, which is the same as sum(fx*1). TFR discounted for mortality looks strikingly similar to R0: sum(fx*Lx). Let's call this last one net total fertility (it probably already has a name, but I'm not going to leaf through a manual just now). Net TFR is always lower than TFR. The difference can be thought of as potential births lost to the mortality of potential mothers. Here's what they look like for almost 120 years in Sweden:

[figure: Period fertility discounted by female mortality, Sweden 1891-2008, HFD&HMD]

Hopefully my terminology isn't too far off. Anyone perceive phantom grid lines in the white space? Pretty cool eh? The ratio of these two components is my running definition of reproductive efficiency:

[figure: % of potential TFR lost to female mortality, Sweden, HMD&HFD]

That about sums it up. This second graph is purged of the overall level of fertility, just a pure measurement of the efficiency of fertility in a given year. Not all that representative of any particular cohort, though it could be calculated in the same way for cohorts. And now I have to run unexpectedly. I'll post the R code and data tomorrow.
----- And back! I've omitted the data-munging aspects and just start with a single matrix that contains everything we need, attached at the bottom of this post. For reference, the Lx column is just the Lx calculated by the HMD in the file called fltper_1x1.txt for Sweden- I divided it by 100000 (the HMD radix) to give it a radix of 1. ASFR is from the HFD for Sweden, using the file called asfrRR.txt (period).

Calculating some off-beat measures

    load("SE.Rdata")
    head(SE)
    # console output:
    #      Year Age      Lx ASFR
    # [1,] 1891   0 0.93621    0
    # [2,] 1891   1 0.88843    0
    # [3,] 1891   2 0.86543    0
    # [4,] 1891   3 0.85036    0
    # [5,] 1891   4 0.83887    0
    # [6,] 1891   5 0.82990    0
    years <- sort(unique(SE[, "Year"]))
    N     <- length(years)
    TFR   <- NETTFR <- OH <- Eff <- numeric(N)
    for (i in seq_len(N)) {
      ind       <- SE[, "Year"] == years[i]
      Fx        <- SE[ind, "ASFR"]
      Lx        <- SE[ind, "Lx"]
      TFR[i]    <- sum(Fx)              # TFR
      NETTFR[i] <- sum(Fx * Lx)         # net TFR
      OH[i]     <- TFR[i] - NETTFR[i]   # overhead: TFR lost to mortality
      Eff[i]    <- OH[i] / TFR[i]       # share of TFR lost; efficiency is 1 - Eff
    }

Code for the first plot: mortality overhead for period fertility

    plot(years, rep(0, length(years)), type = "n", ylim = c(0, 6),
         xaxs = "i", yaxs = "i", axes = FALSE, xlab = "", ylab = "",
         sub = "Conventional TFR is the sum of these two components",
         main = "Period fertility discounted by female mortality\nSweden 1891-2008, HFD&HMD")
    # lower band: net TFR
    polygon(c(years, rev(years)), c(rep(0, N), rev(NETTFR)),
            col = gray(.7), border = NA)
    # upper band: overhead stacked on net TFR, so the top edge is conventional TFR
    polygon(c(years, rev(years)), c(NETTFR, rev(NETTFR + OH)),
            col = gray(.3), border = gray(.3), lwd = .1)
    # white grid lines drawn over the bands
    abline(v = seq(1900, 2000, by = 20), col = "white", lwd = 1.5)
    abline(v = seq(1890, 1990, by = 20), col = "white", lwd = .5)
    abline(h = seq(1, 5, by = 1), col = "white")
    # hand-placed axis labels
    text(1890, seq(0, 5, by = 1), seq(0, 5, by = 1), pos = 2, xpd = TRUE)
    text(seq(1900, 2000, by = 20), -.2, seq(1900, 2000, by = 20), xpd = TRUE)
    legend("topright", fill = c(gray(.3), gray(.7)),
           legend = c("TFR lost to mortality", "net TFR"), bty = "n")
    text(1885, 5.5, "Births", xpd = TRUE)

I did my best to follow Tuftian principles in these figures, comments welcome. Perhaps a bit too artisanal for the likes of most.
Code for the second plot: Fertility efficiency

    plot(years, Eff, type = "n",
         main = "% of potential TFR lost to female mortality\nSweden, HMD&HFD",
         xaxs = "i", yaxs = "i", ylim = c(0, .3), axes = FALSE, xlab = "", ylab = "")
    # shaded area under the share-lost curve
    polygon(c(years, rev(years)), c(Eff, rep(0, length(years))),
            col = gray(.7), border = NA)
    # white grid lines drawn over the area
    abline(v = seq(1900, 2000, by = 20), col = "white", lwd = 1.5)
    abline(v = seq(1890, 1990, by = 20), col = "white", lwd = .5)
    abline(h = seq(.05, .25, by = .05), col = "white")
    lines(years, Eff, col = gray(.5))
    # hand-placed axis labels
    text(1890, seq(0, .25, by = .05),
         c("0%", "5%", "10%", "15%", "20%", "25%"), pos = 2, xpd = TRUE)
    text(seq(1900, 2000, by = 20), -.01, seq(1900, 2000, by = 20), xpd = TRUE)
    text(1950, -.03, "Year", xpd = TRUE)

So why this measure instead of net reproduction, R0? Because I think the idea of a monosex population is weird! On the other hand, the present measure isn't all that great either because I only discounted for female mortality. The two sex problem is EVERYWHERE. Look over your shoulder NOW!
--- and again, with a fertility retrojection! --------- So, the HMD gives mortality estimates back to 1751 in Sweden, but the HFD goes back 'just' to 1891. With no ASFR or estimate thereof prior to 1891, we can't bring this indicator back any farther in time. So here's my proposal:
1) we take the HFD ASFR curves for 1891-1901, and turn each into a pdf (i.e. divide each rate by the sum of all rates, such that now all curves sum to 1).
2) now we have 10 fertility pdfs. There is a time trend in how they change (bummer), but it isn't that great. Here's an image of all 10 curves with the age-specific mean (which, yay arithmetic, also sums to 1):

[figure: the 10 fertility pdfs with their age-specific mean in black]

Just a diagnostic. The black line will be the 'standard' pdf that we retroject.
3) Now for each year prior to 1891, multiply the standard curve, above, with the female exposures given in the HMD, then sum to arrive at a false sum of births.
4) haha! The HMD has already given us a long time series of estimated births, back to 1749, so we'll treat these totals as the 'real' births. Then the ratio of (real/false) is our scale factor for the standard curve.
5) voila, in each year we rescale the standard pdf according to the ratio in step 4, arriving at an ASFR estimate.
6) the Lx is already in the HMD back to 1751, and now we have an estimate of ASFR back to 1751, and that's all we need to calculate THIS:

[figure: reproductive efficiency (1 - Eff) for Sweden, extended back to 1751]

Note that I went ahead and did 1 - Eff so that it's easier to understand the percentage as the concept of 'efficiency'. I totally dig this result. The famines and such really shine, but really, the 'efficiency' concept lends itself more to cohorts than to periods, so sometime I'll have to come back and recalculate this- when I do, we'll see a much less noisy graph I think.
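Steps 1-5 can be sketched on toy data as follows. All numbers, object names, and the helper function are invented for illustration (five age classes, three reference years), so this shows the mechanics only and is not the code behind the figure:

```r
# Rows = reference years, columns = age classes: hypothetical observed ASFR
asfr_ref <- rbind(
  c(0.10, 0.20, 0.18, 0.12, 0.05),
  c(0.09, 0.21, 0.19, 0.11, 0.05),
  c(0.11, 0.19, 0.18, 0.12, 0.04)
)

# Steps 1-2: rescale each curve to sum to 1, then take the age-specific mean
asfr_pdf <- asfr_ref / rowSums(asfr_ref)
std_pdf  <- colMeans(asfr_pdf)

# Steps 3-5 for a single earlier year
retroject_asfr <- function(std_pdf, exposure, births) {
  false_births <- sum(std_pdf * exposure)  # step 3: births implied by the pdf as-is
  scale_factor <- births / false_births    # step 4: real / false
  std_pdf * scale_factor                   # step 5: rescaled curve, the ASFR estimate
}

exposure_1850 <- c(9000, 8500, 8000, 7500, 7000)  # hypothetical female exposures
births_1850   <- 5000                              # hypothetical 'real' birth total

asfr_1850 <- retroject_asfr(std_pdf, exposure_1850, births_1850)
```

By construction the estimated rates reproduce the birth total exactly when multiplied by the exposures. Only the age shape is assumed.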
[edit on Feb 6th, paraphrasing some comments I received today from Adrien Remund and my responses:]
1) isn't this basically the same as R0/fxf?: yes, they should covary almost perfectly. No big deal which measure we take I don't think, I just prefer to speak of the whole population.
2) what about other fertility shapes- is this a strong assumption?: I did a bit of sensitivity testing by moving the above standard curve 2 ages to the left and 2 ages to the right. When reproducing this last figure, the results don't vary all that much. A later fertility curve is subject to higher mortality, and so needs to have a higher TFR, but will have a lower reproductive efficiency and vice versa. When shifting years in either direction, it doesn't make that big of a difference, and definitely doesn't change the story. I may come back and post those results at a later date, and when more complete. Notably, I didn't check for other shapes, such as those that are left-leaning, i.e. sharp increases in fertility at the beginning and a fat tail on the right, rather than the near-bellcurve used here (which may have been coincidental to the reference years). I'll try out different shapes, and maybe some assumptions of changing shape over time, but don't anticipate major changes in results. Good call though Adrien.
3) Adrien rocks, because he goes ahead and busts out an Excel to try things out, and he even sent me along the Hutterite schedule to compare that shape! He's full of golden references. By the way, as soon as he says 'jump' I'm going to write up the awesome paper he'll be presenting this year at the EPC in Stockholm. Stay tuned fertility and second demographic transition folks!
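The shift test from point 2 can be sketched like this. The curve, the Lx values, and the helper are hypothetical, and the sketch relies on the fact that when a curve sums to 1, net TFR divided by TFR is simply sum(pdf * Lx):

```r
shift_curve <- function(fx, k) {
  # k > 0 moves the curve k ages older (right); k < 0 moves it younger (left).
  # Assumes fx has at least |k| zeros at the end it is pushed towards.
  n <- length(fx)
  stopifnot(abs(k) < n)
  if (k > 0) {
    c(rep(0, k), fx[seq_len(n - k)])
  } else if (k < 0) {
    c(fx[(1 - k):n], rep(0, -k))
  } else {
    fx
  }
}

# Hypothetical standard pdf over 10 consecutive ages, zero-padded at both ends
std_pdf <- c(0, 0, 0.05, 0.20, 0.30, 0.25, 0.15, 0.05, 0, 0)
# Hypothetical female Lx (radix 1), declining by 0.01 per age
Lx <- 0.80 - 0.01 * (0:9)

eff_early <- sum(shift_curve(std_pdf, -2) * Lx)  # curve moved 2 ages younger
eff_std   <- sum(std_pdf * Lx)
eff_late  <- sum(shift_curve(std_pdf, 2) * Lx)   # curve moved 2 ages older
```

With survivorship falling by age, the later curve always comes out less efficient, in line with the reasoning in point 2. How much less depends on how steeply Lx declines over the reproductive ages.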
Posted Feb 6, 2012 10:41 AM by Tim Riffe
The inelegant old-school 1-page blog has now migrated to the 'blog' tab on the left. This page only shows the 5 most recent entries.