Dealing with age structure
The problem
First, the meaning of "age" in CJS models can be confusing. Fundamentally, the interpretation of age depends on (among other things):
What can be identified in the field with respect to age (i.e., specific chronological age vs. age classes)
How the data are gathered, and thus whether age cohorts can be identified
A second and somewhat related issue is the timing of transition from one age to the next and how it relates to the intervals between recapture periods.
Studies of passerine birds typify these issues. Usually, birds can be identified as hatching year (age 0 yr) or adults (age >0), and sometimes as subadults (second year, age =1), but they cannot be aged in the field beyond age 1 unless they were originally captured as age 0 animals and then subsequently recaptured or resighted (e.g., a bird captured as a chick in 2000 and recaptured in 2005 is known to be 5 years old). On the other hand, if birds are first captured as "adults" (age >1) there is often no easy way of telling exact age, and birds in the "adult" age class will in any case typically be mixtures of birds from different cohorts of chicks.
Certain kinds of data structures lend themselves to inference about age-specific survival; others do not. For example, if we mark and then recapture or resight a given age cohort of animals over time, our observations of their fates depend (besides on recapture rates) on their survival over each subsequent year of life. But that survival will be influenced not just by age, but also by environmental variation. In fact, in some years animals of all ages will exhibit higher (or lower) survival than in other years because of environmental effects. To tease apart age effects from temporal effects, we ordinarily need to observe overlapping cohorts of animals, that is, detections of animals that were hatched (born) in different prior years. If we are continuously marking animals from new cohorts and recapturing them in later years, we get this kind of cohort survival data. As an aside, this kind of information is lacking in tag-recovery studies of animals marked only as young, as we will see when we consider this type of data later.
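The confounding of age and time within a single cohort can be checked numerically. The following is a minimal base-R sketch, not part of the serin analysis: the helper functions make_design and additive_rank, and the years used, are hypothetical and chosen only for illustration. It builds a design matrix for an additive age + time model and compares the number of columns (effects we would like to estimate) with the matrix rank (effects that can actually be separated).

```r
# Hypothetical toy design: rows are indexed by hatch year (cohort) and
# calendar year (time); age is simply time - cohort.
make_design <- function(cohorts, years) {
  d <- expand.grid(cohort = cohorts, time = years)
  d <- d[d$time >= d$cohort, ]   # a cohort cannot be observed before it hatches
  d$age <- d$time - d$cohort
  d
}

# Columns vs. rank of the design matrix for an additive age + time model
additive_rank <- function(d) {
  X <- model.matrix(~ factor(age) + factor(time), data = d)
  c(columns = ncol(X), rank = qr(X)$rank)
}

single  <- make_design(cohorts = 2000,      years = 2000:2003)
overlap <- make_design(cohorts = 2000:2003, years = 2000:2003)

additive_rank(single)    # rank < columns: age and time are aliased
additive_rank(overlap)   # rank == columns: age and time effects are separable
```

With one cohort, every age occurs in exactly one year, so the age and time columns are identical. With overlapping cohorts, each age is observed in several years and the two sets of effects can be told apart.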
Another, and very common, way that we get information allowing separation of age and time effects is if we have at least some captures and releases of both 'juvenile' (age =0) and 'adult' (age >0) animals in every year. For birds, this means that we capture and mark both juveniles and adults every year. If we only capture adults, we can still estimate temporal (but not age) survival effects. If we only capture juveniles, but do so in the overlapping-cohort fashion described above, we may still be able to get both age and time effects. Because analysis of animals captured both as age=0 and age>0 allows separation of age and time effects under both CMR and tag recovery data structures, I will give this type of data particular emphasis, but note that overlapping cohort data with CMR also allows very robust (and in some ways simpler) analysis.
As noted above, the timing of age transition with respect to sampling is also an issue in CMR studies. Often, CMR occasions are spaced 1 year apart, and for many organisms (like birds) a year corresponds closely to the interval over which "aging" takes place, so that if we capture birds this year at age=0 they will be age =1 next year, etc. However, we can easily envision different scenarios, such as CMR intervals <1 year (seasonal or more frequent recaptures), in which case an age transition (+1) may take several occasions. Other situations can occur, such as occasions spaced >1 year apart, so that more than one age transition may occur between capture occasions. The example below is based on annual sampling of a bird, the most "usual" case, but other cases can be handled by appropriate interpretation and modeling of time and cohort indices in the design data.
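To make the relationship between interval length and age transition concrete, here is a small base-R sketch. The function age_at_occasions is hypothetical, and it assumes that age advances by one on each anniversary of first capture; real studies may instead tie the transition to a biological date such as the breeding season. (RMark's process.data has a time.intervals argument for unequal spacing; this sketch does not call it.)

```r
# Hypothetical helper: age (in completed years) at each capture occasion,
# given the time in years between successive occasions.
# Assumption: age advances on each anniversary of first capture.
age_at_occasions <- function(initial_age, intervals) {
  elapsed <- c(0, cumsum(intervals))   # years elapsed since first capture
  floor(initial_age + elapsed)
}

# Annual sampling: age advances by one at every occasion
annual   <- age_at_occasions(0, intervals = rep(1, 4))
# Twice-yearly sampling: one age transition takes two occasions
seasonal <- age_at_occasions(0, intervals = rep(0.5, 4))
# Sampling every second year: two age transitions between occasions
biennial <- age_at_occasions(0, intervals = rep(2, 3))

annual     # 0 1 2 3 4
seasonal   # 0 0 1 1 2
biennial   # 0 2 4 6
```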
I illustrate age-specific CJS data structure and models with an example from Serins (a finch) ringed in northeastern Spain starting in 1990. Birds were ringed (banded) as 2nd-year ("young") and after-second-year ("adult") males and females (data on hatching year birds, whose sex could not be determined with certainty, are excluded). The data were originally analyzed in MARK and so need to be converted from MARK to RMark format. Note that 4 groups of birds are identified, based on sex and age at first capture:
> #load the RMark package
> library(RMark)
> #clear the workspace before defining anything
> rm(list=ls())
> #set a working directory for reading and writing files
> #setwd("c:/mydir")
> data_dir<-"C:/users/mike/Dropbox/teaching/WILD8390/spring2014/CJS"
> setwd(data_dir)
> #serin data in MARK format
> input.file<-"serin2age.inp"
> #define group labels, in the order the groups appear in the .inp file
> group.df<-data.frame(sex=c("M","M","F","F"),age=c("A","Y","A","Y"))
> #convert to RMark format
> serins<-convert.inp(input.file,group.df=group.df)
> #process the data and build the design data (ddl) for the CJS model
> serins.processed<-process.data(serins,model="CJS",begin.time=1990,groups=c("sex","age"))
> serins.ddl<-make.design.data(serins.processed)
At this point, we need to define an age variable appropriate to the design and data collection scheme. We can do this (there are probably other ways; this one works!) by creating a variable that starts out as 0 for SY and 1 for ASY, based on age at first capture (initial.age.class, part of the ddl definition). To this is added "Age", a variable that starts out as 0 for the release cohort and increments by 1 for each subsequent year. Since we only need to classify animals as SY or ASY ("Y" or "A"), the sum is then reduced to 0 (SY) or 1 (ASY), which is accomplished by the following:
> #create an age.now variable: 0 if the bird is SY and 1 if it is ASY at the current occasion
> #e.g., a bird captured as 0 in 1990 transitions to 1 in 1991 and stays 1;
> # a bird that is 1 in 1991 stays 1 (nothing happens)
> #note this is done only for Phi: by definition, first recapture always occurs as 'adult', so p is not age-specific
> serins.ddl$Phi$age.now<-(((serins.ddl$Phi$initial.age.class=="A")*1+serins.ddl$Phi$Age)>0)*1
The new "age.now" variable is what will be used in subsequent modeling of age effects.
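The logic of this computation can be checked on a toy data frame. This is a base-R sketch only: the data frame toy is a hypothetical stand-in for a few rows of the Phi design data, with just the two columns the computation needs.

```r
# Hypothetical stand-in for a few rows of the Phi design data
toy <- data.frame(
  initial.age.class = c("Y", "Y", "Y", "A", "A", "A"),  # age class at release
  Age               = c(0, 1, 2, 0, 1, 2)               # years since release
)

# 0 = still SY, 1 = ASY; released as "A" contributes 1, each elapsed year adds 1
toy$age.now <- as.integer((toy$initial.age.class == "A") + toy$Age > 0)
toy
```

Only the first row (released as "Y", zero years since release) gets age.now = 0. Birds released as "Y" switch to 1 after one year, and birds released as "A" are 1 throughout.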
> #parameter formulas. Include age.now as age variable in all the relevant formulas
> Phi.dot=list(formula=~1)
> Phi.sex=list(formula=~sex)
> Phi.age=list(formula=~age.now)
> #Phi.Sex.Age (capitalized) is the sex-by-age interaction model; Phi.sex.age is additive
> Phi.Sex.Age=list(formula=~sex*age.now)
> Phi.sex.age=list(formula=~sex+age.now)
> Phi.t=list(formula=~time)
> Phi.sex.t=list(formula=~sex+time)
> Phi.age.t=list(formula=~age.now+time)
> Phi.sex.age.t=list(formula=~age.now+sex+time)
>
> p.dot=list(formula=~1)
> p.sex=list(formula=~sex)
> p.t=list(formula=~time)
> p.sex.t=list(formula=~sex+time)
>
Finally, these parameter formulations are combined to form a set of models (36 in all: 9 formulations for Phi by 4 for p), which are then used to produce and plot model-averaged estimates of real parameters (see the attached script).
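The count of 36 follows from pairing every Phi formulation with every p formulation. A base-R sketch of that enumeration is below; the object names phi_models, p_models and model_grid are hypothetical. In RMark this pairing is commonly done with create.model.list() followed by mark.wrapper(), but whether the attached script takes that route is an assumption here.

```r
# Names of the parameter formulations defined above
phi_models <- c("Phi.dot", "Phi.sex", "Phi.age", "Phi.Sex.Age", "Phi.sex.age",
                "Phi.t", "Phi.sex.t", "Phi.age.t", "Phi.sex.age.t")
p_models   <- c("p.dot", "p.sex", "p.t", "p.sex.t")

# Every Phi formulation crossed with every p formulation
model_grid <- expand.grid(Phi = phi_models, p = p_models,
                          stringsAsFactors = FALSE)
nrow(model_grid)   # 9 x 4 = 36
head(model_grid)
```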
This type of data structure and analysis allows for some very general questions to be addressed, which could include:
Tests for age- and sex-specificity in survival
Separation of age-specific from time-specific (environmental) effects
Modeling of individual and time-specific covariate effects
We will see later how to incorporate random and hierarchical effects into this general model structure.