As we've just seen, we have a couple of ways to estimate recruitment from CMR data, but these are hampered somewhat by the fact that regular JS or Pradel CMR designs and analyses do not take into account possible capture heterogeneity or behavioral response, influences that are known to cause severe biases in estimates of abundance and recruitment. The Robust Design (RD) was originally envisioned as a way of getting around this problem, via a design in which the data are collected in a nested fashion:
Primary periods are spaced out over intervals during which the population is assumed to be open to additions (births, immigration) and losses (deaths, permanent emigration).
Secondary occasions are typically conducted over shorter periods, during which the population is assumed to be closed.
The original RD application to the recruitment problem used CJS models to estimate survival over the primary periods, closed-abundance CMR models to estimate N for each primary period (with the secondary occasions as replicates), and then put the two pieces of information together to estimate recruitment. Modern approaches instead perform an integrated analysis of both types of data. We will illustrate this approach by application of the RD extension of the Pradel temporal symmetry approach.
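In its simplest form, the two-step combination compares abundance in successive primary periods after accounting for survival, so that the number of new animals entering between periods i and i+1 is estimated as

B(i) = N(i+1) - Phi(i) x N(i)

with the N's coming from the closed models and Phi from the CJS analysis.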
The data example comes from a study by Jim Nichols of male Microtus pennsylvanicus trapped at Patuxent Wildlife Research Center, Maryland: monthly primary periods over 6 months, with daily trapping over a 5-day period within each month (cited in Chapters 17-19 of Williams et al.). The data are here, in MARK input format.
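For orientation, each line of the .inp file encodes one animal's 30-column capture history (6 primary periods x 5 secondary occasions), followed by a frequency and a terminating semicolon. A hypothetical line (not taken from the actual file) would look like:

101100111000000000110100000000 1;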
To analyze these data, we must first read them into R and reformat them for the RMark program. I do that in the attached code, but first I analyze the secondary data within each primary period using a series of closed models. The reason for this is both to provide a cross-check with the more complicated Robust Design analysis and to provide preliminary estimates of abundance in case these are needed as starting values.
Implementation in RMark
Moving on to the Robust Design analysis, we first convert the input data to RMark format
>#convert to RMark format
>mouse<-convert.inp(input.file)
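Before moving on, it is worth a quick look at the converted data and, as described earlier, a preliminary closed-model check within a single primary period. Below is a minimal sketch (not the attached script) that fits a default closed model (constant p and c) to primary period 1 only; it assumes the converted data frame has the usual ch and freq columns:
> head(mouse)                    # data frame with ch (capture history) and freq columns
> nchar(mouse$ch[1])             # should be 30 = 6 primary periods x 5 secondary occasions
>
> # closed-model check for primary period 1 only
> ch1<-substr(mouse$ch,1,5)
> mouse1<-data.frame(ch=ch1,freq=mouse$freq,stringsAsFactors=FALSE)
> mouse1<-mouse1[grepl("1",mouse1$ch),]     # drop animals never caught in period 1
> m1.proc<-process.data(mouse1,model="Closed")
> m1.ddl<-make.design.data(m1.proc)
> m1<-mark(m1.proc,m1.ddl)                  # default constant p and c
> m1$results$derived                        # derived abundance estimate for period 1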
We then must specify the time intervals for the Robust Design. Here a "0" indicates that we are still within a closed period (i.e., between secondary samples within a primary occasion), and a "1" indicates that one full interval of time (here, a month) has elapsed between primary samples. So:
>intervals<-c(rep(c(0,0,0,0,1),5),c(0,0,0,0))
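A quick check that the interval vector has the intended structure:
> length(intervals)   # 29 intervals separating the 30 capture occasions
> sum(intervals)      # 5 open intervals between the 6 primary periods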
We will use the Pradel Gamma (seniority) parameterization, which is the original form of this model. As we will see below, however, other perhaps more useful statistics can result from estimates of Gamma.
>#set up Pradel model gamma parameterization
>mouseG.processed=process.data(mouse,model="RDPdGClosed",time.intervals=intervals)
>mouseG.ddl=make.design.data(mouseG.processed)
Next, we will specify some sources of variation in each of the key parameters (p, c, Phi, and Gamma). Note that below, for p and c (capture and recapture probabilities), "session" refers to the primary periods, whereas "time" refers to secondary occasions within a primary period. On the other hand, "time" for Phi and Gamma always refers to the primary occasions, since these parameters have no meaning within a primary period.
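It can help to inspect the design data to see exactly how session and time are coded for each parameter type; for example:
> names(mouseG.ddl)      # the parameters available for this data type
> head(mouseG.ddl$p)     # note the session and time columns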
> #### seniority model parameters
> pdot=list(formula=~1)
> cdot=list(formula=~1)
>
> p.t.s=list(formula=~session+time)
> c.t.s=list(formula=~session+time)
> Phi.t=list(formula=~time)
> G.t=list(formula=~time)
> Phi.dot=list(formula=~1)
> G.dot=list(formula=~1)
>
We then put these together into a series of models; I have run 5 here, but many more are possible. Note that these models are all "Closed" models and do not involve individual heterogeneity in capture. Models corresponding to the finite mixture models we considered earlier could readily be built; for example, specifying the model structure RDPdGFullHet instead of RDPdGClosed in process.data() will allow for individual heterogeneity via the mixture approach.
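For concreteness, here is a sketch of how one of these models might be constructed with mark(); the names used in model.parameters (Phi, Gamma, p, c) are assumed to match the parameter names returned by names(mouseG.ddl) for this data type:
> # time-varying Phi and Gamma, session+time effects on p and c
> mod.t<-mark(mouseG.processed,mouseG.ddl,
+    model.parameters=list(Phi=Phi.t,Gamma=G.t,p=p.t.s,c=c.t.s))
> # constant alternative
> mod.dot<-mark(mouseG.processed,mouseG.ddl,
+    model.parameters=list(Phi=Phi.dot,Gamma=G.dot,p=pdot,c=cdot))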
Once we have run these models, we can summarize them by AIC and also get model-averaged estimates of the real parameters.
> mods<-collect.models()
> estims<-model.average(mods)
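The AICc model-selection table can be viewed with, e.g.,
> mods$model.table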
Model-averaged estimates of abundance, Phi, and Gamma at each primary period are then extracted from the estims object by row index. Note that to understand the indexing of the real parameters, it is useful to go back to the mouseG.ddl object.
Note that the "real" estimates for N are actually estimates of U: the number of unmarked animals in the population. We get N by adding M, the number of marked animals, to N=M+U. Finally, since these are closed (within primary period) models, we know M simply from the number of unique animals that were captured and marked, which is just the number of rows of the capture history matrix.
> mod_avg_N<-estims[65:70,]
> mod_avg_N<-subset(mod_avg_N,select=c(estimate,se))
> ################
> mod_avg_N$marked<-marked
>
> mod_avg_N$Nhat<-marked+mod_avg_N$estimate
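> # M (marked) is a known count, so the SE of Nhat is just the SE of the estimated U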
> mod_avg_N$Nhat.se<-mod_avg_N$se
>
> #model averaged phi
>
> mod_avg_phi<-estims[1:5,]
>
> #model averaged gamma
> mod_avg_gamma<-estims[6:10,]
>
Finally, the model-averaged estimates of Phi and Gamma can also be used to produce estimates of population growth rate (Lambda) and per-capita recruitment (f), via the fundamental relationships between Gamma, Phi, Lambda, and f summarized here and elaborated in more detail in Williams et al. (2002). These relationships give us a straightforward means of getting point estimates of Lambda and f from estimates of Phi and Gamma; estimates of the standard errors require approximation methods such as the Delta method since, for example, computation of Lambda requires division, a nonlinear function.
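Briefly, the temporal symmetry relationships used below are

Lambda(t) = Phi(t) / Gamma(t+1)
f(t) = Lambda(t) - Phi(t)

(with the Gamma estimates aligned to the appropriate intervals; see Williams et al. 2002 for the careful treatment of the subscripts): growth rate is survival divided by seniority, and per-capita recruitment is the part of growth not accounted for by survival.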
>
> # function to compute the delta-method approximation to the standard error of x/y
> delta_div<-function(est1,est2,se1,se2){
+ std.err<-array(dim=length(est1))
+ for(i in 1:length(est1))
+ {
+ # gradient of x/y with respect to (x,y) is (1/y, -x/y^2)
+ fd<-c(1/est2[i],-est1[i]/est2[i]^2)
+ d<-matrix(data=fd,ncol=2)
+ v<-diag(c(se1[i]^2,se2[i]^2))
+ std.err[i]<-(sqrt(d%*%v%*%t(d)))
+ }
+ return(std.err)
+ }
>
> ests<-data.frame(phi=mod_avg_phi$estimate,se.phi=mod_avg_phi$se,gamma=mod_avg_gamma$estimate,se.gamma=mod_avg_gamma$se)
> ests$lambda<-ests$phi/ests$gamma
> ests$se.lambda<-delta_div(est1=ests$phi,est2=ests$gamma,se1=ests$se.phi,se2=ests$se.gamma)
> ests$f<-ests$lambda-ests$phi
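> # note: this ignores the covariance between lambda and phi, so se.f is only a rough approximation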
> ests$se.f=sqrt(ests$se.lambda^2+ests$se.phi^2)
> print("Model averaged estimates of phi,gamma, f and lambda")
[1] "Model averaged estimates of phi,gamma, f and lambda"
> ests
phi se.phi gamma se.gamma lambda se.lambda f se.f
1 0.8626606 0.05151162 0.6486825 0.03011459 1.3298657 0.08589042 0.4672052 0.1001529
2 0.5547355 0.06245511 0.6518652 0.03274127 0.8509973 0.12042608 0.2962618 0.1356580
3 0.7288340 0.06799952 0.6496342 0.03042859 1.1219144 0.10708040 0.3930804 0.1268469
4 0.5905528 0.06232372 0.6502959 0.03119877 0.9081293 0.11417429 0.3175765 0.1300770
5 0.9497819 0.08284529 0.6466682 0.03231988 1.4687313 0.11400316 0.5189494 0.1409257
>
Here is the R script to perform the above analyses. The script also produces an input file for use in the MARK GUI, below; I provide this for general information, but it is not actually needed here, since we started with mouse_robust.inp, which is already in MARK format.
Implementation in MARK
Models corresponding to those above can be created using the MARK GUI and the mouse_robust.inp file. The user must select the data and model type from the Data Type list: select "Pradel Models Including Robust Designs" and (from the next panel) "Robust Design Pradel Survival and Seniority". Finally, select "Full Likelihood p and c", which will allow non-mixture modeling of capture and recapture similar to the models we created in R. You then need to specify the number of groups (1) and capture occasions (30), and the appropriate Robust survival intervals. Use the Easy Robust Design Times tool (right side of the screen) and enter 6 primary occasions. The pop-up will give you the default selection of 5 secondary occasions per primary period, which can be retained for this example. Then proceed to the Run menu to select or build models.
These steps should be familiar by now and won't be repeated here; I have, however, asked you to construct some MARK GUI models as part of the review exercises.