Because time-specific covariates deal with how the parameter(s) varies over time , they are readily dealt with by modification of the design data (in RMark) or inputs into the design matrix (MARK). We will use an example with simulated data (where the covariate effects is known to operate) to illustrate both approaches. The data structure involves a single group of animals captured and recaptured over 10 occasions according to a CJS design. A single covariate value (cov) is measured on each occasion, and is assumed to affect survival (Phi). Because there are 10 occasions, there are 9 intervals over which survival is estimable, and so we will use 9 covariate values to predict survival.
RMark scripting approach
I have created a dataset and analyses for the above example, implementing a covariate analysis in RMark. The data object is here and the script file is here. Make sure the data object file is in your working directly and load it into the R session.
> require(RMark)
> #object containing time-specific covariates and data
> load("example.data")
The data object contains both the capture histories and the
time-specific covariates (again, for 9 survival intervals)
> example.data$cap.data[1:10,]
[1] "1000000000" "1110000000" "1000000000" "1000001011" "1000100000"
[6] "1001000000" "1001100000" "1000000000" "1000000000" "1000000000"
> example.data$cov
time cov
1 1 0.4779753
2 2 0.8640208
3 3 0.1563517
4 4 0.9302479
5 5 0.9729868
6 6 0.9939771
7 7 0.4513658
8 8 0.9179416
9 9 0.4315568
Note: in this example, we will default to the first occasions being numbered time=1, and the time values above agree with that convention. If we had set begin.time = 2013 (for example), the times above would need to be renumbered correspondingly.
The first step is to process the capture data as CJS and to create the design data.
> cap.processed=process.data(example.data$cap.data,model="CJS")
> cap.ddl=make.design.data(cap.processed)
We then need to add the covariate data to the design data. We can easily do this by using the merge_design.covariates() function:
> cap.ddl$Phi=merge_design.covariates(cap.ddl$Phi,example.data$cov)
By default, this is "by time (bytime=T)" so that the covariates for each time =1 to 9 line up with the parameter having the same time value in the ddl (hence the reason the numbering must agree between the 2). If we have groups involved then we must also specify "bygroup=T" (F by default).
At this point it is a good idea to display the ddl to make sure
>cap.ddl$Phi
Should confirm that the covariates have all been associated with the correct time occasion.
Now we are ready to create and run models. We will run a couple of Phi covariate models along with a few CJS models that do not involve covariates:
>#formulas
>Phi.cov<-list(formula=~cov)
>#time only
>Phi.t=list(formula=~time)
>p.t=list(formula=~time)
>p.dot=list(formula=~1)
>time.est<-mark(cap.processed,cap.ddl,model.parameters=list(Phi=Phi.t,p=p.t))
>null.est<-mark(cap.processed,cap.ddl)
>cov.est<-mark(cap.processed,cap.ddl,model.parameters=list(Phi=Phi.cov,p=p.dot))
>cov.est.pt<-mark(cap.processed,cap.ddl,model.parameters=list(Phi=Phi.cov,p=p.t))
>phi.time.p.dot<-mark(cap.processed,cap.ddl,model.parameters=list(Phi=Phi.t,p=p.dot))
results<-collect.models()
We will do some plotting and model averaging in a moment, but first let's export the data into MARK format (we'll need that shortly).
> results<-collect.models()
> #statements to produce input for MARK
> export.MARK(cap.processed, "CovExample", replace = TRUE, chat = 1, title = "CJS Covariate example", ind.covariates = "all")
Look the model results table
>results
This will indicate that the Phi(cov)p(.) = cov.est model has fairly overwhelming AIC support (> 0.97).
Let's first look a the real parameter estimates of Phi, which correspond (for the covariate model) to predictions of Phi at the observed covariate values. We pull off the first 9 indices because these correspond to the unique, time-specific Phi values (remember, no cohort effects), and merge the real values with the covariate values
> #plot under cov.est model
> real<-subset(get.real(cov.est,"Phi",se=T),all.diff.index<10)
> real<-merge(real,example.data$cov)
This sets us up to plot the relationship between the observed covariate values and survival (with CI) at each covariate value:
> plot(real$cov,real$estimate,xlab="covariate",ylab="Phi",pch=2,ylim=c(0.5,1),main="Model cov.est")
> matpoints(real$cov,real$lcl,pch=1)
> matpoints(real$cov,real$ucl,pch=1)
In general, we are going to want to do this averaging over all the models, although in this case it makes very little difference since the cov.est model is so heavily weighted relative to the others. First we compute the model averaged estimates, pick off the first 9 indices, and add the covariate value to the dataframe:
> mod.avg<-model.average(results,vcv=T)
> avg<-subset(mod.avg$estimates,par.index<10)
> avg$cov<-example.data$cov$cov
Then we can plot as above
>plot(avg$cov,avg$estimate,xlab="covariate",ylab="Phi",pch=2,ylim=c(0.5,1), main="Model averaged")
> matpoints(avg$cov,avg$lcl,pch=1)
> matpoints(avg$cov,avg$ucl,pch=1)
A couple of remarks at this points:
The above examples dealt with covariate relations to survival (Phi) but there are plenty of cases where time-specific covariate effects on recapture (p) are sensible. The approach is exactly the same, involving p instead of Phi with obvious changes in index numbering. For example the statements
>p.cov<-list(formula=~cov)
>Phi.cov<-list(formula=~cov)
mod.cov.both<-
mark(cap.processed,cap.ddl,model.parameters=list(Phi=Phi.cov,p=p.cov))
would produce a model with covariate effects for both Phi and p. To get the real estimates for p use a statement like:
> real<-subset(get.real(mod.cov.both,"p",se=T),all.diff.index<10)
Once you add this model to the set, redo the model collection to include it
results<-collect.models()
Model averaged estimates are bit trickier
> mod.avg$estimates
includes all the real (all different index) parameters numbered 1 to 90. You can see from the cap.ddl that the p estimates are 46-90 (confusingly this is termed model.index, second column in the ddl)
To get the unique time specific p's you need indices 46-54:
avg.p<-subset(mod.avg$estimates,par.index>45&par.index<55)
Plotting proceeds as above, with obvious modifications.
A second issue arises if interest focuses on prediction of survival or capture at covariate levels other than that observed. It turns out there is no easy (and correct) way to do this for time-specific covariates, since the covariate values are tied directly to the design matrix. A solution to this issue involves artificially treating the time-specific covariates as a series of "individual covariates" (that actually have identical values across individuals) , one for each time period, and using the covariate.predictions() function designed for individual covariates . We will come back to this example because the same approach is used to deal with individual covariates that do vary over time.
MARK GUI
Read in the Covexample.inp data into the MARK session, specifying Live Recaptures, 1 group and 10 occasions. I have already completed this step and created a project with several models; see zip folder containing the MARK project files and input data. You can either start from scratch (blue checked box in menu) or open the project files (yellow folder icon).
To insert a time specific covariate we first need to have define a PIM structure with time effects. I did this by running models with Phi(t) and p(t) . I then retrieved the Phi(t) p(t) model (right click), then go to the main menu and select 'design / reduced'. Keep the default which is 18 parameters .
We need to fill this in to create the covariate design model. The first column will be all 1's, representing the intercept; you can use the fill column tool to specify a partial intercept for 9 rows to make this faster. The time-specific covariate values are entered in the corresponding occasion row in column 2; these can either be typed or copied and pasted from a spreadsheet. Note that MARK will color these cells red but the cells with 1's blue; this is a reminder that these are covariates (and not a mistake, which ordinarily a number not 1 or zero in the design matrix would be).
We then need to build the model for p. Again, add a 9-row intercept column. Then use the partial diagonal tool to specify time effects (8). Finally, delete columns 12-18, since these are not needed. The result should look like this:
We can easily build a Phi(cov)p(.) model, simply by deleting the p time effects (columns 4-11). I create 2 covariate models and 3 "standard" models, the same ones as we did in RMark earlier. Here is the summary table:
Compare this to the model summary in RMark. You should see that the results are virtually the same. The only difference is the AIC value for Phi(t)p(t), which MARK claims has 17 parameters although the design clearly has 18. Model averaged real parameters are obtained by selecting 'output/model averaging/' and checking the top Phi row to get the unique time-specific estimates. I like to also check the 'export to spreadsheet' option, which provides a handy summary. You can confirm that the model-averaged Phi's are virtually identical to the RMark result, in this case because the divergent Phi(t)p(t) model had very low weight.
From this point on, to produce graphs and other summaries one must export these results to a spreadsheet or other program. MARK has a certain number of limited graphing options for parameters of individual models, but is not really designed to handle these sorts of tasks. In my view, this is one area where RMark certainly makes life easier!
Next: Individual covariates