Although the Robust Design RD was originally developed specifically for the problem of "robust estimation" of abundance and recruitment in open CMR studies, it has many more general applications, some of which we will consider here. These generically revolve around situations in which marked animals can temporarily leave a study area, and thus are unavailable to capture. I contrast this situation to the one we encountered in Week 10, where animals become "available" via recruitment.
The "Robust" model formulation in MARK/RMark is set up to handle these situations. Again, if we are specifically interested in recruitment these models are not appropriate. Instead, use the Robust Design formulation of the Pradel or POPAN models, as discussed earlier.
The key parameters that require understanding in the RD are temporary emigration probabilities, and there are 2 types of these:
Gamma"(i)- the probability that an animal that was in the local population (i.e, available for capture) at i-1 moves off and is unavailable for capture at i (but is alive)
Gamma'(i)- the probability that an animal that was not in the local population at i-1 (and is unavailable for capture) remains outside the local population (and unavailable) at i . Again, it is alive.
Unless specified otherwise, both of these 'movement' conditions are temporary, meaning that the are are reversible, e.g., an animal can be move off with probability Gamma"(i) and remain off with probability Gamma'(i+1) or return with probability 1-Gamma'(i+1). This is the so-called 'temporary emigration' situation illustrated here in the first panel of the attached figure. As indicated, this can be thought of as a type of multistate model, where the states A and B represent "availability" and "non-availability", and Gamma" and Gamma' represent the probability of moving from A to B, or staying in B, respectively (second panel).
Modeling temporary emigration with the Robust Design can get extremely complicated, and requires a firm understanding of data structures, concepts, and model notation. If you are contemplating using the RD for this purpose, you absolutely must first have a firm grasp of the Chapter covering this material in the MARK book. From there, you should both read the up-to-date literature as well as practice on many worked examples. I will walk you through a reasonably simple RD -TE example (built into the RMark package) and a few models to give you a feel. Hang on.
The example is simply titled 'robust' and it really makes no pretense to be real data. Why? Because good RD data is so damned hard to find, that's why, and we know simulated data will work!.
The example data are loaded first by making sure we have the RMark package (use either "library" or "require") and then invoking a data command:
> #ROBUST example modified from RMark html help
> require(RMark)
> #load data example
>
> data(robust)
The first few lines of the data are
ch freq
1 000000111011101 1
2 000000111111000 1
3 000001000000000 2
4 000001001110100 1
5 000001010000000 1
6 000001100000010 1
7 000001100111100 1
8 000001101111100 1
9 000001110111000 1
10 000010001101111 1
11 000010110111110 1
12 000011000000000 2
which looks just like ordinary CMR histories. We are going to "robustify" this by telling RMark that some of our periods are open (over which stuff like mortality and emigration can happen) but others are closed (over which this stuff can't happen, but the population is closed so that we can estimate abundance). The MARK GUI has a tool to do this for you, but it's easy enough to do manually in RMark. There are 15 occasions, so we need to define 14 intervals:
> #define time intervals (closed if =0)
> time.intervals=c(0, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0)
>
If you look at this carefully you should conclude that there are 4 open intervals (2-3, 4-5, 8-9, and 13-14, with the rest closed. So how many survival estimates should you expect in a time-specific model?
Ok, let's build some models. I am going to start with a simple model that assumes 1) that survival and temporary emigration are constant over time , and 2) that capture probabilities are time specific but the same as recapture (so 'shared') . We can make p not equal c by getting rid of the 'share' if we want:
>#build some models
>#basic constant S , gamma, and gamma'
>p.time.session=list(formula=~-1+session:time,share=TRUE)
>model.0=mark(data = robust, model = "Robust",
+ time.intervals=time.intervals,
+ model.parameters=list(p=p.time.session),threads=2)
A second model modifies the one above to allow 1) S to vary by time and 2) emigration to vary by time but according to the random emigration model. As noted in the MARK chapter this model basically says that you're as likely to leave as to enter the population, which means that gamma' and gamma'' are "shared" over time occasions.
> #
> # Random emigration, p=c varies by time and session, S by time
> #
> S.time=list(formula=~time)
> p.time.session=list(formula=~-1+session:time,share=TRUE)
> GammaDoublePrime.random=list(formula=~time,share=TRUE)
> model.1=mark(data = robust, model = "Robust",
+ time.intervals=time.intervals,
+ model.parameters=list(S=S.time,
+ GammaDoublePrime=GammaDoublePrime.random,p=p.time.session),threads=2)
Pause: what is this 'threads' argument in the mark command? It is an option (so, not needed) to tell MARK how many CPUs to use in processing, in this case 2. The default is -1, meaning 1 CPU will remain idle.
A third model takes the above model and makes S constant
> # Random emigration, p=c varies by time and session, S constant
> #
> S.dot=list(formula=~1)
> p.time.session=list(formula=~-1+session:time,share=TRUE)
> GammaDoublePrime.random=list(formula=~time,share=TRUE)
> model.2=mark(data = robust, model = "Robust",
+ time.intervals=time.intervals,
+ model.parameters=list(S=S.dot,
+ GammaDoublePrime=GammaDoublePrime.random,p=p.time.session),threads=2)
Finally, we can summarize our models with the collect.models() function
> collect.models()
model npar AICc DeltaAICc weight Deviance
2 S(~time)Gamma''(~time)Gamma'()p(~-1 + session:time)c()N(~session) 28 -18075.11 0.00000 9.989788e-01 3389.841
3 S(~1)Gamma''(~time)Gamma'()p(~-1 + session:time)c()N(~session) 25 -18061.34 13.77165 1.021127e-03 3409.668
1 S(~1)Gamma''(~1)Gamma'(~1)p(~-1 + session:time)c()N(~session) 23 -18040.13 34.98219 2.530868e-08 3434.912
This suggests that the second (most complex) model has the best data support. We can look at its parameters, focusing on the survival and emigration rates (first 8 rows of output):
> model.1$results$real[1:8,]
estimate se lcl ucl fixed note
S g1 c1 c1 a0 o1 t1 0.8857025 0.0131150 8.573747e-01 0.9090010
S g1 c1 c1 a1 o1 t2 0.8706091 0.0155836 8.368825e-01 0.8982105
S g1 c1 c1 a2 o1 t3 0.7698855 0.0217813 7.244595e-01 0.8097888
S g1 c1 c1 a3 o1 t4 0.8033136 6.3631895 2.137610e-34 1.0000000
Gamma'' g1 c1 c1 a0 o1 t1 0.0884428 0.0255535 4.954600e-02 0.1529618
Gamma'' g1 c1 c1 a1 o1 t2 0.1198046 0.0154473 9.267550e-02 0.1535315
Gamma'' g1 c1 c1 a2 o1 t3 0.1509028 0.0215888 1.132688e-01 0.1982454
Gamma'' g1 c1 c1 a3 o1 t4 0.1673500 6.5954991 1.030193e-41 1.0000000
*****************************************************
The above commands are contained in a script file, here.
There are a huge number of directions we can go from here. For instance, I have totally left out the finite mixture models that allow one to model individual heterogeneity in capture probabilities. These can be built in a way that is directly analogous to what we did in closed CMR.