The maximum likelihood approach implemented in the R secr package provides a great deal of flexibility, and will work well for many basic estimation problems involving spatial CMR data. However, additional flexibility can be gained by using a Bayesian approach, and is especially valuable for:
Incorporation of random effects and other hierarchical structure
Dealing with missing or sparse data in some parts of the study
Incorporating spatial and temporal covariates that vary among the trap locations
Modeling spatial and temporal variation in density
Predicting density in relation to spatial covariates
Data analysis integrated across multiple data types/ studies
The basic approach is an extension of ideas from non-spatial Bayesian CMR analysis, with the following major features:
Abundance is modeled by data augmentation, so that the matrix of individual animals observed once or more is "padded" by an arbitrary number (say, 2 times n.obs), for M total "potential" animals. The problem then becomes estimation of psi, the proportion of the M "animals" that actually occur (analogous to site occupancy)
The M individuals are used to generate a distribution of home range centers in a "state space", often according to a simple process (e.g., bivariate uniform).
The distance from each "individual's" home range center to a trap location is treated as a random effect used as a covariate to model the probability of detection of individual i at detector j, with p0 representing perfect detection and sigma a scale factor for a half-normal detection function
As in occupancy and ordinary CMR with data augmentation, the latent state z[] takes value z[i]=1 if animal i is know to occur (was detected at least once) , so is automatically 1 for the n.ind animals detected one or more times. z[] is stochastically assigned values of 0 or 1 for the remaining M-n.ind individuals, consistent with the updated values of the model parameters (psi, p0, and sigma)
Abundance is then computed N <-sum(z[]) and density as D<-N/area, where area is the area of the state space (usually, the trap area plus +/- a buffer).
Not surprisingly, Bayesian SCR tends to be computationally slower (sometimes hugely so) than the ML methods in secr. Therefore, problems that are readily tractable by secr should probably be solved in that framework. However, I have encounter a number of problems that either require or are more easily conducted in Bayesian SCR, notably the trap- and occasion-specific influence of hogs in white-tailed deer detection at SRS.
Construction of the Bayesian model in JAGS
To understand the Bayesian formulation of spatial CMR, we can start by what amounts to an inversion of the spatial point process model we previously considered. In that formulation,
We have a fixed sampling (trap) grid
There was a fixed home range centroid (this could be extended to multiple animals)
Animals occurred / were detected at points on the grid according to a half-normal detection function of distance from the centroid, and a Poisson or binomial process
Now we have problem with some of these same elements, but others changed:
Fixed sampling grid
Unknown home range centroids for
n animals captured at least once
N-n animals never captured
Animals are detected according to a half normal function of distance from the centroids to the trap locations, but that distance is now a random effect that is not completely observable (and must be modeled)
Simulated example
The simulation and JAGS estimation code here reveals how we think the data arises, and how the model obtains inference on parameters
A grid is specified, e.g., -1000 to 1000 m on the X and Y coordinates with 500 m spacing (25 sampling loci) . A buffer is added (e.g., 200 m) and the buffered grid is referred to as the state space with area A
A density D is specified resulting in abundance N <-A*D for the state space
A spatial scaling parameter sigma is specified, e.g., sigma =500 m
N is "padded" by some multiplier f (e.g., 2) to obtain M<-N*f, the number of "potential capture histories"
psi<-N/M specifies the probability that a given "animal" is in the population (1) or not (0)
z[i], i = 1, M is modeled as z[i]~Bernoulli(psi)
The home range centroid for each animal is modeled as uniform random variables x[i]~Uniform(Xl, Xu), y[i]~Uniform(Yl,Yu), where Xl, Xu, Yl, Yu are the corners of the state space
For trap locations within the state space, the probability of detection for each "animal" i= 1, ... ,M is p(d[i,j])z[i], where p(d[i,j]) is the probability of capture for animal i at trap j, as a function of d[i,j], the distance of that trap to the random home range centroid, and z[i] is the occurrence state of the animal (obviously, animals that are not present have no chance of capture).
The capture probabilities are then used to model captures as Bernoulli or (in the case where capture is homogeneous over k replicates, Binomial outcomes y[i,j]~Binomial(k,p[i,j])
The Bayesian model essentially inverts the simulation process
Priors are placed on psi, sigma, and p0
Captures y[i,j] are observed on the grid; the rows of the capture matrix are padded with zeros (as with non-spatial CMR) by some factor for M rows total
The state variable z[i] is modeled as Bernoulli(psi) outcomes
The (unobservable) home range centroids are modeled as uniform outcomes over the state space.
Probability of capture is modeled as a function distance of trap[j] from home range centroid[i,j] for the M "animals" and j trap locations, conditional on z[i]=1
This probability is used to model the data as binomial outcomes as above.
A simple model (all parameters constant over time, no other sources of heterogeneity) converges quickly on estimates that are close to the true parameter values (D=0.5, sigma=5,p0=0.04. Note in the code (both the simulation and model) that the coordinates are scaled by 100m so that sigma =5 corresponds to 500 m; areas a similarly scaled, so that D is in terms of animals /ha rather than per m2 as in the original units).
> summary(scr0)
JAGS output for model 'scr0.txt', generated by jagsUI.
Estimates based on 3 chains of 10000 iterations,
burn-in = 5000 iterations and thin rate = 1,
yielding 15000 total samples from the joint posterior.
MCMC ran in parallel for 17.429 minutes at time 2016-08-03 08:21:44.
mean sd 2.5% 50% 97.5% overlap0 f Rhat n.eff
D 0.531 0.000 0.531 0.531 0.533 FALSE 1 1.002 4975
sigma 5.010 0.062 4.890 5.009 5.135 FALSE 1 1.001 2630
N 306.080 0.281 306.000 306.000 307.000 FALSE 1 1.002 4975
p0 0.398 0.009 0.380 0.397 0.416 FALSE 1 1.001 3366
psi 0.708 0.022 0.665 0.708 0.749 FALSE 1 1.000 15000
deviance 7781.307 35.784 7712.491 7780.625 7851.596 FALSE 1 1.001 3829
Successful convergence based on Rhat values (all < 1.1).
Rhat is the potential scale reduction factor (at convergence, Rhat=1).
For each parameter, n.eff is a crude measure of effective sample size.
This is the simplest possible spatial CMR model, and there are many variants for the closed case. Important ones include
Home range centroids may not be uniformly distributed over the state space, e.g., there may be habitat or other effects dictating a more complex model
Capture probabilities may vary over replicate occasions or animals (as in non-spatial CMR).
Animal distribution/density or capture probabilities may be influenced by measurable factors that could be included as covariates.
All of the above and more can be implemented, with varying degrees of difficulty, within the JAGS architecture by modifying the basic model.
Possum example
Here I apply JAGS to the possum example. For generality, the JAGS (e.g., to allow modeling of capture variation over occasions) the detection data must be reformatted from secr-type format (matrix of animal by occasions with the entries the trap numbers) to a 3-dimensional array (animal X trap location index X occasion). I have included a small function that does this conversion for the possum data. The re-formatted data then submitted to another R function SCR0.u that sets up the analysis in JAGS, after first using data augmentation to create arrays that have M=300 as the number of "individuals" (observed + potential). The function includes the model script and a method for obtaining initial values for the model parameters (psi, p0, and sigma for this simple model). Abundance (N) and density (D) are obtained as predicted or derived parameters (i.e., as functions of other parameters in the model).
The Bayesian SCR model produces posterior estimates very close to those produced by secr ML methods (~1.8/ha). Note though that the posteriors of psi and N push up against the boundary conditions of 1 and 300, respectively, suggesting that to be on the safe side M should probably be increased (from 300 to say 350 or 400).
Next: Bayesian open models