Heterogeneity can occur in detection probabilties over either time or space for a variety of reasons. For example, we have already seen how to model generic heterogeneity over time by a time effect; we could just as easily have model detection as a function of a time-specific covariate like whether or ambient noise.
Detection heterogeneity can also occur over space, so among sites. One reason for this could be variations in habitat conditions that render animals more detectable in some habitats than others. Another could be that the abundance (density) of animals varies among sites. Because occupancy detection is based on detection of 1 or more animals, all other things being equal it is more likely to miss animals on sites with few animals than those with many animals. Depending on which type of heterogeneity is thought to operate, and what is being recorded in the study, we can approach modeling site heterogeneity in a variety of ways:
Generic (unspecified) heterogeneity among sites in detection may be modeled using either finite mixture models (an approach implemented in MARK) or random effects models (e.g., as implemented in WinBUGS). We will only briefly touch on these approaches in this workshop.
Covariates can be used to model heterogeneity, but only if covariates both explain significant heterogeneity, and are measured on all or a sufficient number of sites. Because covariate modeling for detection works just like any other site-specific covariate (for predicting either occupancy or detection) we have already seen generically how to do this and won't consider it further.
Heterogeneity can be modeled as a function of abundance, which is the approach taken in the Royle models. In turn there are 2 basic approaches:
Abundance is not observed directly but is inferred via the detection histories
Abundance is observed, usually via incomplete counts.
In either of the above cases, abundance is now a parameter in the model, and occupancy reduces to a mapping of the abundance state on each site to a binary variable I(N>0).
Royle- Nichols model
The Royle-Nichols model (J. A. Royle and J. D. Nichols. Estimating abundance from repeated presence-absence data or point counts. Ecology, 84(3):777-790, 2003) is based on the fairly straightforward idea that in order for detection to not occur on a site, none of the N i animals present can be detected. If each animal has a probability r of detection, then overall detection probability is given by:
p = 1- (1-r)Ni
The data for this model are exactly the same as for the standard occupancy model, but the model operates under the above transformation relating abundance to detection via the parameter r. The model is implemented in the function occuRN(), which has syntax just like occu() except the fundamental parameters are r and N instead of p and Ψ . Like occu() we can model these parameters via covariates. Below we read in the data and perform parallel Royle-Nichols analyses, corresponding to the first few models we fit in occu(). Note that r is on the logit scale (like p was) but now N is on the log scale. Following each model fit we can produce back transformed estimates (or predictions if covariates are involved). There is also a provision for an empirical Bayes estimate for the probability distribution of abundance on each site.
rm(list=ls())
> #install unmarked if not installed
> #install.packages("unmarked")
> #load library
> library(unmarked)
> #set here the directory where your data files are located
> #data_dir<-"C:/Users/mike/Dropbox/company/pacblackduck/workshop/occupancy"
> #data_dir<-"C:/Documents and Settings/conroy/My Documents/Dropbox/company/pacblackduck/workshop/occupancy"
> data_dir<-"C:/mydir"
> setwd(data_dir)
>
> data<-read.csv("blgr.csv")
> #detection data rows are sites columns are detection replicates
> y<-data[,2:4]
> n<-nrow(data)
> #site level (individual) covariates
> blgr.site<-data[,5:9]
>
>
>
> #create time factor and use as covariate
> #observation level (time specific) covariates
> time<-as.factor(rep(c(1,2,3),n))
> blgr.obs<-data.frame(time)
>
> #put everything together in unmarked data frame
> #note that covariate can come from separate files
> blgr <- unmarkedFrameOccu(y = y, siteCovs = blgr.site,obsCovs=blgr.obs)
> #summary of unmarked data frame
> summary(blgr)
unmarkedFrame Object
41 sites
Maximum number of observations per site: 3
Mean number of observations per site: 3
Sites with at least one detection: 33
Tabulation of y observations:
0 1 <NA>
63 60 0
Site-level covariates:
field.size bqi crop.hist crop1 crop2
Min. : 6.40 Min. :0.0000 crop :28 Min. :0.0 Min. :0.0
1st Qu.: 9.60 1st Qu.:0.0000 grass: 4 1st Qu.:0.0 1st Qu.:0.0
Median :12.70 Median :0.0000 mixed: 4 Median :1.0 Median :0.0
Mean :16.16 Mean :0.3415 NA's : 5 Mean :0.7 Mean :0.2
3rd Qu.:18.80 3rd Qu.:1.0000 3rd Qu.:1.0 3rd Qu.:0.0
Max. :53.90 Max. :1.0000 Max. :1.0 Max. :1.0
NA's :1 NA's :1
Observation-level covariates:
time
1:41
2:41
3:41
>
> #CREATING MODELS
> #Royle-Nichols model with no covariates
> rn1<-occuRN(~1 ~1,blgr)
>
> #back transformations
> backTransform(rn1,'det')
Backtransformed linear combination(s) of Detection estimate(s)
Estimate SE LinComb (Intercept)
0.22 0.114 -1.27 1
Transformation: logistic
> backTransform(rn1,"state")
Backtransformed linear combination(s) of Abundance estimate(s)
Estimate SE LinComb (Intercept)
3.05 1.62 1.12 1
Transformation: exp
>
>
> # Empirical Bayes estimates of abundance at each site
> re <- ranef(rn1)
Warning message:
In .local(object, ...) :
You did not specify K, the maximum value of N, so it was set to 50
> plot(re)
>
> #create some more occupancy models
> #constant detection, constant occupancy
> rn1<-occuRN(~1 ~1,blgr)
> backTransform(rn1,"state")
Backtransformed linear combination(s) of Abundance estimate(s)
Estimate SE LinComb (Intercept)
3.05 1.62 1.12 1
Transformation: exp
>
> #time specific detection, constant occupancy
> rn2<-occuRN(~time ~1,blgr)
> backTransform(rn2,"state")
Backtransformed linear combination(s) of Abundance estimate(s)
Estimate SE LinComb (Intercept)
2.99 1.54 1.09 1
Transformation: exp
>
> #constant detection, abundance predicted by bqi
> rn3<-occuRN(~1 ~bqi,blgr)
> backTransform(linearComb(rn3, c(1, 0), type="state"))
Backtransformed linear combination(s) of Abundance estimate(s)
Estimate SE LinComb (Intercept) bqi
3.46 1.94 1.24 1 0
Transformation: exp
>
Royle count model
The Royle count model (J. A Royle. N-mixture models for estimating population size from spatially replicated counts. Biometrics , 60(1):108-115, 2004) uses repeated counts and mixture models to estimate abundance. Because it uses counts rather than simply detection- non detection data, the method relies on few assumptions about the distributional pattern of the surveyed population, although the core of the model (the mathematical relationship between p and r through N) remains the same. The mechanics of the Royle count method are similar to those of the Royle-Nichos method, but require specifying a unmarked PCount data frame,which reads the actual counts; in this example the counts are contained in columns 10-12 of the input file.
> ###############COUNT MODELS
> rm(list=ls())
> data<-read.csv("blgr.csv")
> #detection data rows are sites columns are detection replicates
> counts<-data[,10:12]
> n<-nrow(data)
> #site level (individual) covariates
> blgr_count.site<-data[,5:9]
>
> #create time factor and use as covariate
> #observation level (time specific) covariates
> time<-as.factor(rep(c(1,2,3),n))
> blgr.obs<-data.frame(time)
>
> #observation level (time specific) covariates
> blgr_count.obs<-data.frame(time)
>
>
> #put everything together in unmarked data frame
> #note that covariate can come from separate files
> blgr_count <- unmarkedFramePCount(y = counts, siteCovs = blgr_count.site,obsCovs=blgr_count.obs)
> summary(blgr_count)
unmarkedFrame Object
41 sites
Maximum number of observations per site: 3
Mean number of observations per site: 3
Sites with at least one detection: 33
Tabulation of y observations:
0 1 2 3 4 <NA>
62 29 19 8 5 0
Site-level covariates:
field.size bqi crop.hist crop1 crop2
Min. : 6.40 Min. :0.0000 crop :28 Min. :0.0 Min. :0.0
1st Qu.: 9.60 1st Qu.:0.0000 grass: 4 1st Qu.:0.0 1st Qu.:0.0
Median :12.70 Median :0.0000 mixed: 4 Median :1.0 Median :0.0
Mean :16.16 Mean :0.3415 NA's : 5 Mean :0.7 Mean :0.2
3rd Qu.:18.80 3rd Qu.:1.0000 3rd Qu.:1.0 3rd Qu.:0.0
Max. :53.90 Max. :1.0000 Max. :1.0 Max. :1.0
NA's :1 NA's :1
Observation-level covariates:
time
1:41
2:41
3:41
>
Once the data frame is set, analyses proceed in much the same fashion as before, but now using the command pcount():
> #CREATING MODELS
> #Royle count model with no covariates
> pc1<-pcount(~1 ~1,blgr_count)
Warning message:
In pcount(~1 ~ 1, blgr_count) : K was not specified and was set to 104.
> backTransform(pc1,"state")
Backtransformed linear combination(s) of Abundance estimate(s)
Estimate SE LinComb (Intercept)
4.44 1.69 1.49 1
Transformation: exp
> #time specific
> pc2<-pcount(~time ~1,blgr_count)
Warning message:
In pcount(~time ~ 1, blgr_count) : K was not specified and was set to 104.
> backTransform(pc2,"state")
Backtransformed linear combination(s) of Abundance estimate(s)
Estimate SE LinComb (Intercept)
4.41 1.67 1.48 1
Transformation: exp
>
The approach provides very similar estimates of per-animal detection r, but somewhat higher (but with CIs overlapping) estimates of N for the series of comparable models with and without covariates.
R script to conduct the above analyses