Bird occupancy models

Overview of the application

Context

The predictive performance of any species distribution model (SDM) aiming to accurately predict temporal changes in species distributions should be evaluated using spatio-temporal data. Unfortunately, because high-resolution censuses covering large spatial extents are difficult to repeat, correlative SDMs are usually validated using data that was surveyed during the same time period as the data used for model building. The availability of temporally-replicated surveys is not only important for model evaluation, but also to estimate parameters in the model-building phase. It is possible to use a backward calibration strategy to calibrate the mechanistic parts of hybrid species distribution models. Maximum-likelihood approaches have been used to estimate parameters of species spread models from temporal observations (e.g. Smolik et al., 2010).

Long-term monitoring programs, such as those available for birds or butterflies, are normally conducted with the aim of estimating temporal trends in abundance for common and widespread species. However, they offer a great deal of spatio-temporal data that have the potential to be used to create maps showing changes in species distribution and species richness across the surveyed area (Jiguet et al., 2005). Like all observations, monitoring data are imprecise, subject of sampling variations, measurement errors and biases. Particularly, the amount of information related to the ecological process of interest will likely compromise the validity of monitoring data for dynamic model building and/or evaluation. A monitoring program covering large extents but with large lags between surveyed sites may be good to follow ecological processes causing large-scale distributional change, such as those derived from climate or land-use changes. However, such observational data are likely to fail to capture the species distribution dynamics occurring at scales smaller than the spatial lag. In contrast with coarse regional surveys, monitoring programs specifically focusing on sites where the relevant ecological processes occur potentially contain more useful information.

Objectives

In this application we compare the usefulness of two sources of bird-monitoring data for hybrid SDM parameter estimation and validation:

1. A common breeding bird survey (CBS) for Catalonia.
2. A monitoring program specifically analyzing bird colonization patterns in large wild fires (DINDIS project).

Landscape dynamic modeling

We generated environmental dynamics related to fire and succession processes as follows. First, we took information from available forest maps in order to generate environmental layers to be used as predictors in habitat suitability models. Some of these layers were dynamically updated: the amount of forested areas, stable non-forested open areas, shrubland and non-forested open areas induced by fire. We used the MEDFIRE spatially explicit landscape simulation model (ver. 3, see Model archives) in order to update these dynamic environmental layers from 2000 to 2009. Since we knew the 2000-2009 historical fire sequence from available spatial databases, we used this information to determine areas impacted by fire disturbance. The succession sub-module was used to track post-fire transitions and forest maturation.

Hybrid species distribution modeling

Habitat suitability modeling

Pure-environmental habitat suitability (PEHS)

We fitted generalized linear models with binomial error on CBBA data using both static and dynamic environmental layers as predictors. To avoid departures from equilibrium in the training data, we calibrated these “pure-environmental” habitat suitability (PEHS) models using observations not further away than 10 km from a recorded presence. Starting from a model with all landscape variables, we conducted a backward selection based on Akaike Information Criterion (AIC) (i.e. each step we removed the variable leading to a larger decrease in AIC until all removals produced higher AIC values).

Habitat suitability (HS)

We combined IPO with changes in PEHS to track the effect of landscape dynamics on habitat suitability. That is, we used IPO as an approximation to initial habitat suitability (HS₀ = IPO) and use changes in PEHS to indicate increases or de creases of HS for a given species environmental relationship hypothesis. This leads to the following expression for habitat suitability (HS) at time t > 0:

HS_t = HS_t_-1 + (PEHS_t – PEHS_t_-1)

where PEHS_t is the value of the pure-environmental habitat suitability model at time t. In order to arithmetically operate with the values of the IPO and PEHS, both models have to be based on probabilities. In the unlikely situation where IPO = PEHS₀, the pure environmental model would account for the whole species spatial structure and the equation above simplifies to HS_t = PEHS_t.

Occupancy modeling

Occupancy model details

We modeled species distributional changes using a simple grid-based occupancy model that updates the probability of occupancy of cells taking into account the changes in habitat suitability and the spread capabilities of the target species. As input, the procedure needs initial values for the probability of occupancy of each cell (P₀=IPO), the HS values corresponding to all time steps, and the values of three spread parameters. Although it deals with probability values, the model is completely deterministic. That is, for a given starting probability of occupancy, the model gives the probability of occupancy at the next time step. The probability of occurrence of the species at cell i and time t is given by:

P_i,t = min(HS_i,t, P_i,t_-1+ (1 – P_i,t-₁)·C_i,t)

where C_i,t is the rate of colonization from neighboring populations. In general, decreases in HS automatically reduce the probability of occupancy but increases in HS lead to increases in occupancy only if C_i,t > 0. The value of C is modeled much like in the incidence function model (Hanski, 1994). First, a connectivity value for target cell is determined by using the neighboring occupancy values and a kernel that relates the amount of dispersers with distance. For this, we used the extended negative exponential kernel, which is similar to the negative exponential kernel but is more flexible because it allows fat-tailed distributions. The formula for the connectivity value for cell i at time t is given by:

Where P_j,t_-1 is the probability of occupancy at time t – 1 of cell j, which belongs to a set of neighbors of the focal cell i, d_ij is the distance between cells i and j, and alpha is a parameter indicating the rate of exponential decrease. Large values of alpha allow short-distance colonization only, whereas small alpha values allow long-distance colonization events. Parameter beta further modulates the shape of the kernel. When beta = 1 the kernel is a negative exponential kernel. When beta < 1 (resp. beta > 1) the tail of the kernel will be longer (resp. shorter) than the one of the negative exponential kernel. For computational reasons, a bounding box of limited size is used to delimit W. The size of the bounding box is automatically adjusted using the value of alpha. The rate of colonization C is determined using an increasing sigmoid function of connectivity:

C_i_,_t= S²_i_,_t /(S²_i_,_t+y²)

where y is a parameter that relates connectivity to the rate of colonization: y is equal to the connectivity value that yields C = 0.5. For the same connectivity value, a smaller y value increases the rate of colonization, and hence the spread rate of the species in the model.

Maximum-likelihood estimation of spread parameters

We used monitoring data to estimate the spread parameters of our occupancy model. More specifically, we searched for the parameter estimates that maximized the likelihood of the occupancy model given the presences and absences recorded in the monitoring data (e.g. Smolik et al., 2010). We tested three data sets for maximum likelihood estimation:

1. CBS data for the 2002-07 period
2. DINDIS data for the 2006-07 period
3. A combination of the previous two data sets.

Maximum likelihood optimization was performed using gradient methods and setting initial parameter values to alpha = beta = y

= 1. In order to characterize the resulting kernels, we calculated the distances that accounted for 50% and 95% of kernel values (D50 and D95 respectively). We assessed overall uncertainty in parameter estimation by calculating the determinant of the Hessian matrix at the maximum likelihood value.

Model execution and validation

NOT FINISHED

Model results

NOT FINISHED

References

- Hanski, I. (1994) A practical model of metapopulation dynamics. Journal of Animal Ecology, 63, 151-162.
- Jiguet, F., Julliard, R., Couvet, D. & Petiau, A. (2005) Modeling spatial trends in estimated species richness using breeding bird survey data: a valuable tool in biodiversity assessment. Biodiversity and Conservation, 14, 3305-3324.
- Smolik, M. G., Dullinger, S., Essl, F., Kleinbauer, I., Leitner, M., Peterseil, J., Stadler, L. M. & Vogl, G. (2010) Integrating species distribution models and interacting particle systems to predict the spread of an invasive alien plant. Journal of Biogeography, 37, 411-422.