Julien Mainguy, fisheries biologist, Ph.D. Rafael de Andrade Moral, Associate Professor
Québec, Québec, Canada Department of Mathematics and Statistics
Maynooth University, Maynooth, Ireland
[Last modified: 2024-11-09]
Note: the ideas behind the content of this website have been published in Mainguy & Moral (2021) and Mainguy et al. (2024).
Several methods exist to estimate the instantaneous mortality (Z) from age-frequency data (i.e., catch curve) to then back-transform Z to total annual mortality (A) for fisheries management purposes. The most-commonly used methods for the analysis of a catch curve can roughly be divided into three main approaches:
1) Log-transformation of the counts to analyze them in a linear modelling framework with a Gaussian error structure:
The linear regression (LR) method of Ricker (1975);
The weigthed linear regression (WLR) method of Maceina & Bettoli (1998), i.e. a weighted version of the LR method;
2) A maximum-likelihood approach for the estimation of total annual survival (S) from which A is deduced since A = 1 - S:
The well-known method of Chapman & Robson (1960), hereafter the CR method;
The CR method with Correction for Bias (CRCB) of Smith et al. (2012) that adjusts the estimated SE with a correction factor to account for overdispersion (variance > mean) as this analytical situation is often encountered when modelling count data in ecological studies (Richards, 2008);
3) A Generalized Linear Modelling (GLM) framework based on the influential work of Millar (2015):
The "Poisson Model" (PM) of Nelson (2019) which is based on a GLM fitted with the Poisson distribution family that has been designed for the specific modelling of counts and which also applies a correction factor to account for overdispersion, similarly as for the CRCB method;
The GLM-Based (GB) method of Mainguy & Moral (2021) that makes use of some of the available extensions of the Poisson distribution family to account for how the counts are dispersed. Basically, when counts are equidispersed (variance = mean), a GLMPoisson should be used since it has been designed for this specific analytical situation regarding count data dispersion (Hilbe, 2014). This "classical" distribution family can also often appropriately handle underdispersed (variance < mean) counts (Zuur et al., 2009). In many situations, however, counts are either more prominantly over- or underdispersed and in such cases, different extensions of the Poisson distribution family have been developed to better account for this. Each of these extensions has an additional, specific dispersion parameter that models the variance as a function of the mean, thus generally allowing to better estimate the central tendency and, as importantly, its associated variance.
Anadromous Nunavik Arctic charr (Salvelinus alpinus) during the upstream migration in the Aipparusik (Bérard) River, Tasiujaq, Québec, Canada. Catch-curve analyses conducted by Mainguy & Moral (2021) revealed that the counts in the descending limb of the catch curve tended to be or were underdispersed for this species, as opposed to walleye (Sander vitreus) counts from inland lakes located further south that were overdispersed instead in most instances.
Photo credit: Pascal Ouellet, MELCCFP
Regardless of the considered method, the central idea is to estimate the rate at which counts decrease with age on a logarithmic scale to determine the absolute value of the age coefficient (i.e., the slope) from this relationship which corresponds to Z. This analysis is only performed for the "descending limb" of the investigated catch curve, as most of the time the first age classes are only partially recruited by the fishing gear. The related assumptions for such analyses, which are quite strict and often only partially met at best can be briefly described as follow:
the population being sampled is closed to emigration and immigration;
recruitment is constant or at least varies without a trend over time;
Z is constant across ages and years for the age classes on the descending limb of the catch curve;
the probability of capture (i.e., vulnerability to the fishing gear) follows the same conditions as for Z;
the sample is not biased regarding any specific age-group.
More details about these assumptions can be found elsewhere (Smith et al., 2012; Ogle, 2016). Essentially, catch-curve analyses are known to be biased to some degree (Murphy, 1997; Nelson, 2019) and as such, often somehow constitute "the best of a bad job" to estimate mortality rates, as better alternatives exist - although these are often far more effort-intensive, such as the use of capture-mark-recapture (CMR) analyses using different individual-based monitoring approaches (e.g., Bradshaw et al. 2007; Caza-Allard et al., 2021; Pierce et al., 2021). However, when one recourses to catch-curve analyses, Mainguy & Moral (2021) recommended that: "[...] efforts should be made to use the least biased method for the estimation of Z and its associate error given the known limitations of this commonly used approach".
When monitoring an exploited fish stock over time in a fisheries management context, one can estimate the sample-specific Z of a series of surveys that are spaced in time to determine whether A has remained apparentely stable or not. As Z is estimated from the slope describing the rate at which counts decrease with age, relatively similar slopes as time progresses across consecutive samples would thus likely indicate an apparent stability over time for A (see Miranda & Bettoli, 2007). On the other hand, if the slopes gradually become steeper with time, this would instead be indicative of an increasing mortality rate due to a gradually changing age structure of the fish population being monitored. Therefore, even if biased to some degree, catch-curve analyses will likely be able to capture such a signal by statistically detecting a change in the descending limb and if so, assess in which direction to inform about the underlying trend in mortality rates.
Below is a fictive example to illustrate the general principle behind a catch-curve analysis. In this figure, the empty (white) vertical bars represent the age structure of the fictive population as a whole found in a given lake and from which we would like to estimate A from Z for a single sample, where total annual survival (S) is exp(-Z) and thus A = 1 - exp(-Z), since A + S = 1 (see Miranda & Bettoli, 2007). Although not discussed here, Z can be further decomposed in M (natural mortality) and F (fishing mortality). As such, in a population where F = 0, the estimated Z would directly reflect M. The filled (black) vertical bars in the figure below represent all the fish that were sampled by the fishing gear. In this fictive case, it is as if we would have captured all the fish that could have possibly been sampled. For instance, even though there were N = 500 fish of 0+ age, only n = 50 were captured by being probably the largest ones in this cohort, whereas all the other were possibly too small to be sampled by the smallest mesh size of a gill net for instance. The black bars below are thus what we "know", keeping in mind that the white bars reflect the fictive population that we are sampling but for which we will never know the true frequency distribution of the different age classes. In real-life situation, we're thus only dealing with the black bars below, i.e. the age structure of our sample.
From this catch curve, we can tell that the highest number of sampled fish (i.e., the mode) was for the 2+ age class. We can also tell that a few 2+ fish were not caught by the fishing gear, whereas starting at 3+ and older the fish present in the population have all been captured. The 2+ age class corresponds to the "Peak" and has commonly been used as the starting point of the descending limb of a the catch curve to estimate Z, especially with the LR and WLR methods, and originally with the CR method as well. However, Pauly (1984) recommended using the age class following the "Peak", which is often referred as "Peak Plus" to avoid having an over- or under-representation of the first age class included in the analysis as the "fully-recruited age class by the fishing gear", which Smith et al. (2012) have later also recommended for the CR and CRCB methods. This would be the case here, with less fish than there actually are for the 2+ age class. Let's keep in mind again that we have no information about the true age structure of all the fish found in this lake for our studied species of interest. We simply hope that with a random sample of this population, we're capturing the same signal regarding Z under the assumption that the specimens are captured proportionally to their frequencies within the population being studied.
Smith et al. (2012) have made four noteworthy recommendations regarding catch-curve analyses:
(1) the CR method should be preferred using the "Peak Plus" criterion;
(2) the CR variance estimator should be corrected for overdispersion (i.e., the CRCB method);
(3) the (unweighted) LR method should no longer be used;
(4) the Heincke (1913) method generally performs poorly and should in most cases be disregarded.
Although the LR method is described as an approach to estimate Z and A in fisheries books (Miranda & Bettoli 2007; Allen & Hightower, 2010; Ogle, 2016), can also be easily estimated with fisheries-oriented R packages such as fishmethods (Nelson, 2023) and FSA (Ogle et al., 2022), and is therefore still sometimes used nowadays (e.g., Anderson et al., 2022; Yamamoto & Katayama, 2022), we believe that it should be abandonned. An exception for its use could however arise when historical estimates that were obtained with this sub-optimal method would need to be compared on the same basis to more recent estimates without having access to the original data. This being said, the main message is that count data should not be log-transformed in the first place for statistical analyses because appropriate error structures are available for discrete variables (O'Hara & Kotze, 2010; Hilbe, 2014). If one choose to still use the LR method for catch-curve analyses because it's somehow "fast and easy" to be performed, the estimation of Z will often be unnecessarily downwardly biased when compared to the better performing CR method for instance (Dunn et al., 2002; Smith et al., 2012). Accordingly, Mainguy & Moral (2021) found with simulations under different count dispersion scenarios, including the most-likely encountered one with ecological data which is referred to as "quadratic overdispersion", i.e., variance > mean as the variance is a quadratic function of the mean (e.g., Lindén & Mäntyniemi, 2011), that using the LR method for catch-curve analyses will substantially underestimate the "true" value of Z in all cases. Using instead a GLM with the proper error structure (e.g., negative binomial type II: NB2) for quadratically overdispersed data would provide the closest estimation to the "true" Z value used. Furthermore, zero counts cannot be log-transformed and are thus often ignored with the LR method and even worst, the oldest ages past the first encountered age class with a zero count are also sometimes discarded when n < 5 for instance (e.g., Allen & Hightower, 2010), creating additional unnecessary bias in the catch-curve analyses by not maximizing the available biological information from the sample. This is because not capturing a 10-year-old specimen in a sample is as much informative as capturing four 9-year-olds and one 11-year-old. Moreover, to be statistically valid, a linear model obtained from the LR method requires that (a) its standardized residuals are sufficiently normally distributed according to a test such as the one of Shapiro & Wilk (1965), and (b) homoscedasticity is also respected according to a test such as the one of Breusch & Pagan (1979), knowing that such parametric test assumptions are not always met and often not even actually verified. The bottom line is that there is no longer a need to try to fit the response variable (i.e., count data = discrete variable) with a normal error structure through diverse data transformations; see for instance the relevant section "Pitfalls in the Application of Statistics" of Steel et al. (2013). See also St-Pierre et al. (2018) who have shown that the use of error structures that were designed to model count data in ecology is generally preferable to data transformation and as such, the first approach (i.e., model reformation) is being increasingly used in recent years at the detriment of the other (i.e., data transformation). Furthermore, although one may be able to estimate Z and its associated SE according to a method or another, such estimation become more useful when one can compare it to other estimates from a statistical standpoint. When 95% confidence intervals are adequately estimated to then be compared, they will correctly indicate statistical support for a difference when they don't overlap. However, the story becomes much more difficult to interpret when they partially overlap, which is a commonly-encountered situation in ecology. When this occurs, a statistically-supported difference may still prevail (Schenker & Gentleman, 2001) and as such, comparing 95% CIs does not actually constitute a rigourous statistical test and even more so when more than two samples are compared. We'll come back later about statistical inferences regarding the comparison of two or more estimates for situations in which either a categorical or a continuous independent variable is considered. We will for now begin with the estimation of A and its associated uncertainty from a single fish sample.
Towards more adequate catch-curve analyses
Millar (2015) reported that the CR method is actually yielding similar results to those of a GLM fitted with a Poisson family distribution and a log link. As mentioned above, to generate valid statistical inferences such GLMPoisson requires that the count data are sufficiently equidispersed (Hilbe, 2014), which sometimes do occur but not always with ecological data. The PM of Nelson (2019) was built on this finding and applies a similar correction to that of the CRCB method to take into account the fact that the age-frequency data are often overdispersed. This approach clearly constitutes an improvement over the previous methods discussed above, as one can for instance use a GLM framework to also add covariates that may improve the fit to the observed count data that are being modelled. A downside however of the PM approach, as for the CRCB method, is that the correction for overdispersion bias is applied once the model has been generated to mainly correct the SE and thus, not as part of the modelling of the data per se. It nevertheless provides a better estimation of the variance, but remains of little practical use for statistical inferences other than allowing to compare the (more often larger and adequate) derived 95% CI obtained through the use of a correction factor.
Mainguy & Moral (2021) and Mainguy et al. (2024) have rather proposed to make use of different existing extensions of the Poisson distribution family that have actually been specifically designed to model any over- and underdispersion in the counts being analyzed, when present. Such extensions often yield more accurate estimations for the central tendency (Z) and, as importantly, the related uncertainty (SE) obtained. As discussed above, each Poisson extension has an error-spectific dispersion parameter to model the variance which is actually described as a function of the mean (see page 1453 of the Appendix of Mainguy & Moral 2021). As such, one extension may better perform than another one under some specific conditions, such as the Generalized Poisson (GP; Consul, 1989) and negative binomial type I (NB1; Hilbe, 2014) that will likely be more adequate when the variance is a linear function of the mean when compared to the NB2 and CMP that better model quadratically overdispersed counts (Mainguy & Moral, 2021) . See Ver Hoef & Boveng (2007) for a convincing (non-fisheries) example demonstrating that using the appropriate error structure when modelling ecological count data is crucial to avoid sometimes quite substantial biases in the resulting predictions. Overall, the GB method allows to better statistically assess for differences between groups (i.e., categorical variable - see Mainguy & Moral, 2021 for empirical examples regarding walleye and Arctic charr) or associations with a continuous variable such as YEAR for a temporal trend (as demonstrated further below). A strenght too of the proposed approach is the capacity to test for model adequacy with the hnp package (Moral et al., 2017) prior to identify the top-retained candidate model(s) in a second step amongst those found to be adequate through an information-theoretic approach (Burnham et al., 2011). One can also calculate the deviance explained (D 2, Guisan & Zimmermann, 2000) of a log-linear model, which is an analogue to the coefficient of determination (R 2) for the linear model case, to determine how close to a "perfect" (i.e., saturated) model the one considered may be, thus approximately quantifying the variation (i.e., deviance) explained, but for this dataset only. An adjusted version of D 2 to account for both the number of parameters in a model and the sample size (n) being analyzed is also available (Guisan & Zimmermann, 2000) following Weisberg (1980). The idea is to avoid overfitting when this "calibration" metric is used, which is a known drawback of R 2-metric (Babyak, 2004) that also applies to D 2. It is important to note that a high D 2 (or adjusted D2) does not necesseraly indicate that the considered model is adequate, as model explanatory power and adequacy are two different things. An adequate model that is the best supported among a set of candidate models based on the AICc could for instance have a D 2 of just 4.6% and as such, any statement about the model predictions would need to be tone down due to its low approximate explanatory power.
Walleye catch curve example analyzed with the GB method of Mainguy & Moral (2021)
Walleye (Sander vitreus). Photo credit: Hugo Mercille, MELCCFP
From an ongoing provincial monitoring program, a total of 703 walleyes were sampled in 2012 in the Baskatong reservoir. All the individuals were aged based on their otoliths. We will refer below to the dataset containing these individual ages (1 line: 1 individual and its age) as saviBASK12. The reason for using savi comes from the scientific name for walleye: Sander vitreus. To first examine the corresponding age-frequency data based on the variable AGE in the dataset, one can use the summarytools package (Comtois, 2022) and its freq() function that is useful to regroup count data according to a specified "category" which is each AGE class in our case:
library(summarytools)
freq(saviBASK12$AGE)
Frequencies
saviBASK12$AGE
Type: Integer
Freq % Valid % Valid Cum. % Total % Total Cum.
----------- ------ --------- -------------- --------- --------------
0 38 5.41 5.41 5.41 5.41
1 281 39.97 45.38 39.97 45.38 ## "Peak"
2 151 21.48 66.86 21.48 66.86 ## "Peak Plus"
3 56 7.97 74.82 7.97 74.82
4 117 16.64 91.47 16.64 91.47
5 10 1.42 92.89 1.42 92.89
6 12 1.71 94.59 1.71 94.59
7 21 2.99 97.58 2.99 97.58
8 8 1.14 98.72 1.14 98.72
9 2 0.28 99.00 0.28 99.00
10 2 0.28 99.29 0.28 99.29
11 1 0.14 99.43 0.14 99.43
12 2 0.28 99.72 0.28 99.72
14 1 0.14 99.86 0.14 99.86
15 1 0.14 100.00 0.14 100.00 ## Oldest age
<NA> 0 0.00 100.00
Total 703 100.00 100.00 100.00 100.00 ## Sample size
One can already identify that the "Peak" criterion corresponds to the 1+ age class and thus "Peak Plus" to the 2+ age class. We can also observe quite a lot of variation from one age class to the next, suggesting that the data may be overdispersed. Note too that the 13+ age class has not been sampled. Finally, the oldest age that was sampled was 15+ (n = 1 old walleye) among the n = 703 sampled walleyes.
Using the following PeakPlus function to identify the mode, one can extract the "Peak Plus" age directly from the data and store it for later use in a R object that we will refer to as PP_saviBASK12:
PeakPlus <- function()
{
uniqv <- unique(saviBASK12$AGE)
Peak <- uniqv[which.max(tabulate(match(saviBASK12$AGE,uniqv)))]
Peak + 1
}
PP_saviBASK12 <- PeakPlus()
PP_saviBASK12
[1] 2
The oldest age, which is also required to analyze the descending limb of the catch curve, can be simply obtained with the max() function and we will also store the result in a R object for later use:
age_max_saviBASK12 <- max(na.omit(saviBASK12$AGE))
age_max_saviBASK12
[1] 15
Note that ideally, no individual with missing age information should be present in the dataset used for catch-curve analyses.
The na.omit() function above is used to prevent problems when trying to identify the oldest age in presence of NA in the dataset.
Instead of trying to manipulate the individual data in R or Excel to transform them into age-frequency data as they are required to perform catch-curve analyses, we will instead make use of the fishmethods package and its agesurv() function that will do just this automatically. By specifying the values to be used for "Peak Plus" with the argument full= and the oldest age with the argument last=, this function will perform catch-curve analyses for the estimation of Z and its SE according to several methods. Note that the total annual survival (S) estimate can also be directly estimated, but obtaining Z on the log scale is more analytically useful and this, especially for the estimation of the uncertainty when dealing with exponential families. See for instance the following relevant explanations provided by Gavin Simpson about this on his blog:
https://fromthebottomoftheheap.net/2018/12/10/confidence-intervals-for-glms/
Now that we are convinced to focus on Z, the results will be stored in a R object referred to below as FM_saviBASK12 (once again for later used):
library(fishmethods)
FM_saviBASK12 <- agesurv(
type = 1, ## indicates that individual data are being used
age = saviBASK12$AGE,
full = PP_saviBASK12, ## corresponds to the first age being considered, i.e. "Peak Plus"
last = age_max_saviBASK12, ## corresponds to the oldest age that was sampled
estimate = "z") ## we ask to only estimate Z
FM_saviBASK12
$results
Method Parameter Estimate SE
1 Linear Regression Z 0.405 0.050
2 Weighted Linear Regression Z 0.518 0.075
3 Heincke Z 0.500 0.026
4 Chapman-Robson Z 0.492 0.025
5 Chapman-Robson CB Z 0.491 0.077
6 Poisson Model Z 0.487 0.089
7 Random-Intercept Poisson Model Z NA NA ## not applicable to our case, but provided by default
$data
age number
3 2 151
4 3 56
5 4 117
6 5 10
7 6 12
8 7 21
9 8 8
10 9 2
11 10 2
12 11 1
13 12 2
14 14 1
15 15 1
You can already observe here that the LR method (i.e., Linear Regression) has yielded not surprisingly the lowest estimate of Z when compared to the other methods, very likely underestimating Z as discussed above. You can also observe that the estimates obtained from the different methods have all omitted the fact that the 13+ age class was not sampled, whereas a zero should have been considered for age = 13. Note that at this point, we can directly use FM_saviBASK12$data since these age-frequency data are now stored as a data frame.
Once the individual data have been rearranged in an age-frequency format, the chapmanRobson() function of the FSA package can also be used to estimate Z, either with the CR or CRCB method. Using the catchCurve() function of the same package will instead provide the estimations according to the LR and WRL methods (when the argument weight=TRUE is used in the latter case), obtaining identical estimates as those provided by fishmethods. This being said, let's recall here that the LR method is no longer recommended and that although the WLR method generally yields more accurate estimates, this method has been characterized as being "completely ad hoc" by Smith et al. (2012). In simulations of catch-curve analyses, Mainguy & Moral (2021) have shown that the WLR method underestimated Z in a marked manner under the most encountered scenario of overdispersed age-frequency data, af for the LR method discussed earlier, whereas the NB2 and mean-parameterized Conway-Maxwell-Poisson (CMP; Huang, 2017) that both have a dispersion parameter to model the "extra" variance did far better. Let's recall here that small differences on the log (link) scale may translate to large differences on the response scale. Below are the actual estimates obtained for the LR and WLR methods as opposed to the more adequate ones that were yielded by a GLM fitted with either the NB2 or CMP when Z was known (0.500). This "true" Z value should have led to A = (1 - exp(-0.500))*100 = 39.3% for the estimated total annual mortality (A). We will also add below the estimations of A from the CR method and GLM_Poisson (from the GB method in the latter case and thus, not the PM, as it closely resembles the CR method, Millar 2015) too and this, when we are only looking at the central tendency. The values of Z presented below were obtained from 100 random samples of the same known simulated population with Z = 0.500 from which the mean of all the estimations obtained was used (see Mainguy & Moral 2021).
LR: A = (1 - exp(-0.306)) * 100 = 26.4% ## Very poor due to an important underestimation bias.
WLR: A = (1 - exp(-0.359)) * 100 = 30.2% ## Poor due to a nonnegligible underestimation bias.
CR: A = (1 - exp(-0.442)) * 100 = 35.7% ## better than the previous two, but not quite there.
Poisson: A = (1 - exp(-0.429)) * 100 = 34.9% ## similar to the CR method, but actually slightly more underbiased.
NB2: A = (1 - exp(-0.485)) * 100 = 38.4% ## within 1 percentage point of that of the population being estimated. Pretty good.
CMP: A = (1 - exp(-0.475)) * 100 = 37.8% ## quite accurate. Good.
Take-home message regarding the Heincke, LR and WLR methods
We believe that it's about time to leave the Heincke, LR and WLR methods behind, knowing that if the count data are linearly over-, equi- or underdispersed, the WLR method would however generally better perform (Smith et al., 2012, Millar, 2015, Mainguy & Moral, 2021). Nevertheless, by being completely ad hoc, we think that the WLR method should no longer be recommended, as for the generally poor-performing LR and Heincke methods. Note too that the estimation of the variance was not considered above but would have only be adequately modelled from a statistical standpoint by the NB2 and CMP extensions. The CRCB method and PM would not be too far behind.
Inserting missing zeros in the age-frequency data
We will now make use of the tidyverse package (Wickham et al., 2019) and create a function called prepare_data to insert zero-count (missing) age class in the age-frequency dataset that was created with fishmethods (i.e., FM_saviBASK12$data). Note that if there are no missing age classes, using this function will leave the age-frequency data unchanged (e.g., if used in an automated script). As these data correspond to the descending limb of the catch curve, we will first rename this dataset as dlimb_saviBASK12 and apply the prepare_data function to this object. Then, this corrected dataset will be saved as a last step as saviBASK12_corr, thanks to the following R script that was developed by Rafael de Andrade Moral for this specific need (obrigado!):
library(tidyverse)
dlimb_saviBASK12 <- FM_saviBASK12$data
prepare_data <- function(data) {
full_range <- tibble(age = seq(min(data$age), max(data$age), 1))
new_dat <- full_join(data, full_range)
new_dat$number[is.na(new_dat$number)] <- 0
new_dat <- new_dat %>%
arrange(age)
return(new_dat)
}
saviBASK12_corr <- prepare_data(dlimb_saviBASK12)
saviBASK12_corr
age number
1 2 151
2 3 56
3 4 117
4 5 10
5 6 12
6 7 21
7 8 8
8 9 2
9 10 2
10 11 1
11 12 2
12 13 0 ## the missing age has been added with a zero count
13 14 1
14 15 1
Depending on how many missing age classes are found within a given dataset, the Z estimate and its SE will be affected to different extents. In all cases, no missing age class should be present in the age-frequency data, as not having sampled one or a few age classes is as much informative than the information obtained from age classes with counts ≥ 1.
********************************************************************************************************************************************************
If one would like to try the next R scripts with the saviBASK12_corr dataset (i.e., the descending limb of the catch curve that is analyzed for the estimation of Z), here is a way to reproduce it with n = 384 walleyes of known age distributed in 10 age classes, i.e. from age = 2 to 15, but no sampled walleye aged 13 years among these:
age <- seq(2, 15, by = 1)
number <- c(151, 56, 117, 10, 12, 21, 8, 2, 2, 1, 2, 0, 1, 1)
saviBASK12_corr <- data.frame(age, number)
********************************************************************************************************************************************************
Are the age-frequency data under-, equi- or overdispersed?
We will use a GLM_Poisson (.p) since it has no dispersion parameter to model the saviBASK12_corr dataset to then determine how the count data may be dispersed. Note that the estimate of Z from this model (i.e., absolute value of the age coefficient) as for its SE should be similar to those obtained from the CR method. Lastly, note also that the original response variable count of the saviBASK12 dataset has been replaced (renamed) by number from the previous manipulations that were performed with the fishmethods package to obtain the saviBASK12_corr dataset, so we will account for this in the script below:
summary(m_saviBASK12_corr_p <- glm(number ~ age, family = poisson, data = saviBASK12_corr))
Call:
glm(formula = number ~ age, family = poisson, data = saviBASK12_corr)
Deviance Residuals:
Min 1Q Median 3Q Max
-4.9006 -1.4054 -0.2980 0.7449 7.1096
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 5.97531 0.10546 56.66 <2e-16 ***
age -0.48776 0.02581 -18.90 <2e-16 *** ## Z = 0.488 ± 0.026
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for poisson family taken to be 1)
Null deviance: 812.26 on 13 degrees of freedom
Residual deviance: 105.89 on 12 degrees of freedom
AIC: 160.38
Number of Fisher Scoring iterations: 5
The GLMPoisson estimates Z = 0.488 and its SE = 0.026, whereas previously the CR method using the uncorrected age-frequency data estimated Z = 0.492 and its SE = 0.025, so we can indeed tell that the estimations are pretty close, although not identical. To be statistically adequate, these estimates have to be obtained from equidispersed data, as the Poisson family distribution requires that the variance is approximatively equal to the mean to provide reliable estimates. Let's use the dispersiontest() function of the AER package (Kleiber & Zeileis, 2008) with the argument alternative="greater" to specifically test whether the counts analyzed with our GLM_Poisson may be overdispersed based on the dispersion test that was proposed by Cameron & Trivedi (1990):
library(AER)
dispersiontest(m_saviBASK12_corr_p, alternative = "greater")
Overdispersion test
data: m.saviBASK12_corr.p
z = 1.468, p-value = 0.07106
alternative hypothesis: true dispersion is greater than 1
sample estimates:
dispersion
7.876445
The estimated dispersion parameter here of 7.88 should be close to 1 for a GLM_Poisson to be adequate. So we know already that it is very unlikely to be adequate. The p-value related to this overdispersion test (which should just be used as "being indicative of") is 0.07, so we know that the central estimate (i.e., Z) may be somehow correct, but that the SE is very likely underestimated. Using a downwardly biased SE estimate for statistical inferences is not only not recommended, but should be avoided at all costs. This is because one may end up finding statistically-supported differences when they actually do not exist. Nelson (2019) has cautioned against this statistical situation for catch-curve analyses.
Another available dispersion test can also be applied when the data are suspected of being overdispersed from the performance package with its check_overdispersion() function according to the overdispersion test that was proposed by Gelman & Hill (2007):
library(performance)
check_overdispersion(m_saviBASK12_corr_p)
# Overdispersion test
dispersion ratio = 9.420
Pearson's Chi-Squared = 113.043
p-value = < 0.001
Overdispersion detected.
The age-frequency data are clearly overdispersed based on this alternative dispersion test.
A third and last overdispersion test that we will consider can be performed based on simulations from the investigated model residuals with the DHARMa package (Hartig 2022) and its testDispersion() function together with the argument alternative="greater":
library(DHARMa)
simulationOutput <- simulateResiduals(fittedModel = m_saviBASK12_corr_p)
testDispersion(simulationOutput, alternative = "greater", plot = FALSE)
DHARMa nonparametric dispersion test via sd of residuals fitted vs. simulated
data: simulationOutput
dispersion = 16.252, p-value < 2.2e-16
alternative hypothesis: greater
The age-frequency data are also clearly overdispersed based on this last alternative dispersion test.
When looking at the first results obtained with fishmethods, the CRCB method and the PM both estimated similar Z values, but their SE were (correctly so) much larger than that of the CR method and thus, of that also of the GLMPoisson. This is because the CRCB method and the PM as previously discussed both adjust the estimated SE with a correction factor to account for the overdispersion that is clearly present in the data. The LR and WLR methods are pretty much "clueless" about this statistical reality that may actually jeopardize the parametric test assumption they rely on regarding homoscedasticity (i.e., homogeneous variance). This is because we now know that the variance increases as a function of the mean and likely in a quadratic manner.
Prior to finally analyze the corrected age-frequency data with different extensions of the Poisson distribution family according to the GB method of Mainguy & Moral (2021), we will apply as a last data preparation step a relevant recommendention for catch-curve analyses that was made by Russell B. Millar which consists in extending the dataset with zero-count age classes past the oldest age of the considered sample by as much as 2 to 3 times this age (Millar, 2015).
Why should we do such a thing?
Individuals older than the maximal age found in your fish sample probably exist. They actually often do, but by being in general much less frequent than younger age classes, very old specimens are less likely to end up being captured by the fishing gear, especially when the sampling effort is not extensive - although this is less applicable when dealing with short-lived species. By adding many zero-count age classes past the oldest age from your sample, you are analytically accounting for this biological reality and therefore, minimizing the underestimation of A in your catch-curve analysis (see Millar 2015). Let's keep in mind here that we are not adding any new counts in the dataset. We are instead "forcing" the GLM to account for the many zeros at the end, which will slightly modify the Z estimate (i.e., slope) in an attempt to account for the likely presence of old fish that would have otherwise not be accounted for, despite existing in the sampled population. We personally think that it's a brilliant idea. Nelson (2019) has included this methodological consideration in its PM and so did Mainguy & Moral (2021) with their GB method.
Extending the dataset (_corr) with zero-count age classes (_ext) following the recommendation of Millar (2015) can be achieved this way if one extend the dataset with zero-count age classes two times past the oldest age:
saviBASK12_ext = rbind(saviBASK12_corr,
cbind(age = (age_max_saviBASK12 + 1):(2 * age_max_saviBASK12), number = rep(0, age_max_saviBASK12)))
To make sure that the just created extended dataset includes the last added age classes up to 30 (oldest age = 15 x 2), we can use the headtail() function of the FSA package to visualize the first and last three lines:
library(FSA)
headtail(saviBASK12_ext)
age number
1 2 151
2 3 56
3 4 117
27 28 0
28 29 0
29 30 0
We will now analyze the extended age-frequency dataset (_ext) according to the Poisson distribution family and four of its extensions:
Negative binomial type II (NB2)
Negative binomial type I (NB1) ## similar to family=quasipoisson, but allows to compare a model with other candidate models within a maximum likelihood framework (e.g., use of AICc)
Mean-parameterized Conway-Maxwell-Poisson (CMP)
Generalized Poisson (GP)
Note that several other extensions of the Poisson distribution family exist, including for instance the double Poisson (Efron 1986) and the hyper-Poisson (Sáez-Castillo & Conde-Sánchez, 2013). Basically, it all comes down to a compromise between testing different extensions on the one hand that may better fit the observed count data and, on the other, the required analytical time for each additional extension being considered. We believe that the four extensions listed above cover pretty much the range of situations regarding dispersion that are encountered with ecological count data, but the choice in the end is up to the fisheries biologist that is performing the catch-curve analyses.
When the count data being analyzed are overdispersed, as it is the case here, all the four extensions listed above can be considered and actually the Poisson distribution family too sometimes if the data are only slighlty overdispersed. Oppositely, in the presence of underdispersed data, only the CMP and GP should be considered together with the Poisson, as the NB2 and NB1 were specifically designed to model any "extra" variance. When the data are not overdispersed, the NB2 and NB1 can either behave as the Poisson by sometimes providing identical predictions but will however be further penalized under an information-theoretic approach due to their unnecessary additional dispersion parameter. Using these same two extensions, especially with underdispersed data, may also simply lead to model convergence issues. The recommendation is thus to not use these two extensions (nor the quasi-Poisson) when the data are underdispersed, i.e. when the dispersiontest() of the AER package provides a dispersion value < 1. The CMP and GP extensions have a clear advantage over the NB2 and NB1 ones by being able to model any type of dispersion in ecological studies (e.g., Brooks et al., 2019). In the end, obtaining adequate models from which the predicted values offer a good fit to the observed data is the ultimate objective, making use of the most-supported ones amongst those found to be sufficiently adequate based on their AICc score or another information criterion.
For the original saviBASK12 dataset, we know that the counts are overdispersed and thus, that the GLM_Poisson won't produce adequate estimations of Z and more especially so about its SE. The same therefore holds true for the CR method. To verify this, let's now test the same considered GLM_Poisson when using the extended data and assess whether it is sufficiently adequate to describe the rate at which counts (now number) decrease with age to estimate Z and then A.
summary(m_saviBASK12_ext_p <- glm(number ~ age, family = poisson, data = saviBASK12_ext))
Call:
glm(formula = number ~ age, family = poisson, data = saviBASK12_ext)
Deviance Residuals:
Min 1Q Median 3Q Max
-4.8692 -0.5536 -0.0991 -0.0177 7.1321
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 5.98751 0.10397 57.59 <2e-16 ***
age -0.49146 0.02533 -19.40 <2e-16 *** ## Z = 0.491 ± 0.025
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for poisson family taken to be 1)
Null deviance: 1371.5 on 28 degrees of freedom
Residual deviance: 106.7 on 27 degrees of freedom
AIC: 161.19
Number of Fisher Scoring iterations: 5
Z (0.491) has slightly changed as expected following the addition of zero-count age classes and is actually now closer to the estimates obtained from the CR method on the original dataset. Based on the (not extended) saviBASK12_corr dataset, A would have been estimated as 38.6%, whereas with the saviBASK12_ext, it slightly increases to 38.8%. The underestimation is thus negligible in this case, but 38.8% is likely closer to the "truth" than 38.6%. Depending on the data at hand, such difference may be larger or weaker. We therefore think that it is thus good practice to just use upfront the recommendation of Millar (2015) for catch-curve analyses.
Assessing model adequacy (i.e., goodness-of-fit)
So, to answer the question: "are these parameter estimates derived from a sufficiently adequate GLMPoisson to be reliably used for statistical inferences about mortality rates?", one can use the hnp() function of the hnp package of Moral et al. (2017) as it allows to determine whether the distributional assumptions of the Poisson distribution family are respected for the parameter estimates that were produced by the model. Without going into the details, this adequacy assessment approach is based on half-normal plots (hnp) and quantifies how the Pearson residuals behave given the distribution family that was used for the modelling of the count data. A "simulated envelope" is produced by this package that should include about 95% of the residuals from the investigated model to be judged as being adequate. As such, one can expect to have about 5% of the residuals found outside the simulated envelope for a "good" model. The less residuals outside the simulated envelope, the better. Because the lower and upper envelope limits are simulated, 10 to 100 hnp simulations should ideally be used to obtain a mean percentage of residuals found outside the envelope to more properly characterize the fit. Mainguy & Moral (2021) have proposed the following categorization:
excellent: mean % < 1%
good: 1% ≤ mean % < 5%
acceptable: 5% ≤ mean % < 10%
inadequate: mean % ≥ 10%
To further understand the principle behind hnp, here is a single simulation testing the GLMPoisson that we have just run above and that we know is inadequate given the overdispersion of the counts:
library(hnp)
hnp(m_saviBASK12_ext_p, resid.type = "pearson", how.many.out = TRUE, paint = TRUE)
Poisson model
Total points: 29
Points out of envelope: 10 ( 34.48 %)
*** Note that the number and thus percentage of Points out of envelope will change from one hnp diagnostic to the other as it relies on empirical simulations.
The following graph will also be simultaneously produced:
Clearly, many of the residuals are found outside of the envelope (indicated by red empty circles) and moreover, are moving away from the median (dashed line) which reflects where the residuals should be aligned in a "perfect" situation given the distributional assumptions of the investigated model. From this sole simulation, 10 out of 29 residuals are found outside the envelope = 38.48% and are not randomly distributed as they should along the median. As you may have already guessed, the observed tendency to move away from the median is caused by the overdispersion since the Poisson distribution family was not designed to handle such often-encountered situation with ecological count data. The GLMPoisson is thus clearly inadequate and so is the CR method. When the percentage of residuals found outside the envelope is in the 8-12% range, we recommend to run 10 and even 100 hnp simulations such that we can make sure that the mean percentage is either below or above the suggested arbitrary 10% threshold, knowing however that we are dealing with an acceptable model at best if mean = 9.7% for instance and thus, not an ideal one when regarding adequacy.
Here is how to run 10 hnp simulations without requesting that a plot is produced each time, using set.seed(2023) just for the reproducibility of the results (if desired):
library(hnp)
set.seed(2023)
hnp <- list()
for(i in 1:10) {
hnp[[i]] <- hnp(m_saviBASK12_ext_p, resid.type = "pearson",
how.many.out = TRUE, plot.sim = FALSE)
}
summary_hnp <- sapply(hnp, function(x) x$out/x$total * 100)
mean(summary_hnp)
[1] 24.82759
Using more simulations simply brings more confidence in the hnp result (i.e., mean %) which has changed from 34.5% with 1 hnp simulation to 24.8% with 10 hnp simulations. Using 100 hnp simulations would have led to 27.7% with the same seed (2023). In all cases, the GLMPoisson remains inadequate and should thus be discarded. We are now left with four of its extensions: NB2, NB1, CMP and GP to estimate Z.
The hnp package can directly handle a GLMPoisson and GLMNB2 (and others such as family=gaussian and even lme4 objects, i.e. mixed-effect linear models), but cannot handle the NB1, CMP or GP directly without recoursing to additional "helper" functions. The Appendix of Mainguy & Moral (2021) and the information/scripts below describe how to assess their respective adequacy with such helper functions. To optimize analytical time, the GLMPoisson adequacy should be analyzed with hnp based on the model output obtained from the the glm() function of the stats package, whereas the GLM fitted with the NB2 extension should be analyzed in hnp according to the output obtained with the glm.nb() function of the MASS package (Venables & Ripley, 2002). However, these two models should also be analyzed with the glmmTMB() function of the glmmTMB package (Brooks et al., 2017) to later allow an easier model selection procedure within the MuMIn package (Bartoń, 2023) and its model.sel() function when comparing them with the results of the NB1, CMP and GP extensions. It's facultative but facilitates in our opinion the process regarding model selection when using the MuMIn package.
If we pretend that the GLMPoisson would have not been discarded by being adequate (although it is not), here are all the models that would need to be run. Note that the model is followed by .nb2 for glm.nb() and .NB2 for glmmTMB() with family=nbinom2 such that we can differentiate them, as for the Poisson case too (.p vs. .P):
GLMPoisson with glm() ## has already been run above
summary(m_saviBASK12_ext_p <- glm(
number ~ age, family = poisson, data = saviBASK12_ext))
GLMNB2 with glm.nb()
library(MASS)
summary(m_saviBASK12_ext_nb2 <- glm.nb(
number ~ age, data = saviBASK12_ext))
All GLMs with glmmTMB()
library(glmmTMB)
summary(m_saviBASK12_ext_P <- glmmTMB(
number ~ age, family = poisson, data = saviBASK12_ext))
summary(m_saviBASK12_ext_NB2 <- glmmTMB(
number ~ age, family = nbinom2, data = saviBASK12_ext))
summary(m_saviBASK12_ext_NB1 <- glmmTMB(
number ~ age, family = nbinom1, data = saviBASK12_ext))
summary(m_saviBASK12_ext_CMP <- glmmTMB(
number ~ age, family = compois, data = saviBASK12_ext))
summary(m_saviBASK12_ext_GP <- glmmTMB(
number ~ age, family = genpois, data = saviBASK12_ext))
A "modelling note" with glmmTMB
For the NB1 but more specifically the CMP and GP extensions, it is not unusual to obtain one or more warning messages at the end of the model output indicating "NA/NaN function evaluation". This appears to occur "[...] when the optimizer visits a region of parameter space that is invalid" but "It is not a problem as long as the optimizer has left that region of parameter space upon convergence, which is indicated by an absence of the model convergence warnings". More details can be found at:
https://cran.r-project.org/web/packages/glmmTMB/vignettes/troubleshooting.html
Briefly, if no warning about model convergence is produced by glmmTMB, your model has likely correctly converged despite its "erratic" behaviour to find the best estimate through the maximum likelihood approach from which the warnings discussed above were generated. If one wish to make sure that a given log-linear model has correctly converged, this can be achieved as follow for a glmmTMB object, here using the model fitted with the CMP extension as an example, although note that in our example, none of the NB1, CMP nor GP models had such warnings about "NA/NaN function evaluation":
m_saviBASK12_ext_CMP$sdr$pdHess
[1] TRUE
Otherwise, one can us $converged instead of $sdr$pdHess for a log-linear model obtained with glm() or glm.nb(). It is good practice to check for model convergence, as well as to make sure that the estimated parameter SE are not too (infinitely) large by examining the model output. For simple models analyzed with the glm() and glm.nb() functions, fewer than 25 Fisher scoring iterations (maximum by default) are generally required, so the limit should ideally not be reached when analyzing a single, fixed effect in such a simple linear regression. On the other hand, glmmTMB can accommodate much more complex models with random factors and zero-inflated terms and as result, the optimizer used by default allows to use more iterations (i.e., up to 300) to reach convergence. If this limit is reached, it can be increased, but doing so is unlikely to change the outcome for the simple case of analyzing the descending limb of a catch curve. Note too that when one or more "NA/NaN function evaluation" warning message occurs (for the CMP extension in particular), this may translate in someone not being able to use the hnp package to assess model adequacy. If running just 1 hnp simulation takes a considerable amount of time up to a point where R is no longer responding (i.e., "freezing"), then it is simply better to accept that its fit cannot be assessed. The safest option then may be to simply reject this model, even if it may be assigned as the top-ranking model under an information-theoretic approach.
If we come back to the NB2 extension analyzed with glm.nb(), we can directly assess the fit of the resulting m.saviBASK12_ext.nb2 as follow, here with 10 hnp runs:
library(hnp)
set.seed(2023)
hnp <- list()
for(i in 1:10) {
hnp[[i]] <- hnp(m_saviBASK12_ext_nb2, resid.type = "pearson",
how.many.out = TRUE, plot.sim = FALSE)
}
summary_hnp <- sapply(hnp,function(x) x$out/x$total * 100)
mean(summary_hnp)
[1] 0
Difficult to be more adequate than this: 100% of the residuals are where they should be according to the distributional assumptions of the error structure being used (NB2) from 10 consecutive hnp simulations.
Now, as the adequacy assessment of a glmmTMB object is much more computationally intensive, as an "analytical strategy" to optimize computational time we will first compare these five models according to their respective second-order Akaike weights. By doing so, we can already identify the candidate(s) model(s) that would be rejected anyway from having none to little statistical support under an information-theoretic approach. When this occurs for some less-supported models, it is consequently not required to assess their adequacy and thus, some analytical time is saved. But we will however need to make sure that the ones retained are adequate and in some rare instances, reconsider the adequate ones that were initially discarded.
Let's now determine how each of the models are statistically supported based on their respective AICc scores that is used as default ranking option by MuMIn. We already know that the GLMPoisson is inadequate, whereas the GLMNB2 exhibits excellent adequacy, but let's pretend for now that they are all adequate to identify the one getting most of the second-order Akaike weight, and those that don't receive any or very little based on all the models tested with the glmmTMB() function only:
library(MuMIn)
model.sel(
m_saviBASK12_ext_P,
m_saviBASK12_ext_NB2,
m_saviBASK12_ext_NB1,
m_saviBASK12_ext_CMP,
m_saviBASK12_ext_GP)
Model selection table
cnd((Int)) dsp((Int)) cnd(age) family df logLik AICc delta weight
m_saviBASK12_ext_NB2 5.989 + -0.4949 n2(lg) 3 -41.212 89.4 0.00 0.754
m_saviBASK12_ext_CMP 5.973 + -0.4894 cm(lg) 3 -42.405 91.8 2.39 0.229
m_saviBASK12_ext_GP 5.591 + -0.3876 gn(lg) 3 -45.184 97.3 7.94 0.014
m_saviBASK12_ext_NB1 5.632 + -0.3977 n1(lg) 3 -46.818 100.6 11.21 0.003
m_saviBASK12_ext_P 5.988 + -0.4915 ps(lg) 2 -78.595 161.7 72.27 0.000
The models shown in red ***can*** be discarded, as if delta = 7, this would already tell us that the considered model as an evidence ratio of 33.1, i.e. indicating that it is about 33 times less likely than the top-ranking (NB2) model given the count data being modelled. This explains why the GLMGP with delta = 7.94 only receives 1.4% of the second-order Akaike weight and the GLMNB1 even less with delta = 11.21. The GLMPoisson is by far the worst, being > 1 G times less likely than the NB2 model. Let's recall here that the GLMPoisson is often equivalent to the CR method, so the latter is likely as much poorly supported. One could sometimes use an even stricter threshold, i.e. delta < 4 as it corresponds to an evidence ratio of 7.4 and by being about > 7 times less likely, we prefer to discard such candidate models. See Burnham et al. (2011) for a detailed description of the evidence ratio concept based on the AICc and other analytical considerations when using an information-theoretic approach. However, it is possible to keep all models to later perform model averaging, knowing that those with a second-order Akaike weight of zero (0) will actually have no influence on the averaged parameter estimates, whereas those exhibiting negligible to little weight will only slightly influence the resulting compromise.
Now let's assess the adequacy of the GLMCMP with the dfun, sfun and ffun helper functions, knowing that if the NB1 or GP extentions would have also needed to be tested, it would only require some slight modifications to the scripts below (see Mainguy & Moral, 2021 for details):
dfun <- function(obj) {residuals(obj, type = "pearson")}
sfun <- function(n,obj) {simulate(obj)[[1]]}
ffun <- function(response) {
fit <- try(glmmTMB(response ~ age,
family = compois, data = saviBASK12_ext),
silent = TRUE)
while(class(fit)=="try-error") {
response2 <- sfun(1, m_saviBASK12_ext_CMP)
fit <- try(glmmTMB(response2 ~ age, family = compois, data = saviBASK12_ext),
silent = TRUE)
}
return(fit)
}
library(hnp)
set.seed(2023)
hnp <- list()
for(i in 1:10) {
hnp[[i]] <- hnp(m_saviBASK12_ext_CMP,
newclass = TRUE, diagfun = dfun, simfun = sfun, fitfun = ffun,
how.many.out = TRUE, plot.sim = FALSE)
}
summary_hnp <- sapply(hnp, function(x) x$out/x$total * 100)
mean(summary_hnp)
[1] 0.3448276
These 10 hnp simulations, which have each required that 99 simulations were performed each time through the helper functions, have taken > 40 minutes to be completed on an conventional laptop. As a comparison, the same number of hnp simulations would have only taken between 1 and 2 minutes in most cases for the Poisson and NB2 as obtained from the glm() or glm.nb() functions, respectively. But in the end, this result tells us that the GLMCMP is sufficiently adequate to be used for the estimation of Z. Know that CMP is probably the most computationally-intensive extension to assess for adequacy, as it actually approximate the mean from the mode that was originally used for this extension (Sellers & Shmueli, 2010) and is therefore analytically "less stable", although being quite useful too. To further reduce the required computational time, another strategy could be to run just 1 hnp simulation for the NB1, CMP or GP extensions. Doing so here provided a value of 0 for CMP. As this did not occur by chance, presuming that the CMP extension is sufficiently adequate from this single simulation is reasonable, as it won't change from 0% to 20% or 82% from one simulation to the another. But if the result is 9.1% or 11.8%, it is preferable to stay on the safe side by running at least 10 if not 100 hnp simulations sequentially by letting a computer run overnight and get the results the following morning - unless the CMP crashes and causes R to freeze, which unfortunately sometimes happens.
As a last step, we will again use the MuMIn package and its model.sel() function to calculate the second-order Akaike weight of the two retained models in a first time to then use a model averaging approach (Dormann et al., 2018) to obtain a weighted compromise between the NB2 and CMP models. For this, we will store in a R object, referred as out_saviBASK12_ext, the output that contains the model selection table obtained below:
library(MuMIn)
out_saviBASK12_ext <- model.sel(
m_saviBASK12_ext_NB2,
m_saviBASK12_ext_CMP)
out_saviBASK12_ext
Model selection table
cnd((Int)) dsp((Int)) cnd(age) family df logLik AICc delta weight
m_saviBASK12_ext_NB2 5.989 + -0.4949 n2(lg) 3 -41.212 89.4 0.00 0.767
m_saviBASK12_ext_CMP 5.973 + -0.4894 cm(lg) 3 -42.405 91.8 2.39 0.233
Abbreviations:
family: cm(lg) = ‘compois(log)’, n2(lg) = ‘nbinom2(log)’
Models ranked by AICc(x)
We can then generate a compromise (referred below as comp) by applying the model.avg() function to the out_saviBASK12_ext object, which will be constituted of 76.7% of the NB2 estimates and 23.3% of the CMP estimates, noting here that the estimated Z is slightly higher for NB2 than CMP, so this trend will remain but will be reduced through model averaging:
library(MuMIn)
comp_saviBASK12_ext <- model.avg(out_saviBASK12_ext, revised.var = TRUE)
comp_saviBASK12_ext
Call:
model.avg(object = out_saviBASK12_ext, revised.var = TRUE)
Component models:
‘1a’ ‘1b’
Coefficients:
cond((Int)) cond(age)
full 5.985322 -0.4936001
subset 5.985322 -0.4936001
We have our compromise: Z = 0.494, not forgetting that it's the absolute value of the age coefficient.
Now here is a series of scripts to "extract" Z from this compromise, find the Lower Limit (LL) and Upper Limit (UL) of its 95% CI and then back-transform these values at the response scale (i.e., A):
Z_saviBASK12_ext <- abs(coef(comp_saviBASK12_ext, full = TRUE)[[2]])
Z_saviBASK12_ext
[1] 0.4936001
CI_Z_saviBASK12_ext<-confint(comp_saviBASK12_ext, full = TRUE)
CI_Z_saviBASK12_ext
2.5 % 97.5 %
cond((Int)) 5.2087631 6.7618812
cond(age) -0.6064064 -0.3807939
LL_Z_saviBASK12_ext <- abs(CI_Z_saviBASK12_ext[[4]])
LL_Z_saviBASK12_ext
[1] 0.3807939
UL_Z_saviBASK12_ext <- abs(CI_Z_saviBASK12_ext[[2]])
UL_Z_saviBASK12_ext
[1] 0.6064064
A_saviBASK12_ext <- (1 - exp(-Z_saviBASK12_ext)) * 100
A_saviBASK12_ext
[1] 38.95752
LL_A_saviBASK12_ext <- (1 - exp(-LL_Z_saviBASK12_ext)) * 100
LL_A_saviBASK12_ext
[1] 31.66813
UL_A_saviBASK12_ext <- (1 - exp(-UL_Z_saviBASK12_ext)) * 100
UL_A_saviBASK12_ext
[1] 45.4693
So, in the end, our "best" estimation of A is 39.0% with a 95% CI of [31.7, 45.5]. Our level of uncertainty is quite important, but we nonetheless have now a decent idea of the most likely total annual mortality (A) estimate given the data at hand. This value actually falls pretty much within the ones observed for other walleye populations in the province of Québec, being slightly more on the high side.
Just to visualize the fit of our compromise to the observed count data on the response scale, we can proceed with the following additional scripts to obtain the predicted values (pred) and their associate 95% CI based on normal theory at the log (link) scale:
new_data <- data.frame(
age = seq(2, 15, by = 0.1)) ## by=0.1 to obtain a smoothed regression curve
fitted <- predict(comp_saviBASK12_ext, new_data, type = "link", se.fit = TRUE)
pred <- exp(fitted$fit)
LL_pred <- exp(fitted$fit -1.96 * fitted$se.fit)
UL_pred <- exp(fitted$fit +1.96 * fitted$se.fit)
results <- cbind(new_data, pred, LL_pred, UL_pred)
The results object will look like this (three first and last lines for a check):
library(FSA)
headtail(results)
age pred LL_pred UL_pred
1 2.0 148.1346831 84.20538270 260.5995440
2 2.1 141.0002786 80.79790013 246.0593473
3 2.2 134.2094785 77.52111833 232.3519644
129 14.8 0.2671460 0.09615677 0.7421941
130 14.9 0.2542798 0.09061256 0.7135680
131 15.0 0.2420333 0.08538678 0.6860560
We can then produce a figure in which the observed counts that decrease according to age can be compared to the predicted values of our compromise between the GLMNB2 and GLMCMP, the only two top-ranking, adequate models that were retained for statistical inferences:
In the figure above, and as discussed at the beginning of this webpage, the red empty square is the 0+ age which is only partially sampled by the fishing gear and thus, not accounted for in the catch-curve analysis. The green empty triangle is the highest number of walleyes sampled in the 1+ year class (i.e., mode), thus representing the "Peak" which is also discarded to rather start the analysis according to the "Peak Plus" criterion, i.e. at the 2+ age class. So, the descending limb of the catch curve only includes the age classes going from 2 to 15 years that are represented by the blue filled circles to which extended zero-count age classes were added (not shown). The predicted values (black solid line) with the lower and upper limits (black dashed lines) were obtained from the comprise between the GLMNB2 and GLMCMP to model the descending limb from which Z was estimated on the log scale. The fit is not perfect, but one can tell that it's actually quite good at capturing the signal in the observed data. Note that by extrapolling the regression curve to the left, the predicted value would be not too far from the observed count for age = 1, suggesting that using the "Peak" criterion instead of the "Peak Plus" one could have yielded similar results.
As a last step, one can determine the approximated percentage of variation explained by the the top-ranking NB2 model by calculating the deviance explained (D 2) as described in Guisan & Zimmermann (2000) but with a rearranged equation as follow:
D2 <- 100 * (1 - m_saviBASK12_ext_nb2$deviance / m_saviBASK12_ext_nb2$null.deviance)
D2
[1] 95.61718
This is pretty good explanatory power based on the (unadjusted) D 2, but one must keep in mind that this high value was computed for the model analyzing the artificially-extended dataset with zero-count age classes. As such, the explanatory power is likely not as good as if the analysis would have been limited to the "real" dataset (i.e., saviBASK12_corr). Nevertheless, we will calculate the adjusted version (D2_adj) that accounts for both the sample size (n) and the number of parameters used (k) as proposed by Weisberg (1980) to evaluate the "calibration" performance of our NB2 model. The adjusted D2 is an analogue to the adjusted R 2 and should especially be used if one would like to compare with this metric a given model with other candidate models that are also trying to explain the same data (Guisan & Zimmermann, 2000). As such, the adjustment is made to minimize the risks overfitting, as for the R 2 (see Babyak, 2004):
D2_adj <- 100 - (((n - 1) / (n - k)) * (100 - D2))
The sample size (n) is the one associated to the corrected and extended descending limb of the catch curve (i.e., saviBASK12_ext) and we can get n by using the Summarize() function of the FSA package when looking for instance at the overall mean (~1):
library(FSA)
Summarize(number ~ 1, data = saviBASK12_ext)
n mean sd min Q1 median Q3 max percZero
29.00000 13.24138 35.50921 0.00000 0.00000 0.00000 2.00000 151.00000 55.17241
And we know that our GLMNB2 will produce a coefficient for the intercept, the age effect of interest while also using an additional dispersion parameter to model the "extra" variance. As such, the total number of parameters for our model is k = 3. As this corresponds to the number of degrees of freedom (df) of our model, we can use the logLik() function to confirm the number of df (or k):
logLik(m_saviBASK12_ext_nb2)
'log Lik.' -41.21179 (df=3)
Finally, D 2_adj is calculated as:
D2_adj<-100-(((29-1)/(29-3))*(100-D2))
D2_adj
[1] 95.28004
The adjustment has (very) slightly reduced the approximated variation explained in this case when accounting for both n and k.
An alternative way to assess the strength of the association between the predictions from the retained model(s) based on the extended data (saviBASK12_ext) is to apply the predict() function on the unextended (i.e., "real") observed data that have been corrected (saviBASK12_corr) in a first step. Then, the predicted values (fitted) are compared to the observed values with the non-parametric Spearman correlation test to estimate rho. Following the proposed categorization of Hinkle et al. (2003):
|rho|
0.90–1.00 Very strong correlation
0.70–0.89 Strong correlation
0.50–0.69 Moderate correlation
0.30–0.49 Low correlation
0.00–0.29 None to negligible correlation
One can then characterize the strength of the correlation:
fitted <- predict(m_saviBASK12_ext_NB2, newdata = saviBASK12_corr, type = "response")
cor_fit_obs <- cbind(saviBASK12_corr, fitted)
cor.test(cor_fit_obs$number, cor_fit_obs$fitted, method = "spearman", exact = FALSE)
Spearman's rank correlation rho
data: cor_fit_obs$number and cor_fit_obs$fitted
S = 28.232, p-value = 7.201e-07
alternative hypothesis: true rho is not equal to 0
sample estimates:
rho
0.9379526
The relationship is characterized as being very strong (rho = 0.94) with good statistical support (p < 0.001). We can be confident that our NB2 model generates predictions that exhibit a good calibration with the observed data it attempts to describe. The second-ranking CMP model obtains an indentical value for rho. In the end, the two retained NB2 and CMP models are sufficiently adequate based on the hnp package and the predicted values exhibit a strong statistical association with the observed data. This brings more confidence in our statistical inferences about this pivotal life-history trait.
Walleye (Sander vitreus). Photo credit: Hugo Mercille, MELCCFP
Assessing whether a temporal trend can be statistically detected regarding total annual mortality (A) based on consecutive annual samples from a same population
Miranda & Bettoli (2007) have proposed to test how the slopes of the descending limb of different fish samples may be parallel (i.e., stability) or instead may intersect (i.e., variation) to determine whether statistical support may be found for a difference in mortality rates. An illustration of this great idea is presented on the cover of the book of Ogle (2016) where the slopes intersect and thus, the mortality rates statistically differ between the two groups considered.
We will apply here the same principle but this time according to the GB method of Mainguy & Moral (2021) and for the specific case when the independent variable (YEAR) is continuous to model a temporal trend. See Mainguy & Moral (2021) for two empirical examples regarding the comparison of two groups (i.e., from using a categorical variable).
In this example, we will model how A may have changed over time since the implementation of a length-based regulation (i.e., harvest slot) at the start of the fishing season 2011 at one of the study areas being monitored in the St. Lawrence River, Québec, Canada. The annual surveys used in this example were conducted in 2011, 2013, 2016 and 2019. Here are the summary statistics regarding individual ages for the sampled walleyes in each of these four annual survey:
YEAR n mean sd min Q1 median Q3 max
2011 501 3.201597 1.855068 1 2 3 4.00 14
2013 117 2.982906 1.805083 1 2 3 4.00 8
2016 127 3.078740 2.169801 1 1 2 4.00 9
2019 130 4.138462 2.752740 1 3 4 4.75 14
Survey-specific analyses were separately performed according to the methodological details provided above, i.e. in an identical manner as for the analysis of the saviBASK12 dataset. These independent analyses have allowed to estimate A* and its 95% CI as follow, indicating also how many walleyes were included in the descending limb of the catch curve when compared to the overall sample size:
* in this example, the zero-count age classes were extended 3 times past the oldest age, as opposed to 2 times in the previous saviBASK12 example.
2011: 41.7% [36.4, 46.5] based on 156 walleyes (out of n = 501)
2013: 41.1% [26.3, 52.9] based on 30 walleyes (out of n = 117)
2016: 34.5% [23.2, 44.1] based on 85 walleyes (out of n = 127)
2019: 26.8% [14.6, 37.3] based on 33 walleyes (out of n = 130)
From these annual survey estimates, one can already guess that A has declined over time - which is actually one of the desired effects from implementing a harvest slot. However, given the large uncertainty around these estimates, especially for the surveys that were conducted in 2013 and 2019 given their small sample sizes for the analysis of the descending limb, statistically quantifying such potential decline can be achieved with the same rationale. We will regroup below those 4 datasets into a single one and then contrast two alternative models:
(1) the additive (ADD) effects of age and year:
number ~ age + year
(2) the multiplicative effects (i.e., interaction = INT) of age and year:
number ~ age * year [same as: number ~ age + year + age*year]
Each of these two alternative models will be fitted with the Poisson family distribution and the same four extensions that were previously tested (NB2, NB1, CMP and GP), yielding a total of 10 candidate models to analyze the temporal trend we're trying to statistically describe.
Here are the corrected and extended age-frequency data that were obtained for each these annual surveys following the same steps that were provided at first for the saviBASK12 dataset. The annual age-frequency data are only provided below to be reproduced such that one can try the scripts described further below to model the temporal trend based on these 4 sampled years over a rather short 9-year period.
2011
year <- 2011
age_corr <- seq(4, 14, by = 1)
number_corr <-c(61, 39, 23, 14, 13, 3, 1, 0, 1, 0, 1)
savi_2011_corr <- data.frame(year, age_corr, number_corr)
age_ext <- seq(4, 42, by = 1)
number_ext <- c(
61, 39, 23, 14, 13, 3, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
savi_2011_ext <- data.frame(year, age_ext, number_ext)
2013
year <- 2013
age_corr <- seq(4, 8, by = 1)
number_corr <- c(10, 9, 3, 3, 5)
savi_2013_corr <- data.frame(year, age_corr, number_corr)
age_ext <- seq(4, 24, by = 1)
number_ext <- c(10, 9, 3, 3, 5, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
savi_2013_ext <- data.frame(year, age_ext, number_ext)
2016
year <- 2016
age_corr <- seq(2, 9, by = 1)
number_corr <- c(24, 11, 24, 5, 10, 2, 8, 1)
savi_2016_corr <- data.frame(year,age_corr,number_corr)
age_ext <- seq(2, 27, by = 1)
number_ext <- c(24, 11, 24, 5, 10, 2, 8, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
savi_2016_ext <- data.frame(year, age_ext, number_ext)
2019
year <- 2019
age_corr <- seq(5, 14, by = 1)
number_corr <- c(9, 2, 7, 0, 7, 2, 1, 4, 0, 1)
savi_2019_corr <- data.frame(year, age_corr, number_corr)
age_ext <- seq(5, 42, by = 1)
number_ext <- c(9, 2, 7, 0, 7, 2, 1, 4, 0, 1, 0, 0, 0,
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0)
savi_2019_ext <- data.frame(year, age_ext, number_ext)
These age-frequency datasets are then pooled into a single one that we will refer to as savi_1119 for both the corrected (_corr) and extended (_ext) datasets:
savi_1119_corr<-rbind(savi_2011_corr, savi_2013_corr, savi_2016_corr, savi_2019_corr)
savi_1119_ext<-rbind(savi_2011_ext, savi_2013_ext, savi_2016_ext, savi_2019_ext)
As done before, one can us the headtail() function of the FSA package to check that the different datasets have been correctly assembled into a single one for both the the corrected (_corr) and extended (_ext) datasets (not shown).
We will first run a GLMPoisson to model how counts (number_corr) decrease with age (age_corr) for the savi_1119_corr dataset to then use the the dispersiontest() of the AER package to determine how the counts are dispersed. The dispersion test requires to be performed on the original data (savi_1119_corr) and thus, not the ones that were artifically extended with zero-count age classes for the "final" analyses that will be performed in a second step. Below we will test the model with an interaction (INT):
summary(m_savi_1119_corr_INT_p <- glm(
number_corr ~ age_corr * year, family = poisson, data = savi_1119_corr))
Call:
glm(formula = number_corr ~ age_corr * year, family = poisson,
data = savi_1119_corr)
Deviance Residuals:
Min 1Q Median 3Q Max
-4.0381 -1.4866 -0.2506 1.3126 2.9845
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 960.412333 116.318862 8.257 < 2e-16 ***
age_corr -95.584647 17.530158 -5.453 4.96e-08 ***
year -0.474604 0.057740 -8.220 < 2e-16 ***
age_corr:year 0.047275 0.008699 5.434 5.50e-08 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for poisson family taken to be 1)
Null deviance: 428.35 on 33 degrees of freedom
Residual deviance: 101.05 on 30 degrees of freedom
AIC: 215.7
Number of Fisher Scoring iterations: 5
If this GLMPoisson would be adequate, the interaction term age_corr:year would provide statistical support at p < 0.001 for slopes that vary over the years. But as a rule of thumb, when the Residual deviance of the investigated model is greater than its associated degrees of freedom (i.e., 101.05 > 30), this is suggestive of overdispersion and as demonstrated before, of the inadequacy of using family=poisson in such case. However, let's pretend that we have no idea of how the counts may be dispersed. We will first use a two-tailed test to determine whether equidispersion appears to be respected:
library(AER)
dispersiontest(m_savi_1119_corr_INT_p, alternative = "two.sided")
Dispersion test
data: m.savi_1119_corr.INT.p
z = 3.3904, p-value = 0.0006978
alternative hypothesis: true dispersion is not equal to 1
sample estimates:
dispersion
2.725957
With a dispersion parameter = 2.73 (i.e., > 1) and an associated p-value = 0.0007, the counts are overdispersed. Analyzing the ADD model instead would provide the same conclusion.
We will now compare the 10 possible candidate models, i.e. the Poisson distribution family and its 4 considered extensions, contrasting the INT (changing Z) and ADD (stable Z) models each time. The two alternative GLMPoisson models would have been discarded based on the dispersion test results, but we will keep them just to show how they compare to their counterparts that are all equipped with an additional dispersion parameter to model the "extra" variance that is present in the counts being analyzed:
Additive effects (ADD) = apparent stability in mortality rates
summary(m_savi_1119_ext_ADD_p <- glm(
number_ext ~ age_ext + year, family = poisson, data = savi_1119_ext))
library(MASS)
summary(m_savi_1119_ext_ADD_nb2 <- glm.nb(
number_ext ~ age_ext + year, data = savi_1119_ext))
library(glmmTMB)
summary(m_savi_1119_ext_ADD_P <- glmmTMB(
number_ext ~ age_ext + year, family = poisson, data = savi_1119_ext))
summary(m_savi_1119_ext_ADD_NB2 <- glmmTMB(
number_ext ~ age_ext + year, family = nbinom2, data = savi_1119_ext))
summary(m_savi_1119_ext_ADD_NB1 <- glmmTMB(
number_ext ~ age_ext + year, family = nbinom1, data = savi_1119_ext))
summary(m_savi_1119_ext_ADD_CMP <- glmmTMB(
number_ext ~ age_ext + year, family = compois, data = savi_1119_ext))
summary(m_savi_1119_ext_ADD_GP <- glmmTMB(
number_ext ~ age_ext + year, family = genpois, data = savi_1119_ext))
Multiplicative effects (INT) = mortality rates vary with time
summary(m_savi_1119_ext_INT_p <- glm(
number_ext ~ age_ext * year, family = poisson, data = savi_1119_ext))
library(MASS)
summary(m_savi_1119_ext_INT_nb2 <- glm.nb(
number_ext ~ age_ext * year, data = savi_1119_ext))
library(glmmTMB)
summary(m_savi_1119_ext_INT_P <- glmmTMB(
number_ext ~ age_ext * year, family = poisson, data = savi_1119_ext))
summary(m_savi_1119_ext_INT_NB2 <- glmmTMB(
number_ext ~ age_ext * year, family = nbinom2, data = savi_1119_ext))
summary(m_savi_1119_ext_INT_NB1 <- glmmTMB(
number_ext ~ age_ext * year, family = nbinom1, data = savi_1119_ext))
summary(m_savi_1119_ext_INT_CMP <- glmmTMB(
number_ext ~ age_ext * year, family = compois, data = savi_1119_ext))
summary(m_savi_1119_ext_INT_GP <- glmmTMB(
number_ext~age_ext * year, family = genpois, data = savi_1119_ext))
To potentially save a significant amount of computational time regarding the assessment of model adequacy, we will first compare all the 10 candidate glmmTMB models to determine which one are the best supported under an information-theoretic approach, again usin the sample-adjusted AIC (AICc):
library(MuMIn)
model.sel(
m_savi_1119_ext_ADD_P,
m_savi_1119_ext_ADD_NB2,
m_savi_1119_ext_ADD_NB1,
m_savi_1119_ext_ADD_CMP,
m_savi_1119_ext_ADD_GP,
m_savi_1119_ext_INT_P,
m_savi_1119_ext_INT_NB2,
m_savi_1119_ext_INT_NB1,
m_savi_1119_ext_INT_CMP,
m_savi_1119_ext_INT_GP)
Model selection table
cnd((Int)) dsp((Int)) cnd(age_ext) cnd(yer) cnd(age_ext:yer) family df logLik AICc delta weight
m_savi_1119_ext_INT_CMP 867.2 + -87.0600 -0.42800 0.04299 cm(lg) 5 -99.574 209.7 0.00 0.341
m_savi_1119_ext_INT_NB1 851.0 + -76.9000 -0.42010 0.03796 n1(lg) 5 -99.920 210.3 0.69 0.241
m_savi_1119_ext_INT_GP 836.9 + -76.0400 -0.41310 0.03754 gn(lg) 5 -99.993 210.5 0.84 0.225
m_savi_1119_ext_INT_NB2 811.9 + -85.4700 -0.40050 0.04218 n2(lg) 5 -100.555 211.6 1.96 0.128
m_savi_1119_ext_ADD_GP 307.3 + -0.3987 -0.15030 gn(lg) 4 -103.286 214.9 5.25 0.025
m_savi_1119_ext_ADD_NB1 341.0 + -0.4013 -0.16700 n1(lg) 4 -103.465 215.3 5.61 0.021
m_savi_1119_ext_ADD_CMP 190.2 + -0.4519 -0.09199 cm(lg) 4 -104.193 216.7 7.07 0.010
m_savi_1119_ext_ADD_NB2 129.2 + -0.4712 -0.06162 n2(lg) 4 -104.253 216.8 7.19 0.009
m_savi_1119_ext_INT_P 931.1 + -87.9600 -0.45990 0.04345 ps(lg) 4 -122.256 252.8 43.19 0.000
m_savi_1119_ext_ADD_P 424.1 + -0.4223 -0.20820 ps(lg) 3 -135.822 277.8 68.19 0.000
Abbreviations:
family: cm(lg) = ‘compois(log)’, gn(lg) = ‘genpois(log)’, n1(lg) = ‘nbinom1(log)’, n2(lg) = ‘nbinom2(log)’, ps(lg) = ‘poisson(log)’
Models ranked by AICc(x)
The candidate models shown in red are discarded under a rather "strict" delta AICc threshold < 4 to be retained for statistical inferences, as previously discussed. As such, we won't assess their adequacy with the hnp package, but will assess that of the four remaining top-ranking models to make sure that they all adequately describe the process being examined from a statistical standpoint. Note that all the best-supported models include an interaction between age and year, clearly indicating that Z has changed over time. Having discarded all the ADD models due to their poor statistical support (weight of 2.5% or less), we are confident about the statistical confirmation of a decrease in A over time. Using the same scripts as for the saviBASK12 example, the hnp package is used regarding model adequacy:
INT_NB2: 1 hnp simulation: excellent (mean = 0.81%) 10 hnp simulations: excellent (mean = 0.97%)
INT_NB1: 1 hnp simulation: good (mean = 1.67%) 10 hnp simulations: good (mean = 1.21%)
INT_CMP: 1 hnp simulation: NA 10 hnp simulations: NA
INT_GP: 1 hnp simulation: good (mean = 1.61%) 10 hnp simulations: good (mean = 1.53%)
For INT_CMP, hnp was not able to perform a single simulation when requiring for(i in 1:1) in the scripts allowing to assess the fit of a model fitted with family = compois in n sequential simulations (here with n = 1). In fact, R has simply "freezed". We know however that the INT_CMP model has converged:
m.savi_1119_ext.INT.CMP$sdr$pdHess
[1] TRUE
As a last attempt to assess the adequacy of this model, we will again ask for a single hnp simulation but not through an iterative process as performed with the previous specified function. We will just apply the hnp() function to this object and ask to print a figure in the hope that this can be achieved. The dfun, sfun and ffun helper functions all still remain the same but the adequacy assessment related script is simplified as:
library(hnp)
hnp(m_savi_1119_ext_INT_CMP, newclass = TRUE, diagfun = dfun, simfun = sfun, fitfun = ffun,
how.many.out = TRUE, plot.sim = TRUE, paint = TRUE)
Total points: 124
Points out of envelope: 45 ( 36.29 %)
The hnp plot below clearly shows that too many Pearson residuals are found outside the simulated envelope (36.29%), especially between the Theoretical quantiles that vary from 0.5 to 1.5 but also between 2.0 and 2.5 on the x-axis according to this single hnp simulation and thus, the top-ranking model is inadequate and should be discarded. This result clearly highlights the duality between assessing model adequacy and performing model selection. Solely relying on the AICc or other information criterion for instance to identify the top-ranking model can be misleading, as an inadequate model can be incorrectly identified as the best one when it is not, as clearly demonstrated here. See Mainguy & Moral (2021) for another example illustrating such case where assessing model adequacy is in fact as important as performing model selection.
As previously mentioned, if this last step for model adequacy assessment would have not worked neither, the safest strategy would then be to simply discard the considered model. In our case, this strategy would have paid off, but we have been able to confirm that this model was inadequate. Another element that is suspicious in this case is the fact that the CMP extension normally "behaves" more similarly to the NB2 one (i.e., better models quadratic overdispersion), but from the model selection table the NB1 and GP extensions (which better model linear overdispersion in their) were found in-between the CMP and NB2. Normally (although not necessarily systematically) the CMP and NB2 should be more alike in terms of likelihood when modelling overdispersed data.
Note that the top-ranking model is now NB1_INT with AICc = 210.3 and a delta = 0.00, such that the "strict" delta AICc < 4 threshold used for a model to be kept still rejects GP_ADD (delta = 4.6). We will thus again perform the model selection steps as done before but by keeping only this time the NB1_INT, GP_INT and NB2_INT models and storing the resulting model selection table in a R object referred to below as out_savi_1119.
library(MuMIn)
out_savi_1119 <- model.sel(
m_savi_1119_ext_INT_NB2,
m_savi_1119_ext_INT_NB1,
m_savi_1119_ext_INT_GP)
out_savi_1119
Model selection table
cnd((Int)) dsp((Int)) cnd(age_ext) cnd(yer) cnd(age_ext:yer) family df logLik AICc delta weight
m_savi_1119_ext_INT_NB1 851.0 + -76.90 -0.4201 0.03796 n1(lg) 5 -99.920 210.3 0.00 0.407
m_savi_1119_ext_INT_GP 836.9 + -76.04 -0.4131 0.03754 gn(lg) 5 -99.993 210.5 0.15 0.378
m_savi_1119_ext_INT_NB2 811.9 + -85.47 -0.4005 0.04218 n2(lg) 5 -100.555 211.6 1.27 0.215
Abbreviations:
family: gn(lg) = ‘genpois(log)’, n1(lg) = ‘nbinom1(log)’, n2(lg) = ‘nbinom2(log)’
Models ranked by AICc(x)
So, in the end, it looks like the variance is more of a linear than a quadratic function of the mean for these count data, with the NB1 and GP extensions being better supported than the NB2 one. We will now generate a compromise from this model selection table with the model.avg() function that we have previously used and store its result as comp_savi_1119:
comp_savi_1119 <- model.avg(out_savi_1119, revised.var = TRUE)
comp_savi_1119
Call:
model.avg(object = out_savi_1119, revised.var = TRUE)
Component models:
‘123a’ ‘123b’ ‘123c’
Coefficients:
cond((Int)) cond(age_ext) cond(year) cond(age_ext:year)
full 837.2162 -78.41924 -0.4132423 0.03871336
subset 837.2162 -78.41924 -0.4132423 0.03871336
From this compromise, we will extract the age-related coefficients only to estimate Z and A:
coef_age <- coef(comp_savi_1119, full = TRUE)[[2]]
coef_age
[1] -78.41924
coef_age_year <- coef(comp_savi_1119, full = TRUE)[[4]]
coef_age_year
[1] 0.03871336
Because we are only analyzing the slope parameter, we can obtain predictions about Z vs. time strictly in a linear manner. In other words, the predictions about A can only be linear at the response scale. Let's recall again that Z is the absolute value of the combined age-related coefficients and as the relationship is linear, we will only need to estimate the predicted values at the start (2011) and the end (2019) of the considered period:
Z_2011 <- abs(coef_age + (coef_age_year * 2011))
A_2011 <- (1 - exp(-Z_2011)) * 100
A_2011
[1] 43.25908
Z_2019 <- abs(coef_age + (coef_age_year * 2019))
A_2019 <- (1 - exp(-Z_2019)) * 100
A_2019
[1] 22.66068
This compromise predicts a decline in A from 43.3% in 2011 to 22.7% in 2019, corresponding to an overall decrease of 20.6 percentage points or of 47.6% in total annual mortality rates that is illustrated below. Note that the survey-specific estimates represented by the filled circles were each independently obtained together with their associated uncertainty (error bars reflecting the 95% CI) to visually assess any potential temporal trend as a first step. The predictions for the temporal trend (i.e., solid black line) were obtained by analyzing all the age-frequency data as above in a single dataset and thus, this regression curve is not attempting to go through each of the independent survey-specific estimates. A linear model based on these (only) n = 4 observations would instead try to achieve this my minimizing the residual-sum-of-squares and would yield the predicted values represented by the dashed grey line, as a comparison. Although such linear model would be adequate regarding the parametric test assumptions of normality (Shapiro-Wilk test, W = 0.80, p = 0.09) and homoscedasticity (Breusch-Pagan test, BP = 0.004, p = 0.95), it clearly does not make use of all the available age-frequency data. Moreover, if we compare this linear model to its null counterpart, the second-order Akaike weight would be all (i.e., 100%) attributed to the null model predicting a temporal stability with a fixed A value over the period with such a small sample (i.e., lack of statistical power). The first approach is thus recommended. The two regression curves that we have just discussed are found below:
The adjusted D 2 of the least-supported adequate NB2 model (nb2, as it is not possible to our knowledge to extract the "deviance" residuals from a glmmTMB object) of this compromise behind those predictions is:
D2 <- 100 * (1 - m_savi_1119_ext_INT_nb2$deviance / m_savi_1119_ext_INT_nb2$null.deviance)
D2_adj <- 100 - (((124 - 1)/(124 - 5)) * (100 - D2))
D2_adj
[1] 89.50478
Alternativeley, if we perform a Spearman correlation test between the prediction of the INT.NB1, INT.GP and INT.NB2 models and the (unextended) observed data by some manipulations of the two _corr and _ext datasets, here for NB1 as an example:
savi_1119_corr <- transform(savi_1119_corr, age_ext = age_corr, number_ext = number_corr)
fitted <- predict(m_savi_1119_ext_INT_NB1, newdata = savi_1119_corr, type = "response")
cor_fit_obs <- cbind(savi_1119_corr, fitted)
cor.test(cor_fit_obs$number_corr, cor_fit_obs$fitted, method = "spearman", exact = FALSE)
Spearman's rank correlation rho
data: cor_fit_obs$number_corr and cor_fit_obs$fitted
S = 1091.8, p-value = 9.699e-10
alternative hypothesis: true rho is not equal to 0
sample estimates:
rho
0.8331881
The association between the predicted and observed values can be characterized as strong with rho = 0.83 and therefore, we have a pretty good calibration from our already adequate, top-ranking NB1 model. The same holds true for the GP and NB2 models with both nearly identical rho values to NB1 that also round to 0.83.
Altogether, our compromise is explaining about 90% of the observed variation for the extended dataset and the predicted values exhibit a good calibration with the observed values from the unextended dataset. Pretty good. This brings more confidence for our statistical inferences. Note that in this example, the progressive decrease in A over time was quite visually obvious and obtaining a statistical confirmation of such a declining pattern was thus expected. When A varies more erratically from one survey to the next with large associated uncertainty, the proposed approach analyzing all the descending limbs of the different consecutive catch curves in a single step will only provide the most likely global temporal linear trend in the data being analyzed. There might be place for analytical improvement regarding this, but this approach nonetheless provides an idea of how this life-history trait is behaving over time from a linear-only perspective.
References
Allen, M.S., & Hightower, J.E. 2010. Fish population dynamics: mortality, growth, and recruitment, pp. 43-79 in Hubert, W.A., & Quist, M.C., editors. Inland fisheries management in North America. 3rd ed. American Fisheries Society, Bethesda, Maryland, U.S.A. [PDF]
Anderson, J., Olsen, Z., Beeken, N., Weielman, R., & Fisher, M. 2022. Regional variation in growth and mortality of spotted seatrout in the western Gulf of Mexico. North American Journal of Fisheries Management 42: 1381–1397. [Journal]
Babyak, M.A. 2004. What you see may not be what you get: a brief, nontechnical introduction to overfitting in regression-type models. Psychosomatic Medecine 66: 411–421. [PDF]
Bartoń, K. 2023. MuMIn: Multi-Model Inference. R package version 1.47.5. URL: https://CRAN.R-project.org/package=MuMIn
Bradshaw, C.J.A., Mollet, H.F., & Meekan, M.G. 2007. Inferring population trends for the world's largest fish from mark-recapture estimates of survival. Journal of Animal Ecology 76: 480–489. [PDF]
Breusch, T.S., & Pagan, A.R. 1979. A simple test for heteroskedasticity and random coefficient variation. Econometrica 47: 1287–1294. [Journal]
Brooks, M.E., Kristensen, K., van Benthem, K.J., Magnusson, A., Berg, C.W., Nielsen, A., Skaug, H.J., Maechler, M., & Bolker, B.M. 2017. glmmTMB balances speed and flexibility among packages for zero-inflated generalized linear mixed modeling. The R Journal 9: 378–400. [PDF]
Brooks, M.E., Kristensen, K., Darrigo, M.R., Rubim, P., Uriarte, M., Bruna, E., & Bolker, B. M. 2019. Statistical modeling of patterns in annual reproductive rates. Ecology 100: e02706. [PDF]
Burnham, K.P., Anderson, D.R., & Huyvaert, K.P. 2011. AIC model selection and multimodel inference in behavioral ecology: some background, observations, and comparisons. Behavioral Ecology & Sociobiology 65: 23–35. [Journal]
Cameron, A.C., & Trivedi, P.K. 1990. Regression-based tests for overdispersion in the Poisson model. Journal of Econometrics, 46: 347–364. [Journal]
Caza-Allard, I., Mazerolle, M.J., Harris, L.N., Malley, B.K., Tallman, R.F., Fisk, A.T., & Moore, J.-S. 2021. Annual survival probabilities of anadromous Arctic Char remain high and stable despite interannual differences in sea ice melt date. Arctic Science 7: 575–584. [PDF]
Chapman, D.G., & Robson, D.S. 1960. The analysis of a catch curve. Biometrics 16: 354–368. [Journal]
Comtois, D. 2022. summarytools: Tools to quickly and neatly summarize data. R package version 1.0.1. URL: https://CRAN.R-project.org/package=summarytools.
Consul, P.C. 1989. Generalized Poisson distributions: properties and applications. Marcel Dekker, New York, New York, U.S.A.
Dormann, C.F., Calabrese, J.M., Guillera-Arroita, G. Matechou, E., Bahn, V., Bartón, K., Beale, C.M., Ciuti, S., Elith, J., Gerstner, K., Guelat, J., Keil, P., Lahoz-Montfort, J.J., Pollock, L.J., Reineking, B., Roberts, D.R., Schröder, B., Thuiller, W., Warton, D.I., Wintle, B.A., Wood, S.N., Wüest, R.O., & Hartig, F. 2018. Model averaging in ecology: a review of Bayesian, information-theoretic, and tactical approaches for predictive inference. Ecological Monographs 88: 485–504. [Journal]
Dunn, A., Francis, R.I.C.C., & Doonan, I.J. 2002. Comparison of the Chapman-Robson and regression estimators of Z from catch-curve data when non-sampling stochastic error is present. Fisheries Research 59: 149–159. [PDF]
Efron, B. 1986. Double exponential families and their use in the generalized linear regression. Journal of the American Statistical Association 81: 709–721. [Journal]
Gelman, A., & Hill, J. 2007. Data analysis using regression and multilevel/hierarchical models. Cambridge University Press, New York, New York, U.S.A. [Book]
Guisan, A., & Zimmermann, N.E. 2000. Predictive habitat distribution models in ecology. Ecological Modelling 135: 147–186. [PDF]
Hartig, F. 2022. DHARMa: Residual diagnostics for hierarchical (multi-level/mixed) regression models. R package version 0.4.6. URL: https://CRAN.R-project.org/package=DHARMa
Heincke, F. 1913. Investigations on the plaice, general report: 1. Plaice fishery and protective measures. International Council for the Exploration of the Sea, Copenhagen, Danemark.
Hinkle, D.E., Wiersma, W., and Jurs, S.G. 2003. Applied statistics for the behavioral sciences. 5th ed. Houghton Mifflin, Boston, Massachusetts, USA. [Book]
Hilbe, J.M. 2014. Modeling count data. Cambridge University Press, New York, New York, U.S.A. [PDF]
Huang, A. 2017. Mean-parametrized Conway–Maxwell–Poisson regression models for dispersed counts. Statistical Modelling 17: 359–380. [Journal]
Kleiber, C., & Zeileis, A. 2008. Applied econometrics with R. Springer-Verlag, New York, New York, U.S.A.
Lindén, A., & Mäntyniemi, S. 2011. Using the negative binomial distribution to model overdispersion in ecological count data. Ecology 92: 1414–1421. [PDF]
Lüdecke, D., Ben-Shachar, M., Patil, I., Waggoner, P., & Makowski, D. 2021. performance: an R package for assessment, comparison and testing of statistical models. Journal of Open Source Software 6: 3139. [PDF]
Maceina, M.J., & Bettoli, P.W. 1998. Variation in largemouth bass recruitment in four mainstream impoundments of the Tennessee River. North American Journal of Fisheries Management 18: 998–1003. [PDF]
Mainguy, J., & Moral, R.A. 2021. An improved method for the estimation and comparison of mortality rates in fish from catch-curve data. North American Journal of Fisheries Management 41: 1436–1453. [PDF]
Mainguy, J., Bélanger, M., Valiquette, E., Bernatchez, S., L'Italien, L., Millar, R.B., & Moral, R.A. 2024. Estimating fish mortality rates from catch curves: A plea for the abandonment of Ricker (1975)'s linear regression method. Journal of Fish Biology 104: 4–10. [PDF]
Millar, R.B. 2015. A better estimator of mortality rate from age-frequency data. Canadian Journal of Fisheries and Aquatic Sciences 72: 364–375. [PDF]
Miranda, L.E., & Bettoli, P.W. 2007. Mortality, pp. 229–277 in Guy, C.S. & Brown, M.L., editors. Analysis and interpretation of freshwater fisheries data. American Fisheries Society, Bethesda, Maryland, U.S.A. [PDF]
Moral, R.A., Hinde, J., & Demétrio, C.G.B. 2017. Half-normal plots and overdispersed models in R: the hnp package. Journal of Statistical Software 81: 1–23. [PDF]
Murphy, M.D. 1997. Bias in Chapman-Robson and least-squares estimators of mortality rates for steady-state populations. Fishery Bulletin 95: 863–868. [PDF]
Nelson, G.A. 2019. Bias in common catch-curve methods applied to age frequency data from fish surveys. ICES Journal of Marine Science 76: 2090–2101. [PDF]
Nelson, G.A. 2023. fishmethods: fishery science methods and models. R package version 1.12-0. URL: https://CRAN.R-project.org/package=fishmethods
Ogle, D.H. 2016. Introductory fisheries analyses with R. CRC Press, Boca Raton, Florida, U.S.A. [Book]
Ogle, D.H., Doll, J.C., Wheeler, P., & Dinno, A. 2022. FSA: Fisheries Stock Analysis. R package version 0.9.3. URL: https://github.com/fishR-Core-Team/FSA
O’Hara, R.B., & Kotze, D.J. 2010. Do not log-transform count data. Methods in Ecology and Evolution 1: 118–122. [PDF]
Pauly, D. 1984. Fish population dynamics in tropical waters: a manual for use with programmable calculators. International Center for Living Aquatic Resources Management, ICLARM Studies and Reviews 8, Manila. [PDF]
Pierce, J.L., Lauretta, M.V., Rezek, R.J. & Rehage, J.S. 2021. Survival of Florida largemouth bass in a coastal refuge habitat across years of varying drying severity. Transactions of the American Fisheries Society 150: 435–451. [PDF]
Richards, S.A. 2008. Dealing with overdispersed count data in applied ecology. Journal of Applied Ecology 45: 218–227.
Ricker, W.E. 1975. Computation and interpretation of biological statistics of fish populations. Fisheries Research Board of Canada Bulletin 191. [PDF]
Sáez-Castillo, A.J., & Conde-Sánchez, A. 2013. A hyper-Poisson regression model for overdispersed and underdispersed count data. Computational Statistics and Data Analysis 61: 148–157. [PDF]
Schenker, N., & Gentleman, J.F. 2001. On judging the significance of differences by examining the overlap between confidence intervals. The American Statistician 3: 182–186. [Journal]
Sellers, K.F., & Shmueli, G. 2010. A flexible regression model for count data. Annals of Applied Statistics 4:943–961. [PDF]
Shapiro, S.S., & Wilk, M.B. 1965. An analysis of variance test for normality (complete samples). Biometrika 52: 591–611. [PDF]
Smith, M.W., Then, A.Y. , Wor, C., Ralph, G. Pollock, K.H., & Hoenig, J.M. 2012. Recommendations for catch-curve analysis. North American Journal of Fisheries Management 32: 956–967. [PDF]
Slipke, J.W., & Maceina, M.J. 2014. Fishery analysis and modeling simulator (FAMS version 1.64 beta). American Fisheries Society, Bethesda, Maryland, U.S.A. [PDF]
St-Pierre, A.P., Shikon, V., & Schneider, D.C. 2018. Count data in biology – Data transformation or model reformation? Ecology and Evolution 8: 3077–3085. [PDF]
Steel, E.A., Kennedy, M.C., Cunningham, P.G., & Stanovick, J. S. 2013. Applied statistics in ecology: common pitfalls and simple solutions. Ecosphere 4: 115. [PDF]
Venables, W.N., & Ripley, B.D. 2002. Modern applied statistics with S. 4th ed. Springer, New York, New York, U.S.A.
Ver Hoef, J.M., & Boveng, P.L. 2007. Quasi-Poisson vs. negative binomial regression: how should we model overdispersed count data? Ecology 88: 2766–2772. [PDF]
Weisberg, S. 1980. Applied Linear Regression. Wiley, New York, New York, U.S.A.
Wickham, H., Averick, M., Bryan, J., Chang, W., McGowan, L.D., François, R., Grolemund, G., Hayes, A., Henry, L., Hester, J., Kuhn, M., Pedersen, T.L., Miller, E., Bache, S.M., Müller, K., Ooms, J., Robinson, D., Seidel, D.P., Spinu, V., Takahashi, K., Vaughan, D., Wilke, C., Woo, K., & Yutani, H. 2019. Welcome to the Tidyverse. Journal of Open Source Software 4: 1686. [PDF]
Yamamoto, M., & Katayama, S. 2022. Age, growth, mortality, and yield-per-recruit of red spotted grouper Epinephelus akaara in the central Seto Inland Sea, Japan. Journal of Applied Ichthyology 38: 302–310. [PDF]
Zuur, A.F., Ieno, E.N., Walker, N.J., Saveliev, A.A., & Smith. G.M. 2009. Mixed effects models and extensions in ecology with R. Springer, New York, New York, U.S.A. [Book]