Analysis of harvest age proportions

One of the most common sources of data in wildlife and fisheries management is the age and sex composition of harvested species. Often, managers would like to interpret age and sex structure in the harvest as revealing some underlying attributes of the population, either in absolute terms or in relative terms ("indices"). For example, if in a check station survey of 100 deer we find 25 fawns, 50 1-year-olds, and 25 2+ year olds, we might be tempted to conclude that the true age proportions in those 3 classes are 0.25, 0.5, and 0.25, and perhaps get excited if the proportions shift next year to, say, 0.05, 0.5, and 0.45. Resist the urge!

There are some problems with this kind of interpretation of harvest data, not the least of which is the obvious fact that different age, sex, or other components of the population are differentially harvested, so that harvest is anything but a random sample of the population. Managers might still be tempted to use statistics such as age or sex ratios as indices to population structure, but they are making a huge assumption that the relative rates of vulnerability are constant over time, so that the harvest data are always proportional to the population structure (which we can't observe directly). Given that managers are constantly tinkering with age- and sex- (or size-) specific regulations, this is a bold (my polite way of saying "stupid") assumption most of the time. What is called for is a means of empirically estimating the relative rates at which different components are harvested, independent of the harvest sample, and using this to adjust the harvest proportions.

I illustrate this with a very simple example from duck harvest data, focussing on age ratios, typically represented as number of young (HY birds) per adult (AHY birds), and often used as a measure of recruitment rates to the fall population. The main source we have for age ratios in ducks and other harvested gamebirds comes from samples of wings sent in by hunters. The wings are classed by species, sex, and age using plumage characteristics, during annual "wing bees" throughout the country, which are supervised by experts in age and sex (and species) determination. We are going to assume that 1) the hunters aren't making stuff up (e.g., hunting mallards in Oregon and saying they were in Georgia), and 2) the experts and their minions know what they're doing. In other words, we will assume that the wings are a fair representation of what occurred in the harvest.

Suppose we have collected a sample of 1000 mallard wings harvested during a particular year and location, and observe that 500 are from HY birds and 500 from AHY birds. A naive conclusion could be that the proportion of juveniles in the population is 0.5 or equivalently that the age ratio is 1.0 (i.e., 1 HY : 1 AHY). However, suppose that I told you that the harvest rate for HY birds is approximately 2x that of AHY birds on a per-capita basis. You should then expect HY birds to appear in the harvest about twice as often as expected based on their occurrence in the population. A little bit of math shows this: if p is the proportion of HY birds in the population, the expected proportion of HY birds in the harvest is 2p/(2p + (1 - p)), and setting this equal to 0.5 gives p = 1/3. In other words, you would encounter a proportion of 0.5 HY birds in the harvest sample if the true population proportion is 1/3, meaning that the age ratio is more like 1:2 than 1:1.
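The arithmetic in the mallard example can be checked with a few lines of Python. This is only a sketch: the function name `population_proportion` and its arguments are made up for illustration, and the relative vulnerability is treated as known exactly, which it never is in practice.

```python
def population_proportion(harvest_prop, rel_vulnerability):
    """Back out the population proportion of HY birds from the harvest proportion.

    If p is the HY proportion in the population and v is the HY:AHY ratio of
    per-capita harvest rates, the expected HY proportion in the harvest is
        q = v*p / (v*p + (1 - p)).
    Solving for p gives p = q / (q + v*(1 - q)).
    """
    q, v = harvest_prop, rel_vulnerability
    return q / (q + v * (1.0 - q))


# Numbers from the text: half the wings are HY, and HY birds are twice as vulnerable.
p_hy = population_proportion(0.5, 2.0)
age_ratio = p_hy / (1.0 - p_hy)  # HY birds per AHY bird

print(round(p_hy, 3), round(age_ratio, 3))  # 0.333 0.5
```

Setting the relative vulnerability to 1 returns the harvest proportion unchanged, which is the (usually unjustified) assumption behind the naive estimate.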

How do we know what the relative harvest rates are for the different age, sex, or other components of the population? The statistical answer is we don't, but if we have the right data we can estimate these rates, and use the estimates to appropriately correct for differential vulnerability. A standard way to do this (which is in fact how it's done for ducks) is to use band recovery data. Suppose that 1000 AHY and 1000 HY birds are banded before the hunting season, and 250 AHY and 500 HY bands are reported as shot or found dead during the hunting season. Ignoring for a moment the issue of band reporting (which should be equal for AHY and HY birds), we can use simple or direct recovery rates to get estimates of harvest rate for HY birds of hy = 500/1000 = 0.5 and for AHY birds of ha = 250/1000 = 0.25.
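As a small sketch of the band recovery calculation, the direct recovery rates and their binomial standard errors can be computed as follows. The helper name `direct_recovery_rate` is hypothetical, and, as in the text, band reporting is ignored, so these are recovery rates standing in for harvest rates.

```python
import math


def direct_recovery_rate(recovered, banded):
    """Direct recovery rate and its binomial standard error."""
    rate = recovered / banded
    se = math.sqrt(rate * (1.0 - rate) / banded)
    return rate, se


# Banding example from the text: 1000 birds banded in each age class.
h_hy, se_hy = direct_recovery_rate(500, 1000)    # HY birds
h_ahy, se_ahy = direct_recovery_rate(250, 1000)  # AHY birds

# Relative vulnerability of HY to AHY birds.
rel_vulnerability = h_hy / h_ahy

print(h_hy, h_ahy, rel_vulnerability)  # 0.5 0.25 2.0
```

Only the ratio of the two rates matters for correcting age ratios, which is why an equal band reporting rate for both age classes cancels out.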

An ad-hoc approach is to simply adjust or discount the HY harvest (wing) reports by 0.25/0.5 and proceed with the adjusted numbers in computing age ratios. A better approach that provides maximum likelihood estimates and confidence intervals is based on treating both the harvest (e.g., wing) data and the band or other tag recoveries as binomial outcomes and modeling them appropriately. Essentially the recovery data are modeled as binomial outcomes in order to estimate the age-specific harvest rates, and the harvest data are modeled as binomial outcomes with probability based on the harvest rate adjustments. Operationally this is done by maximizing the joint likelihood of these data structures in order to get ML estimates, variances, and confidence intervals. Alternatively, the same model can be implemented using a Bayesian approach and MCMC (e.g., in OpenBUGS).
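Here is a minimal Python sketch of the joint binomial idea for a single year. It is not the attached code. It rests on two assumptions of mine: with one wing sample and one banded sample per age class, the model has as many parameters as binomial outcomes, so the ML estimate of the age ratio has a closed form (which coincides with the ad-hoc adjustment); and the confidence interval is a delta-method (Wald) interval on the log scale rather than anything profile-based. The function name `age_ratio_mle` and its arguments are hypothetical.

```python
import math


def age_ratio_mle(wings_hy, wings_ahy, rec_hy, band_hy, rec_ahy, band_ahy, z=1.96):
    """Closed-form MLE of the HY:AHY age ratio with a log-scale Wald interval.

    Model (assumed): recoveries are Binomial(banded, h) within each age class,
    and HY wings are Binomial(total wings, q) with
        q = A*h_hy / (A*h_hy + h_ahy),
    where A is the population age ratio (HY per AHY).
    """
    h_hy = rec_hy / band_hy      # MLE of HY harvest (recovery) rate
    h_ahy = rec_ahy / band_ahy   # MLE of AHY harvest (recovery) rate
    naive = wings_hy / wings_ahy  # uncorrected age ratio from the wings
    mle = naive * h_ahy / h_hy    # corrected for differential vulnerability

    # Delta-method variance of log(mle): the three data sources are independent.
    var_log = (1.0 / wings_hy + 1.0 / wings_ahy
               + (1.0 - h_hy) / rec_hy + (1.0 - h_ahy) / rec_ahy)
    half_width = z * math.sqrt(var_log)
    return {
        "naive": naive,
        "mle": mle,
        "lower": mle * math.exp(-half_width),
        "upper": mle * math.exp(half_width),
    }


# Wing and band numbers from the text.
result = age_ratio_mle(500, 500, 500, 1000, 250, 1000)
print({name: round(value, 3) for name, value in result.items()})
```

With more structure (for example, vulnerability shared across years, or unequal reporting rates) the closed form goes away and the joint likelihood has to be maximized numerically.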

The attached code assumes a vector of "real" age ratios over 10 years, and simulates harvest and band recovery data given a) a generally higher HY than AHY harvest rate, but b) with the relative vulnerability varying randomly. It then computes ML estimates from these data, and produces a data frame with 1) the actual age ratios, 2) the MLEs, and 3) the naive estimates. I graphed 1) the "real" age ratios (black line), 2) naive estimates based on uncorrected harvest data (red line), and 3) ML estimates with confidence intervals (green lines):

In general, the ML estimates do a good job in tracking the true age ratio, although it was close in year 2, with the lower confidence limit just below the true value (quiz: how often should this happen?). By contrast, the naive estimate was always above the true value, but not by a constant proportion.
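For readers who want to experiment, here is a rough Python sketch of that kind of simulation. It is not the attached code, and every number in it is invented for illustration: the "real" age ratios, the AHY harvest rate of 0.1, the range of relative vulnerability (1.5 to 2.5), and the sample sizes. It uses the single-year closed-form estimate (naive ratio times the ratio of recovery rates) and skips the confidence intervals to stay short.

```python
import random


def rbinom(rng, n, p):
    """Draw one Binomial(n, p) count using only the standard library."""
    return sum(rng.random() < p for _ in range(n))


def simulate(true_ratios, seed=1, n_wings=1000, n_banded=2000, h_ahy=0.1):
    """Simulate wing and band data by year; return (year, true, mle, naive) rows."""
    rng = random.Random(seed)
    rows = []
    for year, ratio in enumerate(true_ratios, start=1):
        # Relative vulnerability varies randomly but HY birds are always more vulnerable.
        v = rng.uniform(1.5, 2.5)
        h_hy = v * h_ahy
        # Probability that a harvested wing comes from an HY bird.
        q = ratio * h_hy / (ratio * h_hy + h_ahy)
        w_hy = rbinom(rng, n_wings, q)
        w_ahy = n_wings - w_hy
        # Band recoveries for each age class.
        r_hy = rbinom(rng, n_banded, h_hy)
        r_ahy = rbinom(rng, n_banded, h_ahy)
        naive = w_hy / w_ahy
        mle = naive * (r_ahy / n_banded) / (r_hy / n_banded)
        rows.append((year, ratio, round(mle, 3), round(naive, 3)))
    return rows


# Hypothetical "real" age ratios (HY per AHY) over 10 years.
true_ratios = [1.0, 0.8, 1.2, 0.9, 1.1, 0.7, 1.3, 1.0, 0.6, 1.4]
rows = simulate(true_ratios)
for row in rows:
    print(row)
```

Changing the seed or widening the vulnerability range shows how the naive estimate drifts with the regulations-driven vulnerability while the corrected estimate stays near the true ratio.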