You have already seen some applications of Bayesian modeling to generalized linear models and other problems that have corresponding likelihood solutions. We are now going to step back to some very basic models based on the binomial distribution and apply them to problems we are already familiar with, such as occupancy estimation and the estimation of survival and abundance from capture-mark-recapture (CMR) data.
We are going to focus on the binomial distribution, especially its single-trial version, the Bernoulli distribution, because many of our estimation problems can be readily decomposed into recognizable Bernoulli trials (we leave aside for now problems in which the data are counts rather than binary outcomes; those we will model with other distributions, such as the Poisson). That is, regardless of how complex the problem appears, it can often be decomposed into two generic pieces:
A process component that describes how the true state of the system behaves, whether we can observe it or not. For example, is a site occupied or not? Is an animal present or absent in the area subject to sampling?
A sampling or detection component that describes the probability that we observe the state (occupied, present, etc.) given that it occurs.
Mathematically, we can express the probability of the data (that is, what we observe) as the product of two probabilities:
(Probability that the observable state occurs [e.g., that the site is occupied]) × (Probability that detection occurs, given that the state occurs)
or
P(x=1)=P(z=1) P(x=1|z=1)
where z is the true state (e.g., z=1 means the site is occupied, z=0 means it is not) and x is the detection (x=1 detected, x=0 not detected). For example, this means that if a site is not occupied (z=0) there is no chance of a detection, whereas if it is occupied (z=1) it may still be missed if P(x=1|z=1)<1.
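So, to pick illustrative numbers, if the probability of occupancy is 0.6 and the probability of detection given occupancy is 0.5, the probability of recording a detection at a randomly chosen site is 0.6 × 0.5 = 0.3. As a preview of how directly this decomposition translates into code, here is a minimal sketch of the state/detection pair in JAGS/BUGS syntax for a single survey per site; the names psi, p, z, x, and nSites are illustrative, and the uniform priors are just one reasonable choice, not a prescription:

model {
  psi ~ dunif(0, 1)          # prior on P(z = 1), the probability the state occurs (illustrative choice)
  p ~ dunif(0, 1)            # prior on P(x = 1 | z = 1), detection given the state occurs
  for (i in 1:nSites) {
    z[i] ~ dbern(psi)        # process: true (possibly unobserved) state at site i
    x[i] ~ dbern(z[i] * p)   # observation: detection is impossible when z[i] = 0
  }
}

The point to notice is that the process model (z) and the observation model (x given z) appear as two separate, conditionally linked statements, which is exactly the modularity referred to below.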
There are several advantages of building models this way, even if we have a perfectly good likelihood solution already worked out.
It is a very clear way to see, in simple terms, just what assumptions are being made about the system and the data, and to keep issues about the process separate from issues about sampling.
It is the natural way to build hierarchical models, because the components piece together in a modular, conditionally independent way.
It can be far more efficient, both in terms of code and computation, to decompose problems into small pieces and link them together than to try to solve a complex likelihood as a single piece.
It is the only practical way (that I know of) to build anything but the simplest models for CMR and other data that involve random effects and hierarchical structure.
We will start with some simple binomial models that involve only a single, directly observable outcome (so, no separate state and detection processes) to illustrate some of the mechanics of Bayesian estimation using OpenBUGS and JAGS. We will then move on to examples where we apply Bayesian approaches to occupancy, abundance, and survival estimation, with comparisons to likelihood approaches where available. Before we're done we should have covered both models for which you'd be happy to stick with the ML approach in RMark, unmarked, etc., and models where you are probably going to want to just suck it up and go Bayes.
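To give a flavor of the mechanics, here is a minimal sketch of the kind of single-outcome binomial model we will start with, written as a JAGS model and run from R via the rjags package. The data values, burn-in length, and number of iterations are made up purely for illustration:

library(rjags)

# Made-up example data: y successes observed out of N independent trials (illustrative only)
dat <- list(y = 7, N = 20)

# A single binomial outcome with a uniform prior on the success probability
model_string <- "
model {
  p ~ dunif(0, 1)   # prior on the success probability
  y ~ dbin(p, N)    # binomial likelihood for the observed count
}
"

jm <- jags.model(textConnection(model_string), data = dat, n.chains = 3)
update(jm, n.iter = 1000)                          # burn-in
post <- coda.samples(jm, variable.names = "p", n.iter = 5000)
summary(post)

Running the same model in OpenBUGS looks nearly identical, which is part of the appeal: the model statement is little more than a transcription of the distributional assumptions.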
Next: Bayesian occupancy