There will typically be several steps required to get a set of models going in the MARK interface. The essential steps common to most problems involve (1) reading in data and creating a MARK project, (2) specifying and running a series of models, and (3) summarizing the results and producing model output. We can illustrate these steps with the Dipper example, a problem involving open CMR data and a CJS model structure.
READING IN DATA AND CREATING A MARK PROJECT
In this step, we start with a file in MARK *.inp format that contains the CMR data (see the discussion in Lab 3 about MARK vs. RMark format). We then read the data into the MARK interface, which creates a MARK project and a series of files that by default start with the prefix of the data input file name. In our example the data are in Dipper.inp, so the project files will all start with DIPPER.* (note: MARK is not case sensitive for file names, unlike RMark, so Dipper.inp, dipper.inp, and DIPPER.INP are all treated the same).
First, save the Dipper.inp file to your working directory.
Open the MARK interface and click on the blue box in the upper left corner.
In the Dipper example the data consist of 7 capture occasions with 2 attribute groups (sex= Male, Female), and the model structure is Live (CJS). We specify this information in the input menu, then browse and click to select the Dipper.inp input file (hopefully, located in your working directory), and select "OK" once we are satisfied.
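For orientation, the following Python sketch shows how encounter-history records of this kind could be read and checked against the values entered in the input menu (7 occasions, 2 groups). It assumes the usual MARK layout of one record per line: a capture history of 0s and 1s, one frequency column per group, and a terminating semicolon. The three sample records are made up for illustration and are not taken from Dipper.inp, and the function name parse_inp is hypothetical.

```python
def parse_inp(text, n_occasions, n_groups):
    """Parse MARK-style records into (history, [frequency per group]) pairs."""
    records = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("/*"):
            continue  # skip blank lines and whole-line comments only
        fields = line.rstrip(";").split()
        history = fields[0]
        freqs = [int(f) for f in fields[1:]]
        if len(history) != n_occasions or set(history) - {"0", "1"}:
            raise ValueError(f"bad capture history: {history!r}")
        if len(freqs) != n_groups:
            raise ValueError(f"expected {n_groups} group columns: {line!r}")
        records.append((history, freqs))
    return records


# Made-up records (NOT the real Dipper data): history, males, females
SAMPLE = """\
/* history males females */
1111110 1 0 ;
1000000 0 3;
0010110 2 1 ;
"""

records = parse_inp(SAMPLE, n_occasions=7, n_groups=2)
totals = [sum(freqs[g] for _, freqs in records) for g in range(2)]
print(totals)  # animals per group in the made-up sample
```

A check like this catches the most common setup mistake, which is entering a number of occasions or groups in the input menu that does not match the file.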
If we have followed these steps correctly, we will be back in the MARK menu where we can run models. At this point, we will have loaded the data into a MARK project, so the previous step does not need to be repeated if we leave the MARK session and return later. In that case, we would simply open the MARK project by clicking on the yellow folder icon in the upper left corner.
Note: if we have made an error in the preceding steps, we will need to reload the data (re-initialize the project); there is no way to go back and "edit" the contents of the MARK project.
BUILDING AND RUNNING MODELS
If we have successfully initialized the MARK project, then we will get a MARK screen that displays the PIM for the first parameter type and group (in this case Phi for males).
We can also see the other PIMs by going to PIM/Open Parameter Index Matrix/Select all. For this example we would see
We are now ready to begin building some models.
Method 1 - Pre-defined models from model menu
The simplest method for building and running models is the menu-driven approach for pre-defined models. This is enabled by selecting Run/predefined models from the MARK menu.
From the resulting menu, click on Select Models, which will open up tabs for each of the model types (Phi and p). Indicate 'select all' for each (first Phi and then p). This should result in 16 models being created.
Click "OK" and then "Ok to run". This will run 16 PIM-based models creating all combinations of Phi models and p models. These are kept sorted by a simple notation convention developed by Lebreton et al. (1992) and used in the original Dipper example, where for each parameter type g means group variation, t means time variation, g*t means group*time interaction, and dot (.) means constant, resulting in 4 combinations for Phi X 4 for p =16. Note that these do not include the additive models (g+t etc. ) that require invocation of the design matrix. We'll come back to these later.
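The count of 16 follows directly from crossing the four Phi structures with the four p structures. A minimal Python sketch that enumerates the model names in the notation described above (the list name STRUCTURES is just a label chosen here):

```python
from itertools import product

# The four PIM-based structures: constant, group, time, group*time
STRUCTURES = [".", "g", "t", "g*t"]

# Every Phi structure is paired with every p structure
model_names = [f"Phi({phi}) p({p})" for phi, p in product(STRUCTURES, repeat=2)]

print(len(model_names))
for name in model_names:
    print(name)
```

Additive structures such as g+t do not appear in this list because, as noted above, they require the design matrix.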
Once the models all run (which is pretty quick in this example) we will get a summary table that contains the model name and various diagnostics including AIC. By default the models are listed in ascending order by AIC, with the lowest (best) AIC model on top.
We'll return to this table and to other output in more detail below.
Method 2 - Manipulating Parameter Index Matrices (PIMS) directly
The MARK interface enables users to create standard or customized models by direct manipulation of the PIMs. This can be accomplished either by opening the individual PIM matrices, or via some graphical tools.
Opening and editing PIM matrices
From the main MARK menu, select PIM/Open Parameter Index Matrix. Select Apparent Survival Parameter (Phi) for males and females. You should get something like this, which are the default g*t PIMs:
You can manipulate these PIMs by changing the index entries. For example, setting all the entries to 1 for the males and 2 for the females eliminates time variation. You can do this manually, but it is easier to right click on each window and (for example) select 'constant' to set all parameters equal across occasions. Here are the PIMs that will produce the Phi(g) parameterization:
Likewise we can produce Phi(t) by stipulating time but not group variation:
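To make the three Phi structures concrete, here is a Python sketch that builds the triangular index matrices for 7 occasions (6 survival intervals). It assumes the usual layout in which each row is a release cohort and each column is an interval; the function names pim, show and n_params are hypothetical helpers, not part of MARK.

```python
def pim(n_occasions, first_index, time_varying=True):
    """Upper-triangular PIM: one row per release cohort, one column per interval."""
    n = n_occasions - 1
    rows = []
    for cohort in range(n):
        row = [None] * cohort  # cells before the release occasion are empty
        for t in range(cohort, n):
            row.append(first_index + t if time_varying else first_index)
        rows.append(row)
    return rows


def show(title, matrix):
    print(title)
    for row in matrix:
        print(" ".join("  " if v is None else f"{v:2d}" for v in row))


def n_params(pims):
    """Number of distinct parameter indices across a set of PIMs."""
    return len({v for m in pims.values() for row in m for v in row if v is not None})


# Default Phi(g*t): males use indices 1-6, females 7-12
phi_gt = {"males": pim(7, 1), "females": pim(7, 7)}
# Phi(g): a single index per group (1 for males, 2 for females)
phi_g = {"males": pim(7, 1, False), "females": pim(7, 2, False)}
# Phi(t): both groups share indices 1-6
phi_t = {"males": pim(7, 1), "females": pim(7, 1)}

for label, pims in [("Phi(g*t)", phi_gt), ("Phi(g)", phi_g), ("Phi(t)", phi_t)]:
    for group, matrix in pims.items():
        show(f"{label} {group}", matrix)
    print(label, "distinct Phi parameters:", n_params(pims))
```

Cells that share an index share a parameter, so the number of distinct indices is the number of Phi parameters the model will estimate.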
Once you create a model you can run it by clicking on the green run arrow in one of the PIMs (it doesn't matter which).
Clicking on the yellow arrow saves a model to be run later (but produces no output). In either case, you must assign a name for the model so that you can keep track of the output. I strongly recommend using the convention Phi(g*t) p(.) etc.
PIM Chart tool
The PIM chart is a graphical tool that can be useful for creating and modifying PIMS. The tool is opened from the MARK menu PIM/Parameter Index Chart. It will operate on whatever PIM structure is currently open, initially the All Time (g*t) model.
A given model can be changed by click and drag operations. For instance, Phi for males and females are set equal by dragging the female box over the top of the male box.
This creates the Phi(t) p(g*t) model. We could then go to the p PIMs, right click, and select constant (no time variation).
This is Phi(t) p(g). We could proceed like this, building and running (or saving) a series of models using the green or yellow arrows. Note that, as with the direct PIM editing approach, we would have to assign each model a name in order to be able to keep track of output. Also note that there are 'gaps' in the parameter index numbering, which don't matter: MARK will automatically renumber the indices, eliminating the gaps.
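The renumbering can be pictured with a short Python sketch. This only illustrates the idea of closing the gaps while preserving order; it is not a description of MARK's internal code, and the function name renumber is hypothetical. The index values assume the Phi(t) p(g) model was reached from the default numbering (Phi 1-12, p 13-24).

```python
def renumber(indices):
    """Map the distinct indices, in ascending order, onto 1, 2, 3, ... with no gaps."""
    mapping = {old: new for new, old in enumerate(sorted(set(indices)), start=1)}
    return [mapping[i] for i in indices]


# First rows of the four PIMs for Phi(t) p(g), before renumbering
phi_males = [1, 2, 3, 4, 5, 6]
phi_females = [1, 2, 3, 4, 5, 6]   # set equal to males, so 7-12 are unused
p_males = [13] * 6                 # constant over time
p_females = [19] * 6               # constant over time, so 14-18 and 20-24 are unused

flat = phi_males + phi_females + p_males + p_females
new = renumber(flat)

print(new[0:6], new[6:12])    # Phi rows after renumbering
print(new[12:18], new[18:24]) # p rows after renumbering
```

The structure of the model is unchanged by this step: cells that shared an index before still share one afterwards.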
Method 3 - Modifying PIM models via Design Matrices
As described above, PIMs create the basic model structure. A given PIM structure can also be a departure point for many additional and more general models that are not possible with just PIMs, such as:
Additive group and time effects
Time-specific covariates
Individual covariates
Interactions between group, time and covariate effects
A key point to remember about Design Matrices (DM) is that they basically create special cases of the models specified by the PIMs. Thus, while one can modify the time, group and other effects that occur in the PIM structure by the DM, a DM can never create an effect that doesn't exist in the original PIM. Thus, in general, MARK models will always be a combination of PIMs and a DM, although in some cases the relationship will be very simple (e.g., identity).
We can illustrate with the Dipper example. If we start with the general PIM structure for Phi(g*t)p(g*t) we have 24 parameters (12 Phi, 12 p). With the Dipper project open, go to the MARK Results Browser, find this model, right click on the line in the browser, and click 'Retrieve'. Then go to the MARK menu and select "Design/Full". You will get something like:
Note that this matrix has 24 rows, 1 for each parameter of the PIMs. It also has 24 columns. In general, the columns of the design matrix are going to correspond to the actual parameters being estimated, which in this case is also 24. Focus on the upper left corner (12X12) of the DM; this is the Phi portion, with 12 parameter index rows and 12 parameters (the 'beta' columns).
For those acquainted with generalized linear models, the above way of representing a model should be familiar. It has the form of a linear equation, with a response on the left side and coefficients (the betas) multiplied by effects (1's or 0's above, more general below) on the right side. An added wrinkle is that the response in these models typically involves a transformation. So, in the model above the response is
logit(Phi_group,time) = Beta1+Beta2*Phig1+Beta3*Phit1+Beta4*Phit2+.....Beta12*Phig1*Phit5
and
logit(p_group,time) = Beta13+Beta14*pg1+Beta15*pt1+Beta16*pt2+.....Beta24*pg1*pt5
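where logit is a transformation that maps a probability on the interval (0,1) onto the whole real line, so that the back-transformed estimates are forced to stay on the interval (0,1), e.g.,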
Y=logit(p) = log[p/(1-p)]
The logit transformation is an example of a link function. Link functions are always used in MARK/RMark, although they are sometimes simple; for example, the link function may be the response itself, Y=S. This would be the identity link function; other common link functions are log, sine, and log-log.
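As a small numerical illustration of the logit link, the Python sketch below defines the transformation and its inverse and shows that any value on the linear (beta) scale back-transforms to a value strictly between 0 and 1. The function names are chosen here for illustration.

```python
import math


def logit(p):
    """Link: probability on (0,1) -> value on the real line."""
    return math.log(p / (1.0 - p))


def inv_logit(y):
    """Inverse link: value on the real line -> probability on (0,1)."""
    return 1.0 / (1.0 + math.exp(-y))


# Values on the linear scale, negative or positive, map back into (0,1)
for y in (-5.0, -1.0, 0.0, 1.0, 5.0):
    print(y, round(inv_logit(y), 4))

# The two functions undo each other
print(round(inv_logit(logit(0.3)), 4))
```

This is why estimation can proceed without constraints on the betas while the reported Phi and p still behave like probabilities.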
The actual calculations in MARK are performed using the link function, and these produce estimates of the beta coefficients of the design matrix. MARK also produces estimates of the real parameters, which for the CJS model are Phi and p for each time period and group in the PIM.
Additive model
The above DM provides estimates for the full time- and group-specific model, in which time effects are estimated independently for each group (that is, group and time interact). If we eliminate all of the terms (columns) where group and time interact (g*t, columns 8-12), we have a model that is additive for time and group effects. That is, we have
logit(Phi_group,time) = Beta1+Beta2*Phig1+Beta3*Phit1+Beta4*Phit2+....Beta7*Phit5
We can perform a similar simplification for the p parameters. The overall result is a model that we can term Phi(g+t)p(g+t) in contrast to Phi(g*t) p(g*t).
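The relationship between the two Phi design matrices can be sketched in Python. The coding below follows the equations above (an intercept, one group column Phig1, five time columns Phit1 to Phit5, and five group*time columns), which implies that the second group and the last interval act as reference levels; that choice of reference levels, and the function name phi_design_matrix, are assumptions made for this sketch rather than a statement of what MARK's template shows.

```python
def phi_design_matrix(n_intervals=6, interaction=True):
    """Rows: one per (group, interval) Phi parameter. Columns: beta coefficients."""
    rows = []
    for group in (1, 2):                        # 1 = males, 2 = females (reference)
        for time in range(1, n_intervals + 1):  # last interval is the reference
            g1 = 1 if group == 1 else 0
            t = [1 if time == k else 0 for k in range(1, n_intervals)]
            row = [1, g1] + t                   # intercept, group, time dummies
            if interaction:
                row += [g1 * tk for tk in t]    # group*time columns (8-12)
            rows.append(row)
    return rows


full = phi_design_matrix(interaction=True)
additive = phi_design_matrix(interaction=False)

print(len(full), "rows x", len(full[0]), "columns  (Phi(g*t))")
print(len(additive), "rows x", len(additive[0]), "columns  (Phi(g+t))")

# The additive matrix is the full matrix with columns 8-12 dropped
print([row[:7] for row in full] == additive)
```

Both matrices have 12 rows because the PIM still defines 12 Phi parameters; only the number of betas being estimated changes, from 12 to 7.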
Covariates
Time- and individual-specific covariates follow naturally from the DM structure. Suppose we want to model survival as an additive response to group and temperature, measured at each occasion. The model equation is now
logit(Phi_group,time) = Beta1+Beta2*Phig1+Beta3*Temp
and the design matrix is
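The layout of such a matrix can be sketched in Python. The temperature values and beta coefficients below are entirely hypothetical placeholders (the source does not give any), with one temperature assumed per survival interval; the point is only to show the three columns (intercept, group, Temp) and how each row yields a survival probability through the logit link.

```python
import math

# HYPOTHETICAL temperatures, one per survival interval (not real data)
TEMPS = [10.0, 12.5, 9.0, 11.0, 13.5, 8.0]
# HYPOTHETICAL coefficients Beta1 (intercept), Beta2 (group), Beta3 (Temp)
BETAS = [0.2, 0.1, -0.05]

# One row per (group, interval): [intercept, Phig1, Temp]
design = [[1, 1 if group == 1 else 0, temp]
          for group in (1, 2)
          for temp in TEMPS]


def inv_logit(y):
    return 1.0 / (1.0 + math.exp(-y))


# Linear predictor for each row, back-transformed to the probability scale
phi = [inv_logit(sum(b * x for b, x in zip(BETAS, row))) for row in design]

for row, value in zip(design, phi):
    print(row, round(value, 3))
```

The 12 rows still correspond to the 12 Phi parameters of the PIM, but only 3 betas are estimated.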
We will return to covariate modeling in a later exercise.
Design matrices can also be built from simpler PIM structures. For example, consider the model Phi(t)p(t). Go to the Results Browser and retrieve this model (right click / Retrieve). Check the PIMs, and notice that there are 12 parameters (6 time-specific Phi and 6 p). Now go to the main menu and select 'Design' / 'Reduced', keeping the default number of columns (12). Note: "Full" is only an option for the fully interactive PIM model Phi(g*t) p(g*t). You will get a template that has 12 rows and 12 columns, which can then be filled in for the appropriate model, but only a model involving time (not group) effects:
For comparison, select the Phi(g) p(g) model. Now you have 4 parameters, so the most complex design matrix would have 4 rows and 4 columns and would have an intercept and group effects for both Phi and p:
Another model could be created that eliminates Phi group effects, and so has 3 parameters:
However, we cannot add back in time effects, because the PIM that we started with does not contain time.
Finally, we will see an 'exception' to the rule that the DM cannot be more complex than the PIM later on, when we consider individual covariates.
Once we have built a model in a DM we can run (save) the model using the green (yellow) arrow icons at the top of the DM specification window. As with custom built models from PIMs, we need to assign the model a descriptive name that we will be able to recognize in output.
SUMMARIZING RESULTS AND PRODUCING OUTPUT
I created the above 2 design matrix models, Phi(g) p(g) and Phi(.)p(g), and added them to the 16 models created by the PIM approach. First, we can view the Results Browser of models and, if we wish, output this to a text table or spreadsheet; Output/Table of Model Results/Editor produces a temporary text file, which I saved here. There are now 18 models, but notice that some of these appear identical (same likelihood, AIC value, etc.), which should not be surprising: they are the same model but formed under 2 different transformations of the parameters.
There are many types of output that can be viewed and saved in MARK, but for now we will focus on model-specific results (real parameters, betas) and model-averaged real parameter estimates. The model-specific results are obtained by going to a specific model in the Results Browser, right clicking and selecting 'Real Estimates' or 'Beta Estimates', which are then opened as temporary text files in the Notepad editor. For example, here are the real and beta estimates for the Phi(t)p(.) PIM model. Again, the real parameter estimates are the estimates of group- and time-specific Phi and p, and the beta estimates are the estimates of the underlying design matrix; in the case of PIM models these are by default on the sine scale. Here are the real and beta estimates for the Phi(g)p(g) DM model; notice that the beta estimates are on the logit scale, which is the default link function for DM models. Finally, if you compare the real parameter estimates for equivalent PIM and DM models, you should see very little if any difference; again, this is due to the fact that the models only differ in the link function (of course, the betas will be different).
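The point that equivalent models give the same real estimates but different betas can be illustrated with a short Python sketch. It assumes the sine link takes the form (sin(beta) + 1) / 2, which is our understanding of how MARK defines it, and it uses a hypothetical survival value rather than an actual Dipper estimate.

```python
import math


def inv_logit(beta):
    return 1.0 / (1.0 + math.exp(-beta))


def inv_sine(beta):
    # Assumed form of the sine link's back-transformation
    return (math.sin(beta) + 1.0) / 2.0


phi = 0.56  # HYPOTHETICAL real parameter value, for illustration only

# The beta that corresponds to this Phi under each link
beta_logit = math.log(phi / (1.0 - phi))
beta_sine = math.asin(2.0 * phi - 1.0)

print("logit scale:", round(beta_logit, 4), "->", round(inv_logit(beta_logit), 4))
print("sine scale: ", round(beta_sine, 4), "->", round(inv_sine(beta_sine), 4))
```

The two betas are different numbers, yet both back-transform to the same Phi, which is the pattern to look for when comparing the PIM and DM versions of a model.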
MARK also has the capacity to compute and output model-averaged estimates of real (and where they exist, derived) parameters. For the Dipper example, go to the main menu: Output/Model Averaging/Real. You will see windows for all the real parameter types; click on 'select top row' for each panel (indicating that this is a time- but not cohort-specific problem). When completed, check the box "Estimates into an Excel spreadsheet". I did this and the spreadsheet is here (you will also get a bunch of detailed output for each parameter, including its values for every model). We will get into model averaging in more depth in Lab 6.
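As a preview of what model averaging does, the Python sketch below computes Akaike weights from a set of AIC values and uses them to average one real parameter across models. The AIC values and estimates are hypothetical placeholders, not Dipper results, and the sketch covers only the point estimate (the standard errors that MARK also reports are not reproduced here).

```python
import math


def akaike_weights(aic_values):
    """Weight of each model: exp(-delta/2), normalized to sum to 1."""
    best = min(aic_values)
    rel = [math.exp(-(a - best) / 2.0) for a in aic_values]
    total = sum(rel)
    return [r / total for r in rel]


# HYPOTHETICAL (AIC, estimate of Phi for one group and interval) per model
models = {
    "Phi(.) p(.)": (670.0, 0.56),
    "Phi(g) p(.)": (672.0, 0.57),
    "Phi(t) p(.)": (674.0, 0.60),
}

aics = [aic for aic, _ in models.values()]
estimates = [est for _, est in models.values()]
weights = akaike_weights(aics)

# Model-averaged estimate: weighted sum of the model-specific estimates
averaged = sum(w * e for w, e in zip(weights, estimates))

for name, w in zip(models, weights):
    print(name, round(w, 3))
print("model-averaged Phi:", round(averaged, 4))
```

Because each model contributes in proportion to its weight, entering the same model twice would count its support twice.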
A couple of quick comments. First, in practice you would not want to include 2 identical models in the list of models over which you are doing model averaging. These are the same thing and do not belong in your model set twice. More important, you will see there is a lot of information here and not all of it is very useful (do we really need to look at all those year- and time-specific estimates?). We are going to devote more time later to sensible ways to summarize and display what can become overwhelming amounts of "information".
Next: Creating MARK models using the scripting approach in RMark