Code for "Estimating county-level mortality rates using highly censored data from CDC WONDER"

The Technical Appendix for the paper can be found here.

On this page, I will walk you through the steps to recreate the results in my "Estimating county-level mortality rates using highly censored data from CDC WONDER" paper. This walkthrough assumes you already have R (and, if necessary, WinBUGS) installed on your computer.

Step 1: Download "WONDER.zip" and unzip the file.

Step 2: Open R and change your working directory to the unzipped "WONDER" folder. To do this in R, you can click on the "R Console" window, then click "File/Change dir...", and navigate to the "WONDER" folder.

Step 3: Open and run the file "0_prep_data_&_shapefiles.R" file. This file sets up the mortality data and state/county shapefiles. Strictly speaking, the output from these files is already saved in the WONDER.zip file, but this code is provided to help you adapt the code for your own purposes. Most importantly:

  • The data will ultimately need to be stored as a 2-dimensional array with dimensions (# of regions, # of groups).

  • The code assumes that the ordering of the data is the same as the ordering of the shapefile. The easiest way of achieving this is to order both the data and the regions of the shapefile by their FIPS codes.

Step 4: To fit the MCAR model, open the "1_model_mcar_supp.R" file. At the beginning of this file, there is an option to fit the model for the entire (contiguous) United States or for a single county.

mystate='' #set to '' to replicate analysis from paper

source('Code/Helper_Code/setup_HD.r')

If your goal is to reproduce the results from the paper, then you'll want to keep mystate=''; otherwise, simply set mystate equal to the two-letter abbreviation for your state of choice (e.g., mystate='TX'). It should be noted that running the MCAR model on the entire (contiguous) US takes ~4 hours in R (but only ~20 minutes in WinBUGS) on my laptop.

As written, the code begins by saving the true (in this case unsuppressed) death counts as Y0. The code then constructs the design matrix, X -- if you have covariates of interest, they can be included here.

X=array(dim=c(Ns,1))

X[,1]=1

#popden=apply(n,1,sum)/us$SQUARE_MIL #example covariate

#X[,2]=log(popden) #how to include covariate info

p=dim(X)[2]

Next, the code implements the suppression criteria that would have been in place had these data come from 1989 or later.

dY=array(Y>=10,dim=dim(Y))

Y[!dY]=NA

The code then specifies prior distributions for each of the model parameters -- see Appendix A.2 for details/justifications for these choices.

Step 5: The end of this code file provides three options for how to analyze these data:

Option 1: Fit the MCAR model in WinBUGS

#Option 1: Fit the MCAR model in WinBUGS (<20 Minutes)

#source('Code/Helper_Code/setup_BUGS.R')

After running the "source()" line of code, proceed to the instructions for fitting the model in WinBUGS.

Option 2: Fit the MCAR model in R

#Option 2: Fit the MCAR model in R (currently quite slow)

#source('Code/Helper_Code/z_mcarfit.R')

Option 3: Fit the Poisson-gamma model in R

#Option 3: Fit the Poisson-Gamma model in R (not too bad)

#source('Code/Helper_Code/z_pgfit.R')

To reproduce the results from the paper, you will need to run both Options 2 and 3.

Step 6: To reproduce the results from the paper, open "Code/2_results.R". If both the MCAR and Poisson-gamma models have been fit in R (and thus if their output has been saved in the WONDER folder), this file will produce the following:

    • The correlation and DIC results displayed in Table 1 of the manuscript

    • The age-standardized rate maps

    • The age-specific rate maps