Solutions to the exercises in the 2nd edition: The solutions for exercises in Chapters 1  18 can be retrieved by clicking here.
For solutions to the exercises in the 1st edition, please click the 1st Edition link in the left margin.
[Related: For the syllabus of the graduate course I teach, click here.]
Additional exercises for the 2nd edition: Please contact the author if you have suggested exercises to add to the list below.
Ch. 
Purpose 
Exercise Statement 
Date, etc. 
2 
Thinking about noise in a posterior predictive check. 
Consider Figure 2.6, p. 29, in DBDA2E. Two of the data points fall above
the vertical bars. Does this mean that the model does not describe the data
well? Briefly explain your answer. 
20150119 
2 
Thinking about prior probabilities in reallocation of credibility in
disease diagnosis. 
Suppose there are two factories of trick coins. One factory creates tailbiased coins, such that the probability of T (an outcome we’ll call “negative”) is 95%. We’ll call this the “healthy” factory. The second factory creates headbiased coins, such that the probability of H (an outcome we’ll call “positive”) is 95%. We’ll call this the “disease” factory. (Just like real diagnostic tests, “positive” suggests having the disease.) Now, suppose you grab a coin at random from a sack of coins. You flip the coin and it comes up “positive.” Our goal is to think about the probability that the coin came from the healthy factory or the disease factory.
A. Suppose that the
sack of coins has 50% healthy coins. (This is your initial, prior allocation of
credibility across the two types of coins.) Given the “positive” outcome of the
single flip, should your reallocation of credibility end up with more
credibility on the healthy factory or the disease factory? Why?
B. Suppose that the
sack of coins has 99% healthy coins. (This is your initial, prior allocation of
credibility across the two types of coins, similar to real life.) Given the “positive”
outcome of the single flip, should your reallocation of credibility end up
with more credibility on the healthy factory or the disease factory? To answer
this question, do the following steps:
* Suppose the sack has
10,000 coins. How many of them would you expect to be healthy? (Hint: Of 10,000
coins, we expect 99% of them to be healthy. Compute the actual number.) How
many would you expect to be diseased?
* If you tested every healthy coin, how many would you expect
to come up positive? (Hint: For the number computed in the previous part, we
expect 5% to come up positive. Compute the actual number.)
* If you tested every diseased coin in the sack, how many
would you expect to come up positive?
* Out of all the coins
in the sack that you would expect to come up positive, what proportion of them
are diseased? (Hint: Compute the total number of coins that you would expect to
come up positive. Of that total, what proportion is from diseased coins? Your
answer should be 16.1%; show your work. By the way, this exercise is similar to
Exercise 5.2 of DBDA2E.)

20150119 
3 
Experience with cata files and indexing in R. 
Invoke RStudio with the DBDA2E program folder as its working directory. Read a csv file into a data frame as follows: datFram = read.csv( "HGN.csv" )
A. Display the first three rows of the data frame using explicit numerical indexing of rows in bracket notation.
B. Display the first three rows of the data frame using the head() function.
C. What is the class (e.g., factor, integer, etc.) of the $group component of the data frame? Use the class() function.
D. Change the $group component to a factor, using the factor() function, and display the result with its levels.
E. Display the Hair component of the data frame by referring to it three different ways: Using $ notation, using brackets with column name, and using brackets with column number.

20150119 
3 
Creating functions in R. 
Create a function in R that converts Fahrenheit temperatures to Celsius temperatures. Set the default input temperature to 72. Hint: C = (F32)*5/9.
A. Show the specification of the function.
B. Run the function with no explicit input, to reveal its default output.
C. Run the function with 98.6 as its input.
D. Run the function with the vector c( 32 , 72 , 98.6 , 212 ) as its input. Very briefly explain why the function produces a vector as output.

20150119 
4  Handson experience approximating probability density from a large sample, as in Figures 4.2 and 4.3.  Consider a normal probability density function that has mean of 10.0 and standard deviation of 0.20.
A. What is the exact probability density at x=9.9? Hint: dnorm(x,m,sd)
B. Generate a random sample of 100,000 values from the normal distribution. Hint: rnorm(N,m,sd)
C. Compute the approximate probability density at x=9.9 by counting the proportion of sampled points that falls between x=9.8 and x=10.0, and then dividing by the width of the bin. Show your code and explain what it does. Is the result approximately the same as Part A? (It should be.) Hint: Name the sample of values heapOdata; try sum( heapOdata >= 9.8 & heapOdata < 10.0 ).
 20150119  6  Experience converting from central tendency and concentration
of beta distribution to a and b shape parameters  A. Suppose a colleague
tells you she read a research report in which “there were about 50 patients and
twothirds of them survived at least one year after surgery.” Suppose you want
to use this information in an informal prior for subsequent data. (You’ll get
the exact numbers later.) What are the corresponding a and b shape parameters?
Hint: Convert from mean and concentration; why use the mean?
B. Suppose you are
given a sixsided die, and you want to find out whether it’s biased by
assessing the probability of rolling a 1dot outcome. You know the die just
came out of a brand new package from a reputable manufacturer, so you believe
that the most probable probability of a 1dot outcome is 1/6. Suppose it would
take 50 rolls to begin to sway you from your prior belief that the die is fair.
What are the corresponding a and b shape parameters? Hint: Convert from mode
and concentration; why use the mode?
 20150126  6  Seeing that various broad priors (all beta distributions) yield similar
posterior distributions for moderately large data.  For all these cases,
suppose the data show 63 heads in 97 flips.
A. Suppose the prior is
beta(thetaa=0.01,b=0.01), which is (approximately) the socalled Haldane
prior. What is mode and 95% HDI of the posterior distribution? Show the result
graphically. Hint: Use BernBeta.R.
B. Suppose the prior is
beta(thetaa=1,b=1), which is uniform. What is mode and 95% HDI of the
posterior distribution? Show the result graphically. Hint: Use BernBeta.R.
C. Suppose the prior is
beta(thetaa=2,b=4), which is gently biased toward tails. What is mode and 95%
HDI of the posterior distribution? Show the result graphically. Hint: Use BernBeta.R.
D. Is there much
difference in the posterior distributions (modes and HDI’s) across the
different priors? Hint: See the purpose of the exercise.
 20150126  7  Another look at autocorrelation, similar to Figure 7.12; and, an R
exercise.  Consider the following R
code:
source("DBDA2Eutilities.R")
openGraph(width=5,height=7.5)
layout( matrix(1:6,nrow=3)
)
par(mar=c(4,4,3,1),mgp=c(2,0.7,0))
maxLag=10
for ( inert in c(0.99,0.80)
) {
x = rep(0,500)
for ( i in 2:length(x) ) {
x[i] = inert*x[i1] + rnorm(1,0,1inert)
}
x = (xmean(x))/sd(x)
#
plot( x , ylab="x" , cex.lab=1.5 ,
type="l" )
#
plot( x[1:(500maxLag)] , x[(maxLag+1):500] ,
xlab=bquote("x[i]") ,
ylab=bquote("x[i+"*.(maxLag)*"]") ,
cex.lab=1.5 , cex.main=1.5 ,
main=bquote(r==.(round(cor(x[1:(500maxLag)],x[(maxLag+1):500]),2))) )
#
acfInfo = acf( x , lag.max=maxLag+2 ,
main="" )
title(
main=bquote("acf["*.(maxLag)*"] =
"*.(round(acfInfo$acf[maxLag+1],2)) ) ,
cex.main=1.5 )
points( maxLag , acfInfo$acf[maxLag+1] ,
cex=2 , lwd=3 )
}
A. Explain, briefly,
what every line does, perhaps by adding comments before or at the end of each
line.
B. Run the code. It
produces a graphical output; include the graph in your writeup.
C. Explain the scatterplots
in the second row, including the relation of the scatterplot to the traces in
the first row. Also explain the number in the title of the scatterplot.
D. Explain the bar
graphs in the third row, including the circle on one of the bars, and the
relation of that bar to the scatterplot in the second row.
E. What is the key
difference between the columns of the graph? (Only a brief answer is wanted
here.)
  8  See that the JAGS answer matches the beta distribution from an analytical solution.  Immediately after Exercise 8.1: Here our goal is to superimpose an analytically correct beta distribution curve on the MCMC histogram produced by JAGS. After the end of the script of Exercise 8.1, run the R code below, with the boxes □ replaced with the appropriate values: openGraph() plotPost( mcmcCoda[,"theta[1]"] , xlim=c(0,1) ) a = □ ; b = □ # constants from prior in JagsYdichXnomSsubjMbernBeta.R H = □ ; T = □ # heads and tails from your data thetaGrid=seq(0,1,length=201) lines( thetaGrid , dbeta( thetaGrid , a+H , b+T ) ) Explain how you determined the a and b values. Explain the meaning of the lines() command, being sure to relate it to Equation 6.8. Include the resulting graph in your writeup. (The superimposed curve should closely match the histogram.)
 20150202  8  Getting familiar with some details of the JAGS program, “under the hood.”  A. Consider the data and script your used for Exercise 8.1. Rerun it two ways: genMCMC( data=myData , numSavedSteps=2000 ) genMCMC( data=myData , numSavedSteps=50000 ) For both, include the diagnostic plot of theta[1] in your writeup. Explain what the numSavedSteps argument does, and describe its effect on the diagnostic plot.
B. In the program JagsYdichXnomSsubjMbernBeta.R, change these lines jagsModel = jags.model( "TEMPmodel.txt" , data=dataList , inits=initsList , n.chains=nChains , n.adapt=adaptSteps ) to this: jagsModel = jags.model( "TEMPmodel.txt" , data=dataList , # inits=initsList , n.chains=nChains , n.adapt=adaptSteps ) Then save the modified program and rerun the highlevel script. What did the change in the program line do to JAGS? That is, what did JAGS do differently? Did the MCMC output change in any noticeable way?
 20150201  8  Preview of the ROPE argument in plotPost(), and prelude of the SavageDickey method for Bayes factors in Section 12.3 and Exercise 12.1.  A. Consider the data and script you used for Exercise 8.1. This time, after running the main script, run these lines: openGraph() plotPost( mcmcCoda[,"theta[1]"] , xlim=c(0,1) , ROPE=c(0.45,0.55) ) Notice the ROPE argument; describe what it does to the graph and what the annotation in the graph means. In particular, how much of the posterior distribution falls inside the ROPE? B. Rerun the script but with the data commented out, so that JAGS samples from the prior distribution (recall Exercise 8.4). Again run the lines of part A. How much of prior distribution falls inside the ROPE? C. What is the ratio of the posterior probability inside the ROPE to the prior probability inside the ROPE? (You can think of this as the shift in credibility of theta values near chance.)  20150202  9  Handson experience with the gamma distribution  Make plots like those
in Figure 9.8, p. 237. You don’t have to make the annotation exactly match the
look of Figure 9.8, but make sure that the shape, rate, mean, sd, and mode show
up somewhere in the figure. Hint: Start with mean or mode and sd, and use
Equations 9.7 or 9.8 (or corresponding R functions) to determine the shape and
rate. Extra special bonus hint: Embellish code like the following
kappa = seq( 0 , 150 ,
length=1001 )
source("DBDA2Eutilities.R")
sr = gammaShRaFromModeSD(
mode=18 , sd=40 )
plot( kappa , dgamma(
kappa , shape=sr$shape , rate=sr$rate ) , type="l" )
 20150209  9  Handson computation of likelihood in a hierarchical model, to see that
shrinkage really does yield higher likelihood  In this exercise,
you’ll use R to compute the likelihood value in Equation 9.10, p. 247.
Consider four coins,
each flipped 4 times, with these outcomes:
{y_{is=1}}=1,0,0,0 {y_{is=2}}=1,1,0,0 {y_{is=3}}=1,1,0,0 {y_{is=4}}=1,1,1,0
A. What are the
proportions of heads for each coin?
B. What is the value of
the likelihood when w=0.5, k=2, q_{1}=0.25, q_{2}=0.50, q_{3}=0.50, and q_{4}=0.75? Show your computations,
perhaps in R. Do these parameter values constitute any shrinkage relative to
the data proportions? Is the shape of the beta distribution flat or peaked?
C. What is the value of
the likelihood when w=0.5, k=20, q_{1}=0.35, q_{2}=0.50, q_{3}=0.50, and q_{4}=0.65? Show your computations,
perhaps in R. Do these parameter values constitute any shrinkage relative to
the data proportions? Is the shape of the beta distribution flat or peaked?
Which set of parameter values, these or the previous part, yield a higher
likelihood value for the data?
 20150209  10  Practice and intuition building with the modelcomparison example in
Section 10.2.1, using different data and priors  Consider the example in
Section 10.2.1, regarding comparison of headbiased and tailbiased factories
as models for a coin. Redo the example with the following data sets, factory
biases, and prior probabilities of the factories. For each case, show the R
commands you used and their results, and show the algebra you used to compute
the Bayes factor and the posterior probability of each factory (in R, if you
like).
A. z=3 , N=9 , w_{1}=0.25 , w_{2}=1w_{1 }, k=12
, p(m=1)=0.5 . State what is different about this example relative to the book
example. How do the results compare with the results in the book example?
(Hint: Your results should be just the “flip” of the results in the book
example.)
B. z=6 , N=9 , w_{1}=0.25 , w_{2}=1w_{1 }, k=120
, p(m=1)=0.5 . State what is different about this example relative to the book
example. How do the results compare with the results in the book example?
C. z=6 , N=9 , w_{1}=0.025 , w_{2}=1w_{1 }, k=12
, p(m=1)=0.5 . State what is different about this example relative to the book
example. How do the results compare with the results in the book example?
D. z=6 , N=9 , w_{1}=0.25 , w_{2}=1w_{1 }, k=12
, p(m=1)=0.05 . State what is different about this example relative to the book
example. How do the results compare with the results in the book example?
 20150209 
In the file list below, click only on the download arrow at the right, not on the file name or version number. 
Updating...
Ċ John K. Kruschke, Feb 16, 2015, 1:22 PM
