
Lecture 12

For today you should have:

  1. Homework 8.
  2. Chapter 9.

Today:

  1. Bayesian estimation.
  2. Estimators.
  3. Locomotive problem.
  4. Evaluations.

For next time:

  1. Homework 9.
  2. Prepare for a quiz on Chapters 8 and 9.

The Coin Problem

From MacKay, "Information Theory, Inference, and Learning Algorithms," Exercise 3.15 (page 50): 
A statistical statement appeared in "The Guardian" on Friday January 4, 2002: When spun on edge 250 times, a Belgian one-euro coin came up heads 140 times and tails 110. 'It looks very suspicious to me,' said Barry Blight, a statistics lecturer at the London School of Economics. 'If the coin were unbiased, the chance of getting a result as extreme as that would be less than 7%.'
MacKay asks, "But do these data give evidence that the coin is biased rather than fair?"
To answer that question, we will start with coin.py, which computes a Bayesian estimate of the parameter of a (possibly) biased coin.
Download and run it, and let's discuss.

Instead of the two hypotheses we saw in Chapter 7, it uses a suite of hypotheses to represent possible values of the parameter.
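A minimal sketch of the idea (this is an illustration, not the actual coin.py): represent each hypothetical value of the bias p as an entry in a dictionary mapping p to probability, then multiply each entry by the likelihood of the data and renormalize.

```python
def update(suite, data):
    """Bayesian update in place.

    suite: dict mapping a hypothetical bias p to its probability
    data: tuple (heads, tails)
    """
    heads, tails = data
    for p in suite:
        # likelihood of the data given bias p
        suite[p] *= p ** heads * (1 - p) ** tails
    total = sum(suite.values())
    for p in suite:
        suite[p] /= total
    return suite

# uniform prior over 101 hypothetical values of p
suite = {i / 100: 1 / 101 for i in range(101)}
update(suite, (140, 110))

# the maximum-likelihood value, which should land at 140/250 = 0.56
mle = max(suite, key=suite.get)
```

With a uniform prior, the posterior peaks at the observed proportion of heads.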

Exercise: Write a function that takes a posterior distribution (as a Pmf) and computes a credible interval.
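One way to sketch it (treating the Pmf as a plain dict mapping value to probability, which is an assumption about the interface): walk the values in sorted order, accumulate probability, and record where the cumulative total crosses each tail.

```python
def credible_interval(pmf, percentage=90):
    """Return (low, high) spanning the central `percentage` of the Pmf.

    pmf: dict mapping value -> probability (assumed normalized)
    """
    alpha = (1 - percentage / 100) / 2
    total = 0.0
    low = high = None
    for value in sorted(pmf):
        total += pmf[value]
        if low is None and total >= alpha:
            low = value
        if high is None and total >= 1 - alpha:
            high = value
            break
    return low, high
```

Applied to the posterior from coin.py, this gives an interval for the coin's bias p.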

But this does not exactly answer the question as posed by MacKay: does the evidence support the hypothesis that the coin is biased?

Can we formalize the hypotheses "the coin is unbiased" and "the coin is biased" and compute a likelihood ratio?

Take a look at coin2.py.
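Here is one way to formalize the comparison (a sketch under assumptions, not necessarily what coin2.py does): take "fair" to mean p = 0.5, and take "biased" to mean p is uniform on [0, 1], which requires averaging the likelihood over p. For a uniform prior that average has the closed form 1 / (n + 1).

```python
from math import comb

heads, tails = 140, 110
n = heads + tails

# likelihood of the data under "fair": p = 0.5
like_fair = comb(n, heads) * 0.5 ** n

# likelihood under "biased" with p uniform on [0, 1]:
# the integral of C(n,h) p^h (1-p)^(n-h) dp is exactly 1 / (n + 1)
like_biased = 1 / (n + 1)

# likelihood ratio in favor of "biased"
ratio = like_biased / like_fair

# applying the prior from the exercise, P(biased) = 0.1
prior_biased = 0.1
posterior_biased = (prior_biased * like_biased /
                    (prior_biased * like_biased +
                     (1 - prior_biased) * like_fair))
```

Under this formalization the ratio comes out below 1, so the data do not favor "biased"; this is the point of MacKay's question, and the answer is sensitive to how the "biased" hypothesis is formalized.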

Exercise: If you started with the prior P(biased) = 0.1, what is your posterior?

Properties of estimators

The properties covered in the book are mean squared error (MSE) and bias; there are others.

These are long-term properties of using an estimator for many iterations of the estimation game.

For any particular estimate, we don't know the error.  If we did, we would know the answer and wouldn't need the estimator.

Which is better, an MLE or an estimator that minimizes MSE?
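A concrete case of the trade-off (a simulation sketch; the sample size and seed are arbitrary choices): for a normal distribution, the MLE of the variance divides by n and is biased low, while dividing by n-1 is unbiased, yet the biased version has lower MSE.

```python
import random

def variance(xs, ddof):
    """Sample variance, dividing by n - ddof."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - ddof)

random.seed(17)
n, sigma2, iters = 6, 1.0, 20000
err_mle, err_unbiased = [], []
for _ in range(iters):
    xs = [random.gauss(0, 1) for _ in range(n)]
    err_mle.append(variance(xs, 0) - sigma2)       # divide by n (MLE)
    err_unbiased.append(variance(xs, 1) - sigma2)  # divide by n-1

mean_err_mle = sum(err_mle) / iters        # tends toward -sigma2/n
mean_err_unb = sum(err_unbiased) / iters   # tends toward 0
mse_mle = sum(e * e for e in err_mle) / iters
mse_unb = sum(e * e for e in err_unbiased) / iters
```

So "better" depends on which long-term property you care about: the unbiased estimator wins on bias, the MLE wins on MSE here.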

The Locomotive problem

This is adapted from Mosteller, Fifty Challenging Problems in Probability:

"A railroad numbers its locomotives in order 1..N.  One day you see a locomotive with the number 60.  Estimate how many locomotives the railroad has."
  • What is the maximum likelihood estimator?
  • What estimator minimizes mean squared error?  Hint: assume that the estimator is some multiple of the observed number.
  • Can you find an unbiased estimator?
  • For what value of N is 60 the expected value?
  • What is the Bayesian posterior distribution assuming a uniform prior?
Let's look at locomotive.py.

Exercise: generalize locomotive.py for a set of observations (not just one).
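One way to sketch a solution (an illustration, not the actual locomotive.py; the upper bound on the prior is an arbitrary assumption): each observed serial number has likelihood 1/N under hypothesis N, and 0 if it exceeds N.

```python
def locomotive_posterior(observations, upper=1000):
    """Posterior over N given observed serial numbers.

    Assumes a uniform prior over N in 1..upper; the likelihood of each
    observation is 1/N if it is <= N, and 0 otherwise.
    """
    post = {}
    for n in range(1, upper + 1):
        like = 1.0
        for obs in observations:
            like *= 1 / n if obs <= n else 0
        post[n] = like
    total = sum(post.values())
    return {n: p / total for n, p in post.items()}

post = locomotive_posterior([60])
# the posterior mode is the observation itself; the posterior mean is
# much larger (about 333 with this upper bound)
mean = sum(n * p for n, p in post.items())
```

With more observations the likelihood becomes (1/N)^k above the largest observation, so the posterior concentrates much faster.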

Practice quiz

1) The Wechsler Adult Intelligence Scale (WAIS) is meant to be a measure
of intelligence; scores are calibrated so that the mean and standard
deviation in the general population are 100 and 15.

Suppose that you wanted to predict someone's WAIS score based on their
SAT scores.  According to one study, there is a Pearson correlation of
0.72 between total SAT scores and WAIS scores.

If you applied your predictor to a large sample, what would you expect to
be the mean squared error (MSE) of your predictions?

Hint: What is the MSE if you always guess 100?
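One way to work the hint: always guessing the mean gives MSE equal to the variance, and a least-squares linear predictor based on a variable with Pearson correlation rho reduces that MSE by the factor (1 - rho^2).

```python
rho, sigma = 0.72, 15

# always guessing 100: the MSE is just the variance
mse_guess_mean = sigma ** 2                      # 225

# best linear predictor using SAT: residual variance is (1 - rho^2) * var
mse_regression = (1 - rho ** 2) * sigma ** 2     # about 108.4
```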

2) So far the examples we have seen of Bayesian estimation have
  been one-dimensional, but it can be extended to multiple dimensions.
  For example, to estimate the parameters of a normal distribution, we
  can create a suite of hypotheses where each hypothesis is a (mu,
  sigma) pair.

  Write a function that takes evidence and a hypothesis,
  and returns the likelihood of the evidence assuming the hypothesis.
  The evidence is a sequence of values from a normal distribution
  and the hypothesis is a tuple of floats representing values of
  mu and sigma.

  Hint: start with coin.Likelihood()
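A sketch of such a function (the interface here is an assumption modeled on the exercise statement, not necessarily coin.Likelihood's): multiply the normal densities of the observed values.

```python
import math

def likelihood(evidence, hypothesis):
    """Likelihood of the evidence given the hypothesis.

    evidence: sequence of values drawn from a normal distribution
    hypothesis: tuple (mu, sigma)
    """
    mu, sigma = hypothesis
    like = 1.0
    for x in evidence:
        z = (x - mu) / sigma
        # normal probability density at x
        like *= math.exp(-z * z / 2) / (sigma * math.sqrt(2 * math.pi))
    return like
```

Looping this over a suite of (mu, sigma) pairs and renormalizing gives a two-dimensional posterior.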

3) The Poisson distribution is described by the PMF:

  pmf(k) = lambda^k exp(-lambda) / k!

where k is a non-negative integer and lambda is a positive real
parameter.  Suppose I draw a single value, i, from a Poisson distribution
with unknown parameter, and I use the observed value as an estimator;
that is, I choose lambda_hat  = f(i) = i.

a) (Easy) What is the probability of choosing i from a Poisson
distribution with parameter lambda?

b) (Easy) If I choose i and use lambda_hat  = f(i) = i as an estimator, what is the error?

c) (Harder) If I play the estimation game many times, what mean error do I expect?

d) (Easy if you got the last one) Is this estimator unbiased?  Explain in a sentence or two how you know.
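You can also check the answer to (c) and (d) by simulation (a sketch; the parameter value, sample count, and seed are arbitrary choices). The sampler uses Knuth's multiply-uniforms algorithm.

```python
import math
import random

def poisson_sample(lam, rng):
    """Draw one value from a Poisson distribution with parameter lam
    (Knuth's algorithm: multiply uniforms until the product drops
    below exp(-lam))."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1

rng = random.Random(3)
lam = 4.0
# play the estimation game many times with lambda_hat = i
errors = [poisson_sample(lam, rng) - lam for _ in range(50000)]
mean_error = sum(errors) / len(errors)   # tends toward 0
```

Since the mean of a Poisson distribution is lambda, the expected error is 0, which is what "unbiased" means.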

4) For most couples the probability of having a boy is 50%, but
  due to a combination of genetic and environmental factors, 10% of
  couples have a 60% chance of having a boy, and 10% have a 60%
  chance of having a girl.

If a couple has had two girls, what is the probability that their next
child will also be a girl?
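The problem can be set up as a three-hypothesis Bayesian update (a sketch of one solution): weight each type of couple by its prior times the likelihood of two girls, then compute the predictive probability of a girl.

```python
# hypotheses: (prior probability, P(girl)) for each type of couple
couples = [(0.8, 0.5),   # typical couples
           (0.1, 0.4),   # couples with a 60% chance of a boy
           (0.1, 0.6)]   # couples with a 60% chance of a girl

# update on the evidence: two girls
posterior = [(prior * p_girl ** 2, p_girl) for prior, p_girl in couples]
total = sum(w for w, _ in posterior)

# predictive probability that the next child is a girl
p_next_girl = sum(w * p_girl for w, p_girl in posterior) / total
```

The evidence shifts weight toward the girl-prone couples, so the answer comes out slightly above 50%.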

Allen Downey,
Oct 18, 2011, 8:07 AM