P&S L2 C1 S4

S4: Failure of IID Assumptions

It is important to recognize situations where the Bernoulli model can be expected to work well, and distinguish them from real world situations which cannot be modelled well by Bernoulli sequences. If the assumptions of the model provide a good match to the real world circumstances, the model can be expected to work well for many purposes. If the assumptions do not match well, the model may not be so useful. HOWEVER, this is ALWAYS a practical matter, since models are NEVER a perfect representation of reality. So it is always a question of IS THE MODEL A GOOD ENOUGH APPROXIMATION FOR THE PURPOSE AT HAND? and it is NEVER a question of whether the model is a perfect match for all aspects of the reality being modelled.

Sampling without replacement. Suppose we have a population of 10 students and we make a random draw. Then we take the randomly drawn student out of the pool, so we have 9 students left. Then we make another simple random draw. This procedure is called sampling without replacement.

In this case, the second draw DOES NOT have the SAME distribution as the first once the first result is known. After the first person is TAKEN OUT, the probability of drawing each of the remaining 9 people CHANGES to 1/9, which is different from 1/10, and the person already drawn has probability 0. We also do not have INDEPENDENCE, because what happens on the second draw DEPENDS on what happened in the FIRST DRAW. So this is NOT a real-world situation which is modelled well by a sequence of IID Bernoulli random draws.
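The change from 1/10 to 1/9 can be checked by listing every equally likely outcome of two draws without replacement. The following is a minimal sketch, not part of the original notes: the student labels 0 to 9 and the helper name `prob` are assumptions made only for illustration.

```python
from fractions import Fraction
from itertools import permutations

# Hypothetical labels 0..9 standing in for the 10 students.
students = list(range(10))

# Every ordered pair (first draw, second draw) with no repeats: 10 * 9 = 90 pairs,
# all equally likely under simple random sampling without replacement.
outcomes = list(permutations(students, 2))

def prob(event):
    """Exact probability of an event, counted over the equally likely outcomes."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

# Probability that student 0 is drawn first.
p_first_0 = prob(lambda o: o[0] == 0)

# Conditional probability that student 3 is drawn second, given student 0 came first.
p_second_3_given_first_0 = prob(lambda o: o[0] == 0 and o[1] == 3) / p_first_0

# Conditional probability that student 0 is drawn again, given student 0 came first.
p_second_0_given_first_0 = prob(lambda o: o[0] == 0 and o[1] == 0) / p_first_0

print(p_first_0)                 # 1/10
print(p_second_3_given_first_0)  # 1/9
print(p_second_0_given_first_0)  # 0
```

Knowing the first draw changes the probabilities for the second draw, which is exactly the dependence described above.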

Sampling with replacement. This is the standard theoretical model for sampling. We draw one person at random. THEN we put the person BACK into the population. THEN we make the second draw. This means that the person selected first has a chance to be chosen AGAIN. In sampling without replacement, the person is taken out, and CANNOT be chosen again.
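For sampling with replacement, the same kind of enumeration shows that every draw keeps the probability 1/10 for each person, and that the draws are independent. This is only a sketch under assumed names: the labels 0 to 9 and the helper `prob` are illustrative and do not come from the notes.

```python
from fractions import Fraction
from itertools import product

# Hypothetical labels 0..9 standing in for the 10 students.
students = list(range(10))

# Every ordered pair (first draw, second draw), repeats allowed: 10 * 10 = 100 pairs,
# all equally likely under sampling with replacement.
outcomes = list(product(students, repeat=2))

def prob(event):
    """Exact probability of an event, counted over the equally likely outcomes."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

# Identical distributions: student 0 has the same chance on each draw.
p_first_0 = prob(lambda o: o[0] == 0)
p_second_0 = prob(lambda o: o[1] == 0)

# The same person CAN be chosen twice.
p_both_0 = prob(lambda o: o == (0, 0))

# Independence: P(first = a and second = b) = P(first = a) * P(second = b) for all a, b.
independent = all(
    prob(lambda o, a=a, b=b: o == (a, b))
    == prob(lambda o, a=a: o[0] == a) * prob(lambda o, b=b: o[1] == b)
    for a in students
    for b in students
)

print(p_first_0, p_second_0)  # 1/10 1/10
print(p_both_0)               # 1/100
print(independent)            # True
```

Here the joint probability of any pair factors into the product of the two single-draw probabilities, which is what the IID model requires.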

THOUGHT EXERCISE: Consider the example from the previous slide, where all patients are given a drug and each patient has a probability p of being cured. This allows us to model the situation as IID Bernoulli random variables. EXPLAIN why this MAY NOT BE A GOOD MODEL. Explain how the distributions may not be identical, and also explain why the trials may not be independent.
