L2 C5 S3

S3: Probability and Observed Frequency

The most important distinction between pre-experimental and post-experimental concepts is the difference between the probability, which exists in the pre-experimental world, and the observed frequency, which exists in the post-experimental world.

PRE-EXPERIMENTAL

Suppose we make random draws R1, R2, ..., Rn from a population of 10 students. All are EQUALLY LIKELY to be chosen: on each draw, every student has probability 1/10 of being chosen.

THIS PROBABILITY IS NOT OBSERVABLE -- it describes the potential of what MIGHT happen but has not happened, and that potential is hidden. This is why we use MODELS, where we can make such idealized assumptions. In the real world, it is too complicated to figure out all the possibilities of what might or might not happen.

A random draw should be thought of as a PROCEDURE, a MECHANISM, or an EXPERIMENT for making a choice -- the mechanism should be such as to create equal choices for all.
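As a minimal sketch of such a mechanism, the Python snippet below models the population as students labeled 1 to 10 and uses the standard library's random.choice as the draw. The labels, the seed, and the names probability and random_draw are assumptions made for illustration only; they are not part of the notes above.

```python
import random
from fractions import Fraction

# Hypothetical population: 10 students labeled 1..10.
students = list(range(1, 11))

# Pre-experimental: the MODEL assigns every student the same probability.
# This number is assumed, not observed.
probability = {s: Fraction(1, len(students)) for s in students}


def random_draw(rng):
    """Mechanism that gives every student the same chance of selection."""
    return rng.choice(students)


rng = random.Random(0)       # arbitrary seed, only to make the run repeatable
outcome = random_draw(rng)   # post-experimental: one particular student

print("probability of student 1:", probability[1])
print("student actually drawn:", outcome)
```

The dictionary probability belongs to the pre-experimental world, while outcome is a single observed result of running the mechanism once.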

POST-EXPERIMENTAL

After we draw a simple random sample, we OBSERVE the choices that were made. These choices are the OUTCOMES of the random draws; they are NO LONGER RANDOM. The outcomes could be written as R1, R2, ..., Rn -- these outcomes are the particular students that were selected by the random draws.

Suppose that there are 5 draws, each selecting a different student. Then 5 students will NOT be in this simple random sample. For each of these 5 students, the OBSERVED FREQUENCY -- the proportion of the draws in the sample that selected that student -- WILL BE ZERO, even though the probability of being chosen on any one draw is 10%. Understanding the difference between the post-experimental Observed Frequency and the pre-experimental Probability is CRUCIAL and CENTRAL.
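The 5-out-of-10 example can be checked with a short sketch. It assumes the sample is drawn without replacement (random.sample) and that the observed frequency of a student is the share of the draws that selected that student; the student labels and the seed are hypothetical.

```python
import random

# Hypothetical population: 10 students labeled 1..10.
students = list(range(1, 11))

rng = random.Random(1)            # arbitrary seed for repeatability
sample = rng.sample(students, 5)  # 5 draws WITHOUT replacement (assumption)

# Post-experimental: share of the 5 draws that selected each student.
observed_frequency = {s: sample.count(s) / len(sample) for s in students}

not_chosen = [s for s in students if observed_frequency[s] == 0]

print("sample:", sample)
print("students with observed frequency 0:", not_chosen)
```

Whatever the seed, exactly 5 students end up with observed frequency 0 and the other 5 with 20%, while the pre-experimental probability stays at 10% for all.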

The Law of Large Numbers says that in large samples, the observed frequencies will be approximately equal to the probabilities, so that there will be a rough match between the pre-experimental and post-experimental concepts. However, in small samples the two can be very different. For example, in a sample of size 1, the observed frequency for one individual will be 100% and for all others it will be 0%. This is very different from the equal 10% for all in the pre-experimental world.
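A small simulation can illustrate this. The sketch assumes independent draws WITH replacement (random.choices), so that the sample can grow beyond the 10 students; the sample sizes, the seed, and the helper max_deviation are illustrative choices, not part of the notes.

```python
import random
from collections import Counter

# Hypothetical population: 10 students labeled 1..10.
students = list(range(1, 11))
probability = 1 / len(students)  # pre-experimental: 10% for everyone


def max_deviation(n, rng):
    """Largest gap between any student's observed frequency and 10%."""
    draws = rng.choices(students, k=n)  # n independent draws, with replacement
    counts = Counter(draws)             # students never drawn count as 0
    return max(abs(counts[s] / n - probability) for s in students)


rng = random.Random(2)  # arbitrary seed for repeatability
deviations = {n: max_deviation(n, rng) for n in (1, 10, 100, 10_000, 100_000)}

for n, d in deviations.items():
    print(f"n = {n:>7}: largest gap from 10% is {d:.4f}")
```

With n = 1 the gap is always 0.9 (one student at 100%, the rest at 0%). As n grows, the gap typically shrinks toward zero, which is the rough match the Law of Large Numbers describes.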