S5 Convergence of Observed Proportions to Probabilities

Consider a sequence of Bernoulli Random Variables X(1), X(2), ..., X(N) which are IID Bi(1,p). Note that Bi(1,p) is the same as a Bernoulli distribution. Let S be the SUM of these Bernoulli Random Variables. Then S is Bi(N,p). The probability that any one of the X(i) equals 1 is p -- this is the true underlying probability which governs the generation of 1's and 0's for this sequence. However, this probability is NOT OBSERVED and is NOT OBSERVABLE. This is because the probability exists in the PRE-EXPERIMENTAL world, where we are conducting an experiment which has two possible outcomes. The probability exists as an unrealized potential -- the possibility of occurrence of 1 or 0. AFTER the experiment, one of these two outcomes comes into existence, and probabilities and possibilities are extinguished. We only see a 1 or a 0 -- the possibility of the other outcome no longer exists. POST EXPERIMENTALLY, it is no longer true that P(X=1)=p -- if a 1 has occurred then the probability has become 100%, while if a 0 has occurred then the probability of 1 has become 0%. So HOW does probability make itself apparent or visible in the real world? It does so by forcing the OBSERVED proportion of 1's to be close to the theoretical probability of 1's. We FORMULATE the mathematical theorem below, and confirm it numerically.
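To see this forcing at work, here is a minimal simulation sketch in Python. It is only an illustration: the value p=30%, the fixed seed, and the function name observed_proportion are our own assumptions, not part of the theorem. It draws N Bernoulli outcomes and reports the observed proportion of 1's for increasing N.

```python
import random


def observed_proportion(n, p, rng):
    """Draw n Bernoulli(p) outcomes and return the observed proportion of 1's."""
    # Each draw is 1 with probability p and 0 otherwise; s counts the 1's.
    s = sum(1 for _ in range(n) if rng.random() < p)
    return s / n


if __name__ == "__main__":
    rng = random.Random(12345)  # arbitrary fixed seed so that runs are repeatable
    p = 0.3                     # illustrative "true" probability (an assumption)
    for n in (10, 100, 1000, 10000, 100000):
        print(n, observed_proportion(n, p, rng))
```

In the program we know p because we chose it. In a real experiment only the printed proportions would be visible, and p would have to be inferred from them.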

Let S be the sum of the Bernoulli's as above. Then S is the number of 1's which were observed POST EXPERIMENTALLY after we observed the outcomes of all the Bernoulli Random Variables. S/N is the observed frequency of 1's -- the proportion of 1's in the total sequence of N outcomes. The law of large numbers says that for large N, S/N converges to p. This law can be given a more precise expression as follows. Let s,t be two probabilities such that s<p<t. So s is a probability which is smaller than p and t is a probability which is larger than p. Consider the probability that S/N > t. This is the probability of a deviation of S/N from the true probability p -- the deviation is of size more than t-p>0. We can show that the probability of this deviation goes to zero. Let J be the smallest integer strictly greater than tN, so that J/N is a proportion which is LARGER than t, and hence larger than p.

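P(S/N>t)= P(S>Nt)= P(S=J)+P(S=J+1)+P(S=J+2)+...+P(S=N-1)+P(S=N) = 1 - BINOMDIST(J-1,N,p,TRUE)

Here the final argument TRUE asks EXCEL for the cumulative probability, so BINOMDIST(J-1,N,p,TRUE) is P(S<=J-1).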

This sum goes to 0 as N goes to infinity. This can be proven mathematically. We will confirm it numerically using EXCEL in the exercise below.
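The same tail probability can be computed outside EXCEL. The Python sketch below evaluates the sum P(S=J)+...+P(S=N) directly, taking J as the smallest integer strictly greater than tN, which is what the event S>Nt requires. The helper names binom_pmf and upper_tail and the values p=30%, t=40% are illustrative assumptions. The threshold t is passed as an exact fraction so that J is not disturbed by floating-point rounding of tN.

```python
import math
from fractions import Fraction


def binom_pmf(k, n, p):
    """P(S = k) for S ~ Bi(n, p) with 0 < p < 1, computed in log space."""
    # Working with logarithms avoids overflow in the binomial coefficient for large n.
    log_pmf = (
        math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
        + k * math.log(p) + (n - k) * math.log(1.0 - p)
    )
    return math.exp(log_pmf)


def upper_tail(n, p, t):
    """P(S/N > t) = P(S = J) + ... + P(S = N), with J the smallest integer > t*n."""
    j = math.floor(t * n) + 1
    return sum(binom_pmf(k, n, p) for k in range(j, n + 1))


if __name__ == "__main__":
    p = 0.3             # illustrative true probability (an assumption)
    t = Fraction(2, 5)  # threshold t = 40%, larger than p
    for n in (10, 100, 1000, 10000):
        print(n, upper_tail(n, p, t))
```

Running this prints one tail probability per value of N, which can be compared with 1 - BINOMDIST(J-1,N,p,TRUE) for the same N.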

What this means is that for ANY proportion t bigger than p, as N grows larger P(S/N>t) goes to 0 and P(S/N<t) goes to 1. The proportion S/N is smaller than t with probability close to 1 and S/N is bigger than t with probability close to 0.

The same holds on the other side, with a proportion s less than p. If s<p, let K be the largest integer strictly less than sN, so that K/N is a proportion which is smaller than s. Then P(S/N<s) goes to 0:

P(S/N<s)= P(S<Ns)= P(S=0)+P(S=1)+P(S=2)+...+P(S=K-1)+P(S=K) = BINOMDIST(K,N,p,TRUE)

This means that for ANY proportion s smaller than p, as N grows larger P(S/N<s) goes to 0 and P(S/N>s) goes to 1.
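The lower tail can be checked in the same way. The following Python sketch mirrors what BINOMDIST returns in cumulative mode, namely P(S<=K), with the sum starting at S=0. The function names binomdist_cumulative and lower_tail and the values p=30%, s=20% are our own illustrative assumptions. As before, s is given as an exact fraction so that K is computed without rounding error.

```python
import math
from fractions import Fraction


def binomdist_cumulative(k, n, p):
    """P(S <= k) for S ~ Bi(n, p) with 0 < p < 1 (cumulative binomial probability)."""
    total = 0.0
    for i in range(0, k + 1):  # the sum starts at S = 0
        log_pmf = (
            math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
            + i * math.log(p) + (n - i) * math.log(1.0 - p)
        )
        total += math.exp(log_pmf)
    return total


def lower_tail(n, p, s):
    """P(S/N < s) = P(S <= K), with K the largest integer strictly less than s*n."""
    k = math.ceil(s * n) - 1
    return binomdist_cumulative(k, n, p)


if __name__ == "__main__":
    p = 0.3             # illustrative true probability (an assumption)
    s = Fraction(1, 5)  # threshold s = 20%, smaller than p
    for n in (10, 100, 1000, 10000):
        print(n, lower_tail(n, p, s))
```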

Combining these two results we see that P(s<S/N<t) must converge to 1. The interval (s,t) is called a NEIGHBORHOOD of p, because it contains p -- in thinking of this as a neighborhood, we are thinking that s and t are both close to p, so that the distances from s to p and from p to t are both small. The law of large numbers says that for any neighborhood of p, the probability that the observed proportion S/N lies INSIDE this neighborhood converges to 100% as N goes to infinity.
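The neighborhood statement can be sketched numerically as well. The Python code below adds up P(S=k) over all integers k with sN < k < tN, which is exactly the event s<S/N<t. The names binom_pmf and neighborhood_probability, and the neighborhood (25%, 35%) around p=30%, are assumptions chosen for illustration only.

```python
import math
from fractions import Fraction


def binom_pmf(k, n, p):
    """P(S = k) for S ~ Bi(n, p) with 0 < p < 1, computed in log space."""
    log_pmf = (
        math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
        + k * math.log(p) + (n - k) * math.log(1.0 - p)
    )
    return math.exp(log_pmf)


def neighborhood_probability(n, p, s, t):
    """P(s < S/N < t): sum of P(S = k) over integers k with s*n < k < t*n."""
    first = math.floor(s * n) + 1  # smallest integer strictly above s*n
    last = math.ceil(t * n) - 1    # largest integer strictly below t*n
    return sum(binom_pmf(k, n, p) for k in range(first, last + 1))


if __name__ == "__main__":
    p = 0.3                                  # illustrative true probability
    s, t = Fraction(1, 4), Fraction(7, 20)   # neighborhood (25%, 35%) of p
    for n in (20, 200, 2000, 20000):
        print(n, neighborhood_probability(n, p, s, t))
```

Shrinking the neighborhood, for example to (29%, 31%), should still give probabilities approaching 1, but only for much larger N.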

EXERCISE: CONFIRM this law using the EXCEL function BINOMDIST.

Suppose that S is Bi(N,p) with p=50%.

1> Calculate P(40% < S/N < 60%) for N=10,20,40,80,160 -- how does this probability behave?

1'> Also calculate P(S/N < 40%) and P(S/N > 60%) -- how do these probabilities behave?

2> Calculate P(49% < S/N < 51%) for differing values of N. Also calculate P(S/N < 49%) and P(S/N > 51%).

2'> About how large should N be so that we can feel confident that S/N is within 1% of the true value 50%?