1 EXCEL Calculations

BInomial Probabilities can easily be calculated using built in EXCEL functions for this purpose. We illustrate this with examples.

INFORMATION: Gallup Pakistan carried out a survey by taking a simple random sample of 1842 men and women in rural and urban areas of all four provinces of the country, during May 23 - May 30, 2016. They found that 68% of their sample (1253) preferred Brains to Beauty in their marriage partner. They claim that the error margin in their survey is around 2 to 2% with 95% confidence. We will learn the full meaning of this last sentence later in this course.

ASSUMPTIONS: To make Binomial calculations, we need to know the Population, and the true probability within the population.

Assume the total population of Pakistan is 200 Million. Assume that 65% of the people (130 Million) in the total population prefer Brains to Beauty.

QUESTION: Under these assumptions, what is the probability that a Simple Random Sample of 1842 people will contain 1253 people who prefer Brains to Beauty.

ANSWER: This is a straightforward problem in Binomial Probability. Let X be a Binomial Random Variable with N=1842, and p=65%. P(X=1253) can be computed by the Binomial formula:

1842! 1253 589

P(X=1253) = -------------- [0.65] x [0.35]

1253! x 589!

While the formula is easy to write down, it is very hard to compute, because large factorials and high powers even now exceed the range of numbers allowed for in computers. However, EXCEL has a built in formula to compute Binomial Probabilities.

=BINOMDIST(1253,1842,0.65,false) **** This formula computes P(X=1253) when X is Bi(1842,65%)

Entering this into an EXCEL spreadsheet yields the following answer:

P(X=1253)=0.000467

EXCEL computes two types of Binomial probabilities. One is the P(X=k). The other is P(X<=K), which is also called the Cumulative Probability -- The probability of all outcomes UPTO and INCLUDING K. This is obtained by setting the last argument to true instead of false:

P(X<=1253)=BINOMDIST(1253,1842,0.65,true)=0.997=99.7%

Note that this means that P(X>1253)=0.3%, or 3/1000, which is a VERY SMALL probability. This event is very unlikely. Dividing X by N=1842 gives the SAMPLE PROPORTION, while 1253/1842 is just 68%. So we see that:

P( X/N > 68%) = 0.3% or 3/1000.

This has a very important interpretation, which should be understood clearly. X is a random variable with distribution Bi(1842,65%). The proportion of 1's in the population is 65%. The Binomial calculations show that in a sample of size 1842, the observed proportion X/N is very unlikely to be higher than 68%. This shows that the sample proportion cannot deviate by more than 3% on the upside from 65%. What about the downside? We can make a similar calculation:

P(X/N <= 62%) = P( X <= 62% x 1842) = P(X <= 1142) = BINOMDIST(1142,1842,65%,true) = 0.000387

So this probability is also very low. We can conclude that in a random sample of 1842 people, the observed proportion of people who prefer beauty to brains will lie within 62% and 68% with very high probability:

P(62% < X/N <= 68%) = P(X/N <= 68%) - P(X/N <= 62%) = 0.99713 - 0.000387 = 0.9933 = 99.33%

That is, we can have 99% confidence that the observed sample proportion does not deviate by more than 3% from the true population probability of 65%.

*****************PART BELOW IS TOO DIFFICULT AT THIS STAGE -- OMIT (meant for inclusion LATER******************

QUESTION: Our population consists of 130,000 registered voters. From these, we take a simple random sample consisting of 1000 voters, and ask them which party they will vote for. We find that 450 Voters in our random sample say that they will vote for PML-N while 550 mention other parties. What is the maximum possible proportion p of voters for PML-N in the full population of 130,000 voters?

NOTE & WARNING: In statistics, all questions and answer have a certain amount of uncertainty attached to them. The question asked above CANNOT be answered with 100% confidence. BUT we can find a REASONABLE answer. That is, we can find a maximum probability p* such that it is HIGHLY UNLIKELY that the true probability p is higher than p*.

ANSWER: Suppose the TRUE PROBABILITY is p -- that is p=N/130,000, where N is the number of voters for PML-N in the population. This N is not known to us -- if the N was known, then the QUESTION does not make sense. When we know the true value p than p is the maximum value and also the minimum value and also the only value.

Now we PRETEND that p is known to us. Suppose that p is 50% for example. Consider a THOUGHT EXPERIMENT -- we plan to take a random sample of 1000 voters, and count the number of PML-N voters -- this is actually the experiment that was carried out, but that experiment is now finished and in POST-EXPERIMENTAL situation. So now we think about an NEW experiment which we plan to do, so as to get into the PRE-EXPERIMENTAL mode. LET X be the random variable which will be the number of people in our NEW random sample of 1000 who will say that they plan to vote for PML-N. Because this new experiment has not been carried out, X is a Binomial RV: Bi(1000,50%). We can ask what is the P(X=450)? According to the Binomial Probability formula, this will be 1000 choose 450 multiplied by 0.5 raised to the 1000th power. This is rather hard to calculate, because the numbers involved are very large. However, EXCEL has a built in function which calculates this:

BINOMDIST(450,1000,50%,FALSE) calculates the desired probability: P(X=450) when X is Bi(1000,50%). Enter this into an EXCEL spreadsheet to find the answer: