S2 Multinomial

Suppose X(1), X(2), ..., X(N) is a sequence of IID random variables with a general discrete finite distribution: for each i, X(i) takes the values 1, 2, ..., M with probabilities p(1), ..., p(M). These probabilities are PRE-EXPERIMENTAL concepts -- they exist only BEFORE the random variable is REALIZED. POST-EXPERIMENTALLY, an OUTCOME of X(i) occurs. The CENTRAL question of probability and statistics is to study the relationship between the PRE-EXPERIMENTAL probabilities and the POST-EXPERIMENTAL outcomes.

Before the experiment, all of the outcomes are possible. After the experiment, only one outcome occurs and becomes certain, while all others become impossible. Nonetheless, the probabilities do play a role in what happens. In the long run, as the number of observations grows, the post-experimental proportions in the observed sample approach the pre-experimental, unobservable probabilities.
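The long-run claim can be checked by simulation. The following is a minimal Python sketch, not part of the original notes: the function name sample_proportions, the example probabilities (0.5, 0.3, 0.2), and the fixed seed are all assumptions chosen for illustration. It draws N IID outcomes from 1, 2, ..., M and reports the observed proportion of each outcome.

```python
import random
from collections import Counter


def sample_proportions(probs, n, seed=0):
    """Draw n IID outcomes from 1..M with probabilities probs
    and return the observed proportion of each outcome.

    The name, signature, and seed are illustrative assumptions.
    """
    rng = random.Random(seed)                  # fixed seed for repeatability
    outcomes = list(range(1, len(probs) + 1))  # the values 1, 2, ..., M
    draws = rng.choices(outcomes, weights=probs, k=n)
    counts = Counter(draws)                    # post-experimental counts
    return [counts[i] / n for i in outcomes]


if __name__ == "__main__":
    probs = [0.5, 0.3, 0.2]                    # hypothetical p(1), p(2), p(3)
    for n in (10, 1000, 100000):
        print(n, sample_proportions(probs, n))
```

For small N the proportions can be far from the probabilities; as N grows they should settle close to p(1), ..., p(M).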

When there are only two possible outcomes, the observed counts are governed by the Binomial distribution. When there are more than two possible outcomes, they are governed by the Multinomial distribution. Given a sequence of observed values of X(1), X(2), ..., X(N), how many times will we see each of the outcomes 1, 2, ..., M? Each outcome should occur roughly in proportion to its probability: outcome i about N p(i) times. Let K(1) be the number of times that the outcome 1 occurs among the N observations X(1), X(2), ..., X(N). Similarly, let K(i) be the number of times that the outcome i occurs among the N observations. The probability of seeing 1 K(1) times, 2 K(2) times, ..., and M K(M) times in the N observations -- where K(1) + K(2) + ... + K(M) = N -- is given by the Multinomial Probability formula:

N! / (K(1)! K(2)! ... K(M)!) × p(1)^K(1) × p(2)^K(2) × ... × p(M)^K(M)
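To experiment with concrete numbers, here is a minimal Python sketch of the standard multinomial probability: the multinomial coefficient N! / (K(1)! ... K(M)!) times the product of p(i)^K(i). The function name multinomial_pmf and the example values are assumptions made for illustration, not taken from the notes.

```python
from math import factorial, prod


def multinomial_pmf(counts, probs):
    """Probability of seeing outcome i exactly counts[i-1] times
    in N = sum(counts) IID observations.

    counts holds K(1), ..., K(M); probs holds p(1), ..., p(M).
    The name and signature are illustrative assumptions.
    """
    if len(counts) != len(probs):
        raise ValueError("counts and probs must have the same length M")
    n = sum(counts)                      # N = K(1) + ... + K(M)
    coef = factorial(n)
    for k in counts:
        coef //= factorial(k)            # exact at every step: N! / (K(1)! ... K(M)!)
    return coef * prod(p ** k for p, k in zip(probs, counts))


if __name__ == "__main__":
    # Hypothetical example: N = 3 observations, M = 3 outcomes,
    # each outcome seen exactly once. Coefficient 3!/(1!1!1!) = 6.
    print(multinomial_pmf([1, 1, 1], [0.5, 0.3, 0.2]))
```

With M = 2 the same function reduces to the Binomial probability; for example, counts (2, 1) with probabilities (0.5, 0.5) gives 3 × 0.125 = 0.375.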

We will provide some exercises to develop understanding of how this formula works.