Like the Bernoulli distribution, the binomial distribution describes two-state, pass/fail outcomes, but over a set number of successive trials, also called a Bernoulli process. Here, the term "process" refers to that set of successive trials. Define the parameters for probability and number of trials:
p probability of success
n number of trials in a process
Classically, this is described using a bag with two colors of balls, where a ball is randomly chosen and then returned to the bag. For example, assume there are four red balls and six green balls, and red balls are desired; then p = 4 / 10 = 0.4. If ten selections (with replacement) are conducted, how frequently is a red ball selected exactly N times?
The above plot illustrates 1000 processes of ten random trials each (blue boxes) and the expected number of occurrences (red dots). That is to say, this plot represents 10,000 random selections (with replacement). The question posed is how often the number of successes will be some specified outcome value, N. And, as expected, because 4 of the 10 balls are red, the most frequently occurring number of successes within ten random trials is four.
Note: It is very unlikely that, over ten random selections, zero red balls are selected or that a red ball is selected all ten times.
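The experiment described above can be reproduced with a short simulation. What follows is a minimal sketch, not the code behind the original plot: the function names `binomial_pmf` and `simulate_processes` and the fixed seed are assumptions introduced here for illustration. It runs 1000 processes of ten trials with p = 0.4 and prints the observed count for each N next to the expected count, 1000 * P(N).

```python
import random
from collections import Counter
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def simulate_processes(n, p, processes, seed=0):
    """Tally the number of successes seen in each of many n-trial processes."""
    rng = random.Random(seed)  # the seed is an arbitrary choice, for repeatability
    counts = Counter()
    for _ in range(processes):
        # Each trial is one draw with replacement: success with probability p.
        successes = sum(rng.random() < p for _ in range(n))
        counts[successes] += 1
    return counts


n, p, processes = 10, 0.4, 1000
observed = simulate_processes(n, p, processes)
expected = {k: processes * binomial_pmf(k, n, p) for k in range(n + 1)}

for k in range(n + 1):
    print(f"N={k:2d}  observed={observed[k]:4d}  expected={expected[k]:7.1f}")
```

The observed counts change with the seed, while the expected counts depend only on n and p.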
To illustrate the effect of varying p on the resultant distribution, consider the previously mentioned ten-ball process with varying numbers of red balls. The likelihood of a given number of successes is directly related to the probability of success, which here is set by the number of red balls.
The following plots are derived by completing 1000 processes for each value of p.
📑 Note: In both cases, the values farthest from n x p are virtually impossible to obtain. For example, given only one red ball, p = 0.1, it is very unlikely in a process of ten trials to randomly select (with replacement) ten red balls.
📑 Note: The values for n and p were chosen for ease of explanation. (Bonus)
Here, there are one red ball and nine green balls; set p = 1 / 10 = 0.1. For a process of ten trials, selecting exactly one red ball is the most likely outcome, and selecting zero red balls is also very likely.
This experiment has nine red balls and one green ball; set p = 9 / 10 = 0.9. Again, see that in a process of ten trials, it is most likely to select nine red balls.
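The two cases can also be checked without simulation by evaluating the binomial probabilities directly. The sketch below is one way to do that, assuming only the standard formula P(N) = C(n, N) * p^N * (1 - p)^(n - N); the names `binomial_pmf` and `pmf_by_p` are illustrative.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n = 10
# One probability per possible outcome N = 0..10, for each value of p.
pmf_by_p = {p: [binomial_pmf(k, n, p) for k in range(n + 1)] for p in (0.1, 0.9)}

for p, pmf in pmf_by_p.items():
    most_likely = max(range(n + 1), key=lambda k: pmf[k])
    print(f"p={p}: most likely N={most_likely}")
    for k, prob in enumerate(pmf):
        print(f"  N={k:2d}  P={prob:.4f}")
```

Multiplying each probability by 1000 gives the expected number of occurrences across 1000 processes.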
Similar to the Bernoulli distribution, the binomial distribution applies only when each single trial has a pass/fail outcome. They differ in that the binomial considers the resultant sum of several Bernoulli trials. The important aspect of this distribution is that a selection is made and the selected item is returned, so p stays the same for every trial.
Example A
You have been hired by a north-eastern paper distributor as an IT professional. You are tasked with deploying a set of security challenges for login requests. The boss recalls that the twelve associates average thirty logins per day. He desires that fewer than 35% of user logins be presented with a security question; too many requests would upset the associates.
The binomial distribution helps you determine a challenge rate that meets this requirement. Assuming the average actually characterizes the number of logins with minimal variation, you define the process to be a day, n = 30. Next, you plot varying values of p:
You tell the boss, "If we define a 35% challenge rate, then on many days more than 35% of logins will be presented with a challenge, which misses your requirement."
You tell the boss, "If we define a 25% challenge rate, then a large number of days would meet your requirement of challenging fewer than 35% of logins, but some days would exceed it."
You tell the boss, "If we define a 10% challenge rate, your requirement is met. However, there may be days where no logins are challenged."
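The three statements to the boss can be backed with numbers. This is a hedged sketch under the example's own assumption of exactly n = 30 logins per day; the helper names (`binomial_pmf`, `prob_day_meets_limit`, `max_ok`) are introduced here and are not part of the original analysis. For each candidate rate it prints the probability that a day stays under the 35% limit and the probability that a day has no challenges at all.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n = 30        # logins per day, taken as fixed (assumption from the example)
limit = 0.35  # requirement: fewer than 35% of logins are challenged

# Largest daily challenge count that is still strictly under the limit.
max_ok = max(k for k in range(n + 1) if k / n < limit)


def prob_day_meets_limit(p):
    """Probability that a day has at most max_ok challenged logins."""
    return sum(binomial_pmf(k, n, p) for k in range(max_ok + 1))


for p in (0.35, 0.25, 0.10):
    meets = prob_day_meets_limit(p)
    none = binomial_pmf(0, n, p)
    print(f"rate={p:.2f}  P(day under limit)={meets:.3f}  P(no challenges)={none:.3f}")
```

Reading the output row by row gives the same ordering as the plots: the lower the challenge rate, the more days meet the requirement, at the cost of more days with no challenges.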
Example B
You recently became VP of Northeastern sales, managing on average 200 daily transactions. Hoping to increase sales, you implement a web platform to supplement the sales of the core business. In the process of rolling out the website and generating usage, you fraudulently duplicate about thirty salesperson sales per day as web sales.
Now the Securities and Exchange Commission (SEC) is investigating the company. It can detect fraudulent behavior when at least 14% of entries are duplicated. Is it likely that the SEC will discover the fraudulently entered sales? How few entries should be duplicated to almost ensure that the fraudulent sales will not be detected?
You realize that duplicating thirty of 200 sales results in a total number of sales records n = 200 + 30 = 230 and a probability p = 30 / (200 + 30) = 0.13. You plot it. Then you choose a lower rate of fraud, n = 200 + 14 = 214 and p = 14 / (200 + 14) = 0.065. You plot it.
With thirty duplicates (n = 230), the SEC would detect fraud if 0.14 * 230 = 32.2 ~ 32 or more fraudulent entries are found. Though the most likely outcome, finding 30 duplicates, is below the SEC threshold, it is quite possible that your fraudulent activity will be detected.
With fourteen duplicates (n = 214), the SEC would detect fraud if 0.14 * 214 = 29.96 ~ 30 or more fraudulent entries are found. The selected duplication rate results in a virtually undetectable fraud operation. Congrats!
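To put numbers on "possible" and "virtually undetectable", the upper tail of each distribution can be summed directly. The following is a sketch under the example's assumptions: the detection threshold is 14% of n rounded to the nearest whole entry, as in the text, and the names `tail_probability` and `detection_probability` are hypothetical.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def tail_probability(threshold, n, p):
    """Probability of threshold or more successes in n trials."""
    return sum(binomial_pmf(k, n, p) for k in range(threshold, n + 1))


sec_rate = 0.14  # detection level from the example
detection_probability = {}

for duplicates in (30, 14):
    n = 200 + duplicates             # total records per day
    p = duplicates / n               # share of records that are duplicates
    threshold = round(sec_rate * n)  # rounded as in the text: 32 and 30
    detection_probability[duplicates] = tail_probability(threshold, n, p)
    print(
        f"duplicates={duplicates}  n={n}  p={p:.3f}  "
        f"threshold={threshold}  P(detected)={detection_probability[duplicates]:.6f}"
    )
```

The first scenario leaves a sizeable chance of reaching the threshold, while the second pushes the threshold far into the tail.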
The conclusions from the examples are intuitive, right? The binomial distribution also aids in understanding the likelihood of obtaining values other than the most likely one. Of interest may be which values are (very) difficult or virtually impossible to obtain. Such a value could aid in defining a threshold for service.
I hope that this provided some insight into an interesting representation of a simple process. However, it is often the case that lot sampling/testing requires selection without replacement. The article on the hypergeometric distribution will help you become better acquainted with such a case.
YHWH, though we may go back multiple times, replacing and re-selecting things that may not be good, we thank you for making a way to choose contrary to the statistics.