Gaussian

What It Is

The Gaussian distribution, also known as the Normal distribution, is a continuous distribution defined over all real numbers. It is commonly used to characterize the probability of natural events occurring. It assumes that a large collection of values will center around a population mean, with some level of dispersion or spread. (The central limit theorem explains why this shape is so common: sums and averages of many independent effects tend toward it.) These two properties are defined by the parameters:

μ mean or average

𝜎 standard deviation
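For reference, these two parameters are all the standard Gaussian density formula needs. The sketch below is a minimal Python version of that formula; the function name normal_pdf is chosen here for illustration and does not come from the original article.

```python
import math

def normal_pdf(x, mu, sigma):
    """Height of the Gaussian curve at x, for mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

# The curve is highest at the mean and falls off symmetrically on both sides.
for x in (6, 8, 10, 12, 14):
    print(x, round(normal_pdf(x, mu=10, sigma=2), 4))
```

Changing mu slides the whole curve left or right; changing sigma makes it wider and flatter, or narrower and taller.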

The value of μ defines the most likely outcome. The standard deviation sets the range of likely outcomes. The range μ ± 4𝜎 covers about 99.994% of the possible outcomes. (Also read about 6𝜎.) Theoretically, all values are possible, though they become less likely the farther they are from the mean.
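The coverage of a range such as μ ± 4𝜎 can be checked with the error function from Python's standard library. This is a minimal sketch; the helper name coverage is an illustrative choice, not part of the original article.

```python
import math

def coverage(k):
    """Probability that a Gaussian outcome lies within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2.0))

# Print the share of outcomes inside mu +/- k*sigma for a few values of k.
for k in (1, 2, 3, 4, 6):
    print(f"mu +/- {k} sigma covers {coverage(k) * 100:.5f}% of outcomes")
```

The result does not depend on the particular values of μ and 𝜎, only on how many standard deviations wide the range is.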

Also noteworthy: estimating μ and 𝜎 reliably from data takes enough observations. A common rule of thumb is a minimum sample size of about thirty; the more, the better. Fewer than thirty may give an inaccurate picture of the actual population, misrepresenting the mean or the standard deviation. Other distributions account for small sample sizes, e.g. Student's t-distribution.
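To get a feel for why sample size matters, the sketch below repeatedly draws samples of different sizes from a Gaussian and measures how much the sample mean wanders from one sample to the next. The parameters (μ = 10, 𝜎 = 2), the number of repeats, and the seed are assumptions picked for illustration only.

```python
import random
import statistics

def spread_of_sample_means(n, repeats=2000, mu=10.0, sigma=2.0, seed=0):
    """Standard deviation of the sample mean across many samples of size n."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    means = []
    for _ in range(repeats):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        means.append(sum(sample) / n)
    return statistics.stdev(means)

# Larger samples give a steadier estimate of the mean.
for n in (5, 30, 100):
    print(n, round(spread_of_sample_means(n), 3))
```

The spread shrinks roughly as 𝜎 divided by the square root of the sample size, so a sample of five gives a much shakier estimate of the mean than a sample of thirty.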

Classically, the bean machine (Galton box), a physical device, has been used to illustrate how the Gaussian curve (and others) arises. In such an experiment, each "bean" or bead is similar to what has been defined in other articles here as a trial. The above plot is of 1000 random trials with parameters μ = 10 and 𝜎 = 2. Notice that the most likely outcomes are between nine and eleven, around the value of μ. Also, notice that nearly all outcomes fall between 10 - (4 * 2) = 2 and 10 + (4 * 2) = 18.
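An experiment like this one can be reproduced with Python's standard random module. The sketch below is one possible way to do it, not the code behind the original plot; the seed and the helper name share_within are arbitrary choices made here.

```python
import random

rng = random.Random(42)  # fixed seed (an arbitrary choice) so the run is repeatable
mu, sigma, n_trials = 10.0, 2.0, 1000
trials = [rng.gauss(mu, sigma) for _ in range(n_trials)]

def share_within(values, low, high):
    """Fraction of values falling in the closed interval [low, high]."""
    return sum(low <= v <= high for v in values) / len(values)

near_mean = share_within(trials, 9, 11)
within_4_sigma = share_within(trials, mu - 4 * sigma, mu + 4 * sigma)
print(f"between 9 and 11: {near_mean:.1%}")
print(f"between 2 and 18: {within_4_sigma:.1%}")
```

Each run with a different seed gives slightly different shares, but the outcomes stay clustered around μ and almost never leave the μ ± 4𝜎 range.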

Variants

The following plots of 1000 trials show modifications to the above example distribution, varying the value of μ and 𝜎 respectively.

μ = 2, 𝜎 = 2

The mean of this distribution, μ = 2, is less than in the first example, resulting in a shift to the left. Now, there is a significant possibility of negative outcomes (roughly 16%, since zero lies one standard deviation below the mean). The width of the range remains the same because 𝜎 is unchanged.

μ = 10, 𝜎 = 4

The range of likely outcomes is larger due to a doubling of the value of 𝜎. Though the most probable outcomes remain near μ = 10, there is now a small possibility of negative outcomes, which was negligible in the first example. The resulting range is:

low 10 - (4 * 4) = -6

high 10 + (4 * 4) = 26
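The chance of a negative outcome in each variant can be estimated from the Gaussian cumulative distribution function. The sketch below builds it from math.erf; the function name normal_cdf is an illustrative choice, and the three parameter pairs are the ones used in the plots above.

```python
import math

def normal_cdf(x, mu, sigma):
    """P(X <= x) for a Gaussian with mean mu and standard deviation sigma."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

# Compare the first example with the two variants.
for mu, sigma in ((10, 2), (2, 2), (10, 4)):
    p_negative = normal_cdf(0, mu, sigma)
    print(f"mu={mu}, sigma={sigma}: P(outcome < 0) = {p_negative:.7f}")
```

Shifting the mean toward zero raises this probability far more than doubling 𝜎 does, which matches what the two plots suggest.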

In Real Life (IRL)

The Gaussian/Normal distribution is appropriate for measurements of natural occurrences. Teachers often use such a distribution to "curve" students' grades. Similarly, performance reviews of a manager's team may be aggregated and fit to such a curve to "encourage" or "motivate" team members.

The variation and range of measurements, specifically of physical bodies, can be sufficiently characterized by the Gaussian distribution. You have likely seen variations of such a chart concerning a patient's height-weight ratio, or numerous other body measurements.

Example A

You have returned to the position of regional manager of a mid-tier paper supplier. You must conduct the annual performance reviews for all forty of your staff members. The company regulations state that all staff must fall within a gradient of "excellent", "high", "moderate", and "low" performance, relating directly to bonus payouts. Knowing that all of your employees performed well and ensured profitability this year, you regret having to curve their review scores.

After the reviews, you gather the scores:

93.5, 91.3, 93.2, 89.9, 95.4, 95.2, 93.5, 94.0, 96.6, 97.4, 94.0, 95.3, 94.4, 94.4, 93.5, 92.5, 95.9, 96.5, 92.8, 94.7, 95.3, 93.3, 95.9, 95.3, 91.6, 96.6, 93.2, 97.1, 92.6, 94.5, 98.3, 95.3, 96.6, 93.9, 92.2, 91.6, 95.7, 93.1, 90.6, 95.5

You see that you manage an excellent team, and make a fancy plot.

You specify your team's performance levels as such:

score ≥ 96 : "excellent"
94 ≤ score < 96 : "high"
92 ≤ score < 94 : "moderate"
90 ≤ score < 92 : "low"
score < 90 : "Can I see you in my office?"

You distribute the bonus payments scaled by some factor related to the above criteria. Then you have to have a talk with employee number four...

μ = 94.3, 𝜎 = 1.9
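The bookkeeping in this example can be sketched in a few lines of Python. The thresholds and scores are taken from the text above; the function name performance_level and the use of the population standard deviation (statistics.pstdev) are assumptions made for this sketch, since the article does not say how its values were computed.

```python
import statistics

scores = [
    93.5, 91.3, 93.2, 89.9, 95.4, 95.2, 93.5, 94.0, 96.6, 97.4,
    94.0, 95.3, 94.4, 94.4, 93.5, 92.5, 95.9, 96.5, 92.8, 94.7,
    95.3, 93.3, 95.9, 95.3, 91.6, 96.6, 93.2, 97.1, 92.6, 94.5,
    98.3, 95.3, 96.6, 93.9, 92.2, 91.6, 95.7, 93.1, 90.6, 95.5,
]

def performance_level(score):
    """Map a review score to the performance levels listed in the example."""
    if score >= 96:
        return "excellent"
    if score >= 94:
        return "high"
    if score >= 92:
        return "moderate"
    if score >= 90:
        return "low"
    return "Can I see you in my office?"

mu = statistics.mean(scores)
sigma = statistics.pstdev(scores)  # population form; statistics.stdev gives the sample form

# Tally how many employees land in each level.
counts = {}
for score in scores:
    level = performance_level(score)
    counts[level] = counts.get(level, 0) + 1

print(f"mu = {mu:.1f}, sigma = {sigma:.1f}")
for level, count in counts.items():
    print(f"{level}: {count}")
```

Most of the team lands in the "high" and "moderate" bands near the mean, with fewer people in each band farther out, which is the shape the curve leads you to expect.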

Example B

Corporate sets up a branch-vs-branch weight loss competition. You believe your employees are generally a fit group of individuals. And, since the goal is cumulative weight loss, you think it would be difficult for your branch to be competitive (especially compared to the other, 'healthier', northeastern branch). You reason that if the average weight of your branch is about 190 pounds, it could be competitive in this weight loss challenge.

Not concerned with the ethics of doing so, you collect the weights of your forty employees:

148, 137, 146, 129, 157, 156, 148, 150, 163, 167, 150, 157, 152, 152, 147, 143, 160, 163, 144, 153, 157, 147, 159, 156, 138, 163, 146, 166, 143, 153, 171, 157, 163, 149, 141, 138, 158, 145, 133, 158

You make a fancy plot.

Because each person's weight is independent of the others (for example: Jeff's excessive sandwich eating does not affect Alice's marathon training), and body weight is shaped by many small factors, the collection of everyone's weight is reasonably modeled by a Gaussian distribution. The office's average weight, the value of μ, is determined to be about 151.5 pounds.

You decide that this competition may not be beneficial for your branch. If you did participate, your fit employees might resort to drastic weight loss methods and crazy diets.

μ = 151.5, 𝜎 = 9.5
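The comparison against the 190-pound figure can be sketched as follows. The weights are the ones listed in the example; the variable names and the use of the population standard deviation are assumptions made here. Values computed from the whole-number weights as listed may differ slightly in the last digit from the caption.

```python
import statistics

weights = [
    148, 137, 146, 129, 157, 156, 148, 150, 163, 167,
    150, 157, 152, 152, 147, 143, 160, 163, 144, 153,
    157, 147, 159, 156, 138, 163, 146, 166, 143, 153,
    171, 157, 163, 149, 141, 138, 158, 145, 133, 158,
]
target_average = 190  # pounds, the branch average reasoned about in the example

mu = statistics.mean(weights)
sigma = statistics.pstdev(weights)  # population form; an assumption of this sketch

# How many standard deviations the target sits above the office average.
z_target = (target_average - mu) / sigma

print(f"average weight: {mu:.1f} lb, spread: {sigma:.1f} lb")
print(f"heaviest employee: {max(weights)} lb")
print(f"190 lb is {z_target:.1f} standard deviations above the average")
```

Not even the heaviest employee reaches the target average, so the branch as a whole is far from the 190-pound mark.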

Conclusion

The Gaussian, aka Normal, distribution has been used to characterize the values of a large number of natural measures. Recall that it is important that the data set be sizeable. Then, using only two easy-to-calculate parameters, a curve can be produced to represent the location and spread of your data.

Here, it has been shown how managers can use this distribution to aid in common business operations. Whether you are applying a curve or characterizing a data set, the Gaussian distribution is a good general-purpose model for naturally occurring values.

YHWH, We see your design in all natural processes. You have put standards and limits for all things. Thank you for centering us in you.
