The beta distribution can take on a wide variety of shapes, controlled by two shape parameters:
α shape parameter
β shape parameter
Different combinations of the two parameters can even mimic other distributions. The distribution's curve produces outcomes between zero and one. It is a good model for processes that have a specific minimum and/or maximum value; modeling such a process requires rescaling the zero-to-one outcomes to span the desired minimum and maximum.
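As a minimal sketch of that rescaling step, the snippet below draws samples on the unit interval and stretches them onto a chosen range. It uses Python's standard `random.betavariate`; the helper name `scaled_beta`, the seed, and the example parameters are illustrative choices, not part of the original article.

```python
import random

def scaled_beta(alpha, beta, minimum, maximum, rng=random):
    """Draw one Beta(alpha, beta) sample and rescale it to [minimum, maximum]."""
    unit_sample = rng.betavariate(alpha, beta)  # always within [0, 1]
    return minimum + (maximum - minimum) * unit_sample

rng = random.Random(0)  # fixed seed so the run is repeatable
samples = [scaled_beta(7, 2, 0, 20, rng) for _ in range(1000)]

# Every rescaled outcome stays inside the requested range.
print(min(samples), max(samples))
```

When the minimum is zero, the rescaling reduces to multiplying each sample by the maximum.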
The beta distribution is a staple of project planning and scheduling models. For example, a task may be known to require fifteen days on average to complete, but has been allotted a maximum of twenty days. Depending on the worker, the task could be completed in fewer or more days than the average. Additionally, it is very unlikely that the task requires only a small number of days.
The plot above shows 1000 samples, with the outcomes scaled to a range of zero to twenty, along with the distribution curve for the parameter values α = 7, β = 2. Notice that, given these specific parameter values, small outcomes (e.g. x < 5) are very unlikely to occur and larger values (approximately in the range of 13 to 20) are very likely to occur.
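The claim about which outcomes are likely can be checked numerically. The sketch below draws 1000 samples with α = 7, β = 2 on the zero-to-twenty scale and compares the observed shares against the exact probabilities. The closed-form CDF used here, (α + 1)x^α − αx^(α + 1), holds only for the special case β = 2; the seed and variable names are arbitrary.

```python
import random

ALPHA, SCALE = 7, 20
rng = random.Random(42)  # arbitrary seed, for repeatability only
draws = [SCALE * rng.betavariate(ALPHA, 2) for _ in range(1000)]

# Empirical shares of small and large outcomes.
share_below_5 = sum(d < 5 for d in draws) / len(draws)
share_13_to_20 = sum(d >= 13 for d in draws) / len(draws)

def cdf_beta_b2(x, alpha):
    """Exact CDF of Beta(alpha, 2) at x in [0, 1] (special case beta = 2)."""
    return (alpha + 1) * x**alpha - alpha * x**(alpha + 1)

exact_below_5 = cdf_beta_b2(5 / SCALE, ALPHA)
exact_13_to_20 = 1 - cdf_beta_b2(13 / SCALE, ALPHA)

print(share_below_5, exact_below_5)
print(share_13_to_20, exact_13_to_20)
```

The exact values are well under 1% for outcomes below 5 and above 80% for outcomes from 13 to 20, which matches the shape described above.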
To illustrate the effect of the α and β parameters (still scaled to the range of zero to twenty), the following plots show how the shape changes as the values vary. Recall that the example curve above, where α = 7 and β = 2, is negatively skewed. The plots shown here use integer values for α and β. Other shapes, e.g. for α and β less than one, are explored at this link.
Setting α and β to the same value results in a symmetric (not skewed) shape. Note that the median is now half of the scale value, 20 / 2 = 10. Note also that the minimum and maximum values, zero and twenty, are equally likely to occur.
Swapping the values of α and β, the curve is now positively skewed (the likelihood of smaller values is increased compared to a symmetric distribution, with a long tail extending toward larger values).
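The direction of the skew can be read off from the standard skewness formula for the beta distribution, 2(β − α)√(α + β + 1) / ((α + β + 2)√(αβ)). The sketch below evaluates it, along with the scaled mean, for three parameter pairs. The symmetric pair α = β = 4 is a hypothetical choice for illustration, since the article does not state which equal values its plot used.

```python
import math

def beta_mean(alpha, beta):
    """Mean of Beta(alpha, beta) on the unit interval."""
    return alpha / (alpha + beta)

def beta_skewness(alpha, beta):
    """Skewness of Beta(alpha, beta); the sign gives the direction of the skew."""
    return (2 * (beta - alpha) * math.sqrt(alpha + beta + 1)
            / ((alpha + beta + 2) * math.sqrt(alpha * beta)))

SCALE = 20
for a, b in [(7, 2), (4, 4), (2, 7)]:  # (4, 4) is a hypothetical symmetric pair
    print(a, b, round(SCALE * beta_mean(a, b), 2), round(beta_skewness(a, b), 3))
```

A negative result (α > β) means the mass sits toward the maximum, zero means symmetric, and a positive result (α < β) means the mass sits toward the minimum.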
As was mentioned, the beta distribution is heavily used in scheduling. The following examples use the beta distribution to aid in decision making at a hypothetical business you work for.
Example A
You are the new Special Projects Manager. It is a newly created role that you are excited to start. Your first project is to plan the distribution of funds to the Party Planning Committees and Committees to Plan Parties at each of the 80 branches of your organization. You allot each event at most $130. You ponder, "What will be the impact if event costs are over or under my estimate?"
Assuming 12 events per year, you estimate the total annual cost of events to be:
80 branches * 12 events / branch * 130 dollars / event = $124.8k
Realizing that some events may exceed the cap, you decide to allow this on rare occasions. Because some branches may hold more than twelve events per year, you assume 1000 total events, resulting in an allotment of $130k. You make several fancy plots.
You want to model the ideal scenario where everyone adheres to the imposed limit. You plot a beta distribution with α = 130, β = 21, which results in a total annual cost of $120.5k.
Because you are willing to be lenient, as was described above, you plot a beta distribution setting α = 130, β = 14. Approximately 8% of the samples resulted in a value exceeding 130, with a total annual cost of $126.4k.
You also consider the cases where your estimates are wrong (α = 130, β = 7) and very wrong (α = 130, β = 3), with resultant total annual event costs of $132.9k and $136.8k, respectively.
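The totals quoted for the four scenarios can be roughly reproduced from the mean of the beta distribution, α / (α + β), without sampling. The article does not say what maximum the outcomes were scaled to, so the sketch below assumes an upper bound of $140 per event. That figure is purely an assumption, chosen here because the resulting expected totals land close to the quoted ones. The count of 1000 events comes from the allotment described earlier.

```python
ASSUMED_MAX_PER_EVENT = 140  # assumption: the scale is not stated in the article
N_EVENTS = 1000              # total events assumed in the example

def expected_total_k(alpha, beta, scale=ASSUMED_MAX_PER_EVENT, n_events=N_EVENTS):
    """Expected annual cost in thousands of dollars for scaled Beta(alpha, beta) costs."""
    mean_cost_per_event = scale * alpha / (alpha + beta)
    return n_events * mean_cost_per_event / 1000

for beta in (21, 14, 7, 3):
    print(f"alpha=130, beta={beta}: about ${expected_total_k(130, beta):.1f}k")
```

Sampled totals, such as those behind the plots, will differ slightly from these expected values from run to run.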
Using the beta distribution, you realize that the total annual costs can vary greatly, depending on how strictly the maximum event budget is enforced.
Example B
The company has released you from your current position. You decide to start your own paper distribution company (poaching a few employees from your former employer). As it is a startup, your new business has low demand, but you think it is necessary to start modeling it. Within the past three weeks, demand has been at most ten reams per day.
The demand was assumed to be uniform (and symmetric). After learning about the beta distribution (because you read this article) you define a beta distribution, setting α = β = 1, that appropriately resembles this assumption.
But, upon closer inspection, you notice that:
not all values, from zero to ten, are equally likely.
there is a slight bias to outcomes of value five.
there is still a significant number of occurrences of values approaching both zero and ten.
Believing that the values are more centrally located, you make a fancy plot where α = β = 2, but realize that such a distribution under-represents the minimum and maximum values.
Setting α and β to a value between one and two, you plot the beta distribution where α = β = 1.25, to find a distribution that more closely resembles what you observe in practice.
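To see why a value between one and two is a compromise, the sketch below evaluates the beta density near an edge and at the center for the three symmetric settings discussed here. It works on the unit interval, where x = 0.05 and x = 0.5 correspond to 0.5 and 5 reams on the zero-to-ten scale. The helper name `beta_pdf` and the evaluation points are illustrative choices.

```python
import math

def beta_pdf(x, alpha, beta):
    """Density of Beta(alpha, beta) at x, for 0 < x < 1."""
    normalizer = math.gamma(alpha + beta) / (math.gamma(alpha) * math.gamma(beta))
    return normalizer * x**(alpha - 1) * (1 - x)**(beta - 1)

shapes = (1, 1.25, 2)  # alpha = beta in each case
edge = {a: beta_pdf(0.05, a, a) for a in shapes}    # near the minimum
center = {a: beta_pdf(0.5, a, a) for a in shapes}   # at the midpoint

for a in shapes:
    print(f"alpha=beta={a}: edge density {edge[a]:.3f}, center density {center[a]:.3f}")
```

With α = β = 1 the density is flat; α = β = 2 pulls the most weight away from the edges; and α = β = 1.25 sits between the two, giving a slight central bias while keeping noticeable density near zero and ten.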
The beta distribution is helpful in defining distributions with known maximum and minimum values. It can take shapes that model a centrally, edge, left, or right biased distribution. The diversity of its shapes, based on just two parameters, makes it a good general-purpose distribution. Specifically, it aids in exploration until appropriate parameter values, or a more appropriate distribution, can be determined. Here, several of these shapes were presented, as well as applications to a couple of business cases.
YHWH, please continue to shape us into someone you can use in this world.