1. Concepts & Definitions
1.1. Continuous random distribution of probability
1.2. Normal distribution of probability
1.3. Standard normal distribution of probability
1.4. Inverse standard normal distribution
1.6. Inverse Student's T distribution
2. Problem & Solution
2.1. Weight, dimension, and value per HS6
2.2. How to fit a distribution
2.3. Employing standard deviation
2.4. Total time spent in a system
1. Load the notebook with the commands developed in section 2.1 (click on the link):
https://colab.research.google.com/drive/1Xo-2dWDgL-gmDJH3QmB6b4YMlntgQqtu?usp=sharing
2. Recall the graph obtained in the previous section:
A very useful library is fitter, which automatically searches over a given set of probability distributions and tries to find the one that best fits the data. First, let's install it.
!pip install fitter
The following will appear:
Collecting fitter
  Downloading fitter-1.5.2.tar.gz (27 kB)
  Preparing metadata (setup.py) ... done
...
Successfully built fitter
Installing collected packages: fitter
Successfully installed fitter-1.5.2
Now the fit method can be called, right after creating a Fitter object with the list of distributions that should be tested.
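from fitter import Fitter, get_common_distributions, get_distributions

# weight: the weight data prepared in the notebook loaded above
f = Fitter(weight,
           distributions=["gamma", "lognorm", "beta", "burr", "norm"])
f.fit()      # fit every listed distribution to the data
f.summary()  # show a table of the best fits and plot them over the histogram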
The following results will appear:
Fitting 5 distributions: 100%|██████████| 5/5 [00:01<00:00, 4.33it/s]
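To make the idea behind such a ranking concrete, the sketch below fits a few candidates with scipy.stats and scores each one by the sum of squared errors between a density histogram and the fitted PDF. It is only an illustration under assumptions: the data are synthetic (the weight data from the notebook are not reproduced here), and the helper name rank_distributions, the bin count, and the shortened candidate list are hypothetical choices rather than fitter's actual implementation.

```python
import numpy as np
from scipy import stats


def rank_distributions(data, names, bins=100):
    """Fit each named scipy.stats distribution and rank by squared error."""
    # Empirical density: histogram normalized so that it integrates to 1.
    density, edges = np.histogram(data, bins=bins, density=True)
    centers = (edges[:-1] + edges[1:]) / 2
    scores = {}
    for name in names:
        dist = getattr(stats, name)
        params = dist.fit(data)  # maximum-likelihood parameter estimates
        pdf = dist.pdf(centers, *params)
        scores[name] = float(np.sum((pdf - density) ** 2))
    # Smallest error first.
    return sorted(scores.items(), key=lambda item: item[1])


# Synthetic, positive-valued stand-in for the real data (an assumption).
rng = np.random.default_rng(0)
sample = rng.gamma(shape=5.0, scale=2.0, size=2000)

ranking = rank_distributions(sample, ["gamma", "lognorm", "norm"])
for name, error in ranking:
    print(f"{name}: {error:.6f}")
```

A lower error means the fitted curve follows the histogram more closely. It does not by itself show that the data were generated by that distribution.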
Although the normal distribution appears to be the best fit among the candidates, the data seem to show two peaks. This aspect will be tackled in the next sections.
The complete code above is available at the following link:
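One quick way to check for more than one peak is to count the prominent local maxima of the histogram. The sketch below does this with scipy.signal.find_peaks on synthetic two-peaked data. The helper name find_histogram_peaks, the bin count, and the prominence threshold are hypothetical choices made for illustration, and this is not necessarily the approach taken in the later sections.

```python
import numpy as np
from scipy.signal import find_peaks


def find_histogram_peaks(data, bins=30, min_prominence=0.3):
    """Return the bin centers of the prominent peaks in a histogram."""
    counts, edges = np.histogram(data, bins=bins)
    # Keep only peaks that stand out by a fraction of the tallest bin,
    # so that small bin-to-bin noise is not counted as a separate peak.
    peaks, _ = find_peaks(counts, prominence=min_prominence * counts.max())
    centers = (edges[:-1] + edges[1:]) / 2
    return centers[peaks]


# Synthetic mixture of two groups (an assumption, not the course data).
rng = np.random.default_rng(1)
bimodal = np.concatenate([rng.normal(20, 3, 3000), rng.normal(40, 3, 3000)])

peak_positions = find_histogram_peaks(bimodal)
print(len(peak_positions), peak_positions)
```

The result depends on the number of bins and on the threshold, so it is worth trying a few values before concluding that a second peak is real.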
https://colab.research.google.com/drive/1ZYjHH1edDAQWfTnUPqihirbDnPggLRMk?usp=sharing