The objective of this activity is to model a dataset with a discrete random variable and verify its fit to a theoretical distribution, specifically the Poisson distribution.
The dataset consists of the number of complaints received per day by a company over 2000 days.
Dataset: Sample_Data
Empirical Probability Calculation
The empirical probability P(X = k) is calculated as:
P(X = k) = Number of days with k complaints / Total number of days analyzed
This probability is computed for each observed value of k (the number of complaints in a day) and recorded in the dataset.
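The calculation can be sketched in a few lines of Python. This is a minimal illustration, not the original worksheet: the variable names (frequencies, total_days, empirical) are hypothetical, and the day counts are the frequencies used in the total-complaints calculation in this report.

```python
# Frequency table: key = complaints per day (k), value = number of days
# on which exactly k complaints were received (counts from this report).
frequencies = {0: 150, 1: 310, 2: 420, 3: 350, 4: 280, 5: 200, 6: 130,
               7: 70, 8: 40, 9: 25, 10: 15, 11: 5, 12: 5}

# Total number of days analyzed.
total_days = sum(frequencies.values())

# Empirical probability: P(X = k) = days with k complaints / total days.
empirical = {k: days / total_days for k, days in frequencies.items()}

for k, p in empirical.items():
    print(f"P(X = {k:2d}) = {p:.4f}")
```

Because every day falls into exactly one row of the table, the empirical probabilities should sum to 1, which is a quick sanity check on the data entry.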
Checking the Poisson Distribution Fit
The Poisson distribution models the probability of a given number of complaints occurring per day, given a known average rate (λ).
The average rate (λ) is given by:
λ = Total number of complaints / Total number of days analyzed
To find the total number of complaints, multiply each complaint count by its frequency (the number of days on which that count occurred) and add the products:
0×150 + 1×310 + 2×420 + 3×350 + 4×280 + 5×200 + 6×130 + 7×70 + 8×40 + 9×25 + 10×15 + 11×5 + 12×5
= 0 + 310 + 840 + 1050 + 1120 + 1000 + 780 + 490 + 320 + 225 + 150 + 55 + 60
= 6400
Thus,
λ = 6400 / 2000
= 3.2
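The arithmetic above can be re-checked programmatically. The sketch below assumes the same frequency table as the sum above; the names frequencies, total_complaints, total_days and lam are illustrative only.

```python
# Frequency table: complaints per day (k) -> number of days with k complaints.
frequencies = {0: 150, 1: 310, 2: 420, 3: 350, 4: 280, 5: 200, 6: 130,
               7: 70, 8: 40, 9: 25, 10: 15, 11: 5, 12: 5}

# Total complaints = sum of (complaint count x number of days).
total_complaints = sum(k * days for k, days in frequencies.items())

# Total days = sum of the frequencies.
total_days = sum(frequencies.values())

# Average rate: total complaints divided by total days.
lam = total_complaints / total_days

print(f"total complaints = {total_complaints}")
print(f"total days       = {total_days}")
print(f"lambda           = {lam}")
```

Computing λ from the table, rather than typing it in by hand, keeps the later Poisson probabilities consistent with the data.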
The probability mass function (PMF) for the Poisson distribution is:
P(X = k) = (e^(-λ) ⋅ λ^k) / k!
for k = 0, 1, 2, ...
Where:
e is the base of the natural logarithm (approximately 2.71828).
λ is the average rate (6400 / 2000 = 3.2 complaints per day).
k is the number of complaints.
Using this formula, the Poisson probabilities P(X = k) are calculated for comparison with the empirical probabilities.
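One way to carry out this step is sketched below using only the Python standard library. The function name poisson_pmf and the variable names are assumptions made for illustration; λ is recomputed from the frequency table so that the block is self-contained.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """Poisson PMF: P(X = k) = e^(-lam) * lam^k / k!"""
    return math.exp(-lam) * lam ** k / math.factorial(k)


# Frequency table: complaints per day (k) -> number of days with k complaints.
frequencies = {0: 150, 1: 310, 2: 420, 3: 350, 4: 280, 5: 200, 6: 130,
               7: 70, 8: 40, 9: 25, 10: 15, 11: 5, 12: 5}

# Average rate estimated from the data: total complaints / total days.
lam = sum(k * days for k, days in frequencies.items()) / sum(frequencies.values())

# Poisson probability for every complaint count that appears in the data.
poisson = {k: poisson_pmf(k, lam) for k in frequencies}

for k, p in poisson.items():
    print(f"P(X = {k:2d}) = {p:.4f}")
```

The Poisson distribution assigns some probability to every k = 0, 1, 2, ..., so the values for k = 0 to 12 sum to slightly less than 1; the remainder belongs to counts above 12, which were not observed.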
By comparing the empirical probability with the Poisson probability for each value of k, we check whether the number of complaints received per day follows a Poisson distribution with λ = 6400 / 2000 = 3.2.
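The comparison can be laid out as a side-by-side table. The following is only a sketch under stated assumptions: the difference column and the "largest gap" summary are an illustrative way to read the table, not a formal goodness-of-fit test, and all names (rows, largest_gap, poisson_pmf) are hypothetical.

```python
import math

# Frequency table: complaints per day (k) -> number of days with k complaints.
frequencies = {0: 150, 1: 310, 2: 420, 3: 350, 4: 280, 5: 200, 6: 130,
               7: 70, 8: 40, 9: 25, 10: 15, 11: 5, 12: 5}

total_days = sum(frequencies.values())
lam = sum(k * days for k, days in frequencies.items()) / total_days


def poisson_pmf(k: int, lam: float) -> float:
    """Poisson PMF: P(X = k) = e^(-lam) * lam^k / k!"""
    return math.exp(-lam) * lam ** k / math.factorial(k)


# One row per k: (k, empirical probability, Poisson probability, difference).
rows = []
for k, days in frequencies.items():
    emp = days / total_days
    poi = poisson_pmf(k, lam)
    rows.append((k, emp, poi, emp - poi))

print(f"{'k':>2} {'empirical':>10} {'poisson':>10} {'diff':>10}")
for k, emp, poi, diff in rows:
    print(f"{k:>2} {emp:>10.4f} {poi:>10.4f} {diff:>+10.4f}")

# The row with the largest absolute difference shows where the fit is weakest.
largest_gap = max(rows, key=lambda row: abs(row[3]))
print(f"largest gap at k = {largest_gap[0]}: {largest_gap[3]:+.4f}")
```

Small differences across all rows suggest a good fit, while large or systematic differences (for example, consistently more low-count days than predicted) suggest the Poisson model may not describe the data well.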
The objective of this activity is to model a dataset using a discrete random variable and verify its fit to a Poisson distribution. The dataset consists of the number of complaints received per day by a company over 2000 days. The empirical probability of each complaint count is calculated, and the Poisson distribution is used to model the data with an average rate λ = 3.2, obtained by dividing the total number of complaints (6400) by the total number of days (2000). The Poisson probabilities are computed using the PMF formula and compared with the empirical probabilities to check how closely the data follow a Poisson distribution. Visualization is used to illustrate the distribution fit.