2.3. Hypergeometric inspections

What is the impact of changing the sample size in inspections?

Imagine an inspection process that looks for containers with a problem. Each draw can be modeled as either finding a problem container (success) or not (failure). Once a container has been selected, it is not returned to the pool of containers still to be inspected. This is called sampling without replacement and is illustrated in the next Figure with N = 10 (population size), K = 5 (number of successes in the population), n = 4 (sample size), and x = 2 (number of successes in the sample, i.e. red containers found).

This process is best modeled using a hypergeometric distribution. The following question could be formulated:

What is the probability of finding at least one container with a problem given that the sample has size n?

Answering this means finding P(X >= 1) = P(X = 1) + P(X = 2) + ... + P(X = min(n, K)),

or, since P(X = 0) + P(X >= 1) = 1, more simply P(X >= 1) = 1 - P(X = 0).
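The complement shortcut can be checked numerically on the Figure's example (N = 10, K = 5, n = 4). The sketch below is a minimal illustration that assumes the textbook hypergeometric PMF, P(X = k) = C(K, k) C(N - K, n - k) / C(N, n). The helper name hypergeom_pmf is hypothetical, and only the Python standard library is used.

```python
from math import comb

def hypergeom_pmf(k, N, K, n):
    """P(X = k): k successes in a sample of n drawn without replacement
    from N items, K of which are successes (hypothetical helper)."""
    # Outside the feasible range of k the probability is zero.
    if k < max(0, n - (N - K)) or k > min(n, K):
        return 0.0
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

N, K, n = 10, 5, 4  # values from the Figure's example

p_zero = hypergeom_pmf(0, N, K, n)
# Long way: add up P(X = 1) + P(X = 2) + ... + P(X = min(n, K)).
p_sum = sum(hypergeom_pmf(k, N, K, n) for k in range(1, min(n, K) + 1))
# Short way: complement of finding nothing.
p_complement = 1 - p_zero

print(f"P(X = 0)             = {p_zero:.4f}")
print(f"sum of P(X = k), k>0 = {p_sum:.4f}")
print(f"1 - P(X = 0)         = {p_complement:.4f}")
```

Both routes give the same value, 205/210 (about 0.976), but the complement needs a single PMF evaluation regardless of the sample size.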

How does P(X >= 1) probability change as n increases?

To answer the previous question, the hypergeometric distribution can be used with the parameters N = 1000 (population size), K = 5 (number of successes in the population), and n = 2 (sample size) to compute:

• Exactly zero problem containers found in the inspection: P(X = 0).

• At least one problem container found in the inspection: P(X >= 1) = 1 - P(X = 0).

What happens as the sample size increases from 0 to 200?

Instead of working through closed-form equations, these two questions are answered with Python code.

Applying code to find P(X = 0) and P(X >= 1)

The following code computes P(X = 0) for the numerical example using the PMF of the hypergeometric distribution, via hypergeom.pmf from scipy.stats.

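from scipy.stats import hypergeom

# Compute P(X = x) for each x in the list (here only x = 0).
x = [0]
N = 1000  # population size
K = 5     # number of successes (problem containers) in the population
n = 2     # sample size

# scipy's argument order: population size, successes in population, sample size.
y = hypergeom(N, K, n).pmf(x)

print(' X |',x)
print('P(X) |',y)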

X | [0]

P(X) | [0.99002002]
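The printed value can be checked by hand. The derivation below assumes the textbook hypergeometric PMF, P(X = k) = C(K, k) C(N - K, n - k) / C(N, n), which this page does not write out, and applies it to N = 1000, K = 5, n = 2:

```latex
\begin{align*}
P(X = 0) &= \frac{\binom{5}{0}\binom{995}{2}}{\binom{1000}{2}}
          = \frac{995 \cdot 994}{1000 \cdot 999}
          = \frac{989030}{999000} \approx 0.99002, \\
P(X \ge 1) &= 1 - P(X = 0) \approx 0.00998.
\end{align*}
```

With only two containers inspected out of 1000, the chance of finding at least one of the five problem containers is therefore about 1%.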

Sensitivity of the probability of finding at least one issue to the sample size

The following code computes the probability of finding at least one issue for each sample size and plots the result.

from scipy.stats import hypergeom
import matplotlib.pyplot as plt

# Compute P(X >= 1) = 1 - P(X = 0) for each sample size n.
N = 1000  # population size
K = 5     # number of successes (problem containers) in the population
ns = list(range(0, 201))  # sample sizes 0, 1, ..., 200

y = []
for n in ns:
    res = hypergeom(N, K, n).pmf(0)  # P(X = 0) for this sample size
    y.append(1 - res)

plt.plot(ns, y, '-r')
plt.xlabel('n (Sample Size)')
plt.ylabel('Probability of finding an issue (P(X>=1))')
plt.grid()
plt.show()
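The curve can also be read the other way round: given a target probability of finding at least one issue, what is the smallest sample size that reaches it? The sketch below is one possible way to answer that with the standard library only, assuming P(X = 0) = C(N - K, n) / C(N, n), the hypergeometric PMF evaluated at zero. The function names and the 50% target are illustrative choices, not part of the original notebook.

```python
from math import comb

def prob_at_least_one(N, K, n):
    """P(X >= 1) = 1 - C(N - K, n) / C(N, n) for a sample of size n."""
    return 1 - comb(N - K, n) / comb(N, n)

def smallest_sample_size(N, K, target):
    """Smallest n with P(X >= 1) >= target, or None if it is never reached."""
    for n in range(N + 1):
        if prob_at_least_one(N, K, n) >= target:
            return n
    return None

N, K = 1000, 5
target = 0.5  # illustrative threshold (assumption)

n_min = smallest_sample_size(N, K, target)
print(f"Smallest n with P(X >= 1) >= {target}: {n_min}")
print(f"P(X >= 1) at n = {n_min}: {prob_at_least_one(N, K, n_min):.4f}")
print(f"P(X >= 1) at n = 200: {prob_at_least_one(N, K, 200):.4f}")
```

Because P(X >= 1) never decreases as n grows, a linear scan is enough here; for a much larger population a binary search over n would be a reasonable alternative.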

The complete code above is available at the following link:

https://colab.research.google.com/drive/1hBvNNszy1Lq8fOYmQ9uh-SOnbjzC--Rm?usp=sharing
