1. Concepts & Definitions
1.1. A Review on Parametric Statistics
1.2. Parametric tests for Hypothesis Testing
1.3. Parametric vs. Non-Parametric Test
1.4. One-sample z-test and its relation to the two-sample z-test
1.5. One-sample t-test and its relation to the two-sample t-test
1.6. Welch's two-sample t-test: two populations with different variances
1.7. Non-Parametric test for Hypothesis Testing: Mann-Whitney U Test
1.8. Non-Parametric test for Hypothesis Testing: Wilcoxon Sign-Rank Test
1.9. Non-Parametric test for Hypothesis Testing: Wilcoxon Sign Test
1.10. Non-Parametric test for Hypothesis Testing: Chi-Square Goodness-of-Fit
1.11. Non-Parametric test for Hypothesis Testing: Kolmogorov-Smirnov
1.12. Non-Parametric tests for comparing machine learning models
2. Problem & Solution
2.1. Using Wilcoxon Sign Test to compare clustering methods
2.2. Using Wilcoxon Sign-Rank Test to compare clustering methods
2.3. What is A/B testing and how to combine with hypothesis testing?
2.4. Using Chi-Square fit to check if Benford-Law holds or not
2.5. Using Kolmogorov-Smirnov fit to check if Pareto principle holds or not
A summary of hypothesis tests
For more detailed information, please see the content at Track 08 - Section 1.2
The average service time of a company in 2018 was 12.44 minutes. Management wants to know whether the current arithmetic mean is different from 12.44 minutes. A sample with 150 values had an arithmetic mean of 13.71 minutes and a standard deviation of 2.65 minutes. Using α = 5%, can you conclude whether the time is currently different?
Before trying to solve this problem, it is worth having a close look at the steps defined by the flowchart for creating a hypothesis test.
Let's recall the steps to solve the numerical example presented previously.
Null hypothesis (Ho): the mean has not changed, i.e., μ = 12.44.
Alternative hypothesis (Ha): the mean has changed, i.e., μ ≠ 12.44.
From the table described in the step for choosing a statistical test, the signs of the hypotheses Ho and Ha indicate that a two-tailed test should be carried out.
Since the sample size is larger than 30, a normal distribution can be employed.
The next table helps us understand the relation between the confidence level, alpha (α), and the critical value z α/2.
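As an illustrative sketch (assuming SciPy is available), the critical values in that table can be reproduced with the percent-point function (inverse CDF) of the standard normal distribution:

```python
from scipy.stats import norm

# Two-tailed critical values z_{alpha/2} for common confidence levels
for cl in (0.90, 0.95, 0.99):        # confidence levels
    alpha = 1 - cl                   # significance level
    z_crit = norm.ppf(1 - alpha / 2) # two-tailed critical value
    print(f"CL = {cl:.0%}  alpha = {alpha:.2f}  z_crit = {z_crit:.2f}")
```

For a 95% confidence level this yields the familiar z α/2 = 1.96 used in the example below.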
To compute the test statistic, it is necessary to convert the observed sample mean (x̄) to the scale of a standard normal distribution (Zobs). This can be done using the following equation:
zobs = (x̄ - μ)/(s/(n^0.5))
This equation will result in the following numbers:
zobs = (13.71 - 12.44)/(2.65/(150^0.5)) = (13.71-12.44)/0.2164 = 5.87
Since Zobs = 5.87 is higher than the upper critical value z α/2 = 1.96, we can reject the null hypothesis.
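As a quick sanity check, this worked example can be reproduced in a few lines of Python (a sketch assuming SciPy is available; the variable names are illustrative):

```python
from scipy.stats import norm

mu0 = 12.44   # 2018 mean service time (null value)
xbar = 13.71  # sample mean
s = 2.65      # sample standard deviation
n = 150       # sample size
alpha = 0.05  # significance level

z_obs = (xbar - mu0) / (s / n ** 0.5)
z_crit = norm.ppf(1 - alpha / 2)  # two-tailed critical value

print(f"z_obs = {z_obs:.2f}, z_crit = {z_crit:.2f}")  # z_obs = 5.87, z_crit = 1.96
print("Reject H0" if abs(z_obs) > z_crit else "Do not reject H0")  # Reject H0
```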
Significance Level and P-Value
For more detailed information, please see the content at Track 08 - Section 1.7
On the probability distribution plot, the significance level defines how far the sample value must be from the null value before we can reject the null. The percentage of the area under the curve that is shaded equals the probability that the sample value will fall in those regions if the null hypothesis is correct. To represent a significance level of 0.05, the next figure shades in red the α = 5% of the distribution furthest from the null value [1].
The first interpretation is that P-values gauge how consistent your sample statistics are with the null hypothesis. Specifically, if the null hypothesis is correct, what is the probability of obtaining an effect at least as large as the one in your sample [1]?
High P-values: Your sample results are consistent with a true null hypothesis.
Low P-values: Your sample results are not consistent with a true null hypothesis.
If your P-value is small enough, you can conclude that your sample is so incompatible with the null hypothesis that you can reject the null for the entire population. As a clearer decision rule:
If P-value ≥ α, then we cannot reject the null hypothesis;
else, reject the null hypothesis.
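The rule above can be sketched as a small helper function (the name `decide` is illustrative):

```python
def decide(p_value: float, alpha: float) -> str:
    """Apply the standard decision rule: reject H0 only when p_value < alpha."""
    if p_value >= alpha:
        return "cannot reject the null hypothesis"
    return "reject the null hypothesis"

print(decide(0.03, 0.05))  # reject the null hypothesis
print(decide(0.03, 0.01))  # cannot reject the null hypothesis
```

Note that the same P-value leads to different decisions depending on the chosen significance level, which is why α must be fixed before running the test.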
In a second interpretation, P-values tell you how consistent your sample data are with a true null hypothesis.
Suppose the hypothesis test generates a P-value of 0.03. You’d interpret this P-value as follows: if the null hypothesis is true for the population as a whole, 3% of samples would show an effect at least as large as the one observed in your sample purely because of random sampling error [2].
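This interpretation can be illustrated with a small Monte Carlo sketch (assuming NumPy and SciPy are available; the 0.03 figure and the sample count are illustrative). Drawing many test statistics under a true null hypothesis, roughly 3% of them are at least as extreme as a value whose two-tailed P-value is 0.03:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(42)

# |z| value whose two-tailed P-value is exactly 0.03 (about 2.17)
z_at_p03 = norm.ppf(1 - 0.03 / 2)

# Draw 200,000 test statistics under a true null hypothesis (standard normal)
z_null = rng.standard_normal(200_000)

# Fraction of null draws at least as extreme as z_at_p03 -- close to 0.03
frac = np.mean(np.abs(z_null) >= z_at_p03)
print(f"Fraction of null samples with |z| >= {z_at_p03:.2f}: {frac:.3f}")
```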
However, when your data are very inconsistent with the null hypothesis, P-values can’t determine which of the following two possibilities is more probable:
The null hypothesis is true, but your sample is unusual due to random sampling error.
The null hypothesis is false.
This is the main reason why we say that we “failed to reject the null hypothesis” or “cannot reject the null hypothesis” rather than that we accepted it.
First, let's compute Zcrit for a significance level α = 1% (i.e., a 99% confidence level).
from scipy.stats import norm

muz = 0          # mean of the standard normal distribution
sigmaz = 1       # standard deviation of the standard normal distribution
p = 0.99         # confidence level
alfa = 1 - p     # significance level
pr = 1 - alfa/2  # upper-tail probability for a two-tailed test
z = norm.ppf(pr, muz, sigmaz)
print("Given alpha = ", str(round(alfa, 2)), ", Zcrit = ", round(z, 2))
Given alpha = 0.01 , Zcrit = 2.58
Now, use the sample data to compute Zobs.
mi = 12.44   # null-hypothesis mean
H0 = "The value is equal to " + str(mi)
n = 150      # sample size
xb = 12.98   # sample mean (note: this cell uses 12.98, not the 13.71 from the earlier example)
s = 2.65     # sample standard deviation
sx = s / (n ** 0.5)   # standard error of the mean
zobs = (xb - mi) / sx
print("Zobs = ",zobs,"and Zcrit = ",z)
if (zobs > z) or (zobs < -z):  # Zobs falls in the critical region
    print("Reject H0: ", H0)
else:
    print("Do not reject H0: ", H0)
Zobs = 2.495706530382865 and Zcrit = 2.5758293035489004
Do not reject H0: The value is equal to 12.44
To better visualize this situation, let's draw it.
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm
x1 = np.arange(-z, z, 0.001) # range of x in spec
x_all = np.arange(-10, 10, 0.001) # entire range of x, both in and out of spec
# mean = 0, stddev = 1, since Z-transform was calculated
y1 = norm.pdf(x1,0,1)
y_all = norm.pdf(x_all,0,1)
# build the plot
fig, ax = plt.subplots(figsize=(9,6))
plt.style.use('fivethirtyeight')
ax.plot(x_all,y_all)
ax.text(muz, 0.4, 'Ho', fontsize=14)
ax.text(muz-1.2*z, 0.1, 'Ha', fontsize=14)
ax.text(muz+z, 0.1, 'Ha', fontsize=14)
ax.fill_between(x1,y1,0, alpha=0.7, color='g')
ax.fill_between(x_all,y_all,0, alpha=0.1, color='r')
ax.set_xlim([-6,6])
ax.set_xlabel('# of Standard Deviations Outside the Mean')
ax.set_yticklabels([])
ax.set_title('Normal Gaussian Curve CL = '+str(round(p*100,0))+' %')
# drawing Zobs
ax.axvline(x = zobs, color = 'r', label = 'Zobs')
ax.text(0.85*zobs, 0.2, 'Zobs', fontsize=14)
print("Zobs: ",zobs)
Now, let's compute the P-value and use the chosen significance level to make a decision.
muz = 0
sigmaz = 1
P = norm.cdf(zobs, muz, sigmaz)  # left-tail probability of Zobs
Pvalue = 2 * (1 - P)             # two-tailed P-value
print("Pvalue = ", Pvalue)
print("alpha = ", alfa)
if (Pvalue >= alfa):
    print("We cannot reject H0")
else:
    print("Reject H0")
Pvalue = 0.012570655321079371
alpha = 0.010000000000000009
We cannot reject H0
Finally, let's draw the P-value on the plot.
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm
x1 = np.arange(-z, z, 0.001) # range of x in spec
x_all = np.arange(-10, 10, 0.001) # entire range of x, both in and out of spec
# mean = 0, stddev = 1, since Z-transform was calculated
y1 = norm.pdf(x1,0,1)
y_all = norm.pdf(x_all,0,1)
# build the plot
fig, ax = plt.subplots(figsize=(9,6))
plt.style.use('fivethirtyeight')
ax.plot(x_all,y_all)
ax.text(muz, 0.4, 'Ho', fontsize=14)
stralfa = str(round(alfa*100,2))
ax.text(muz-1.2*z, 0.1, 'Ha|alpha = ' + stralfa, fontsize=14)
ax.text(muz+z, 0.1, 'Ha|alpha = ' + stralfa, fontsize=14)
ax.fill_between(x1,y1,0, alpha=0.7, color='g')
ax.fill_between(x_all,y_all,0, alpha=0.1, color='r')
ax.set_xlim([-6,6])
ax.set_xlabel('# of Standard Deviations Outside the Mean')
ax.set_yticklabels([])
ax.set_title('Normal Gaussian Curve CL = '+str(round(p*100,0))+' %')
# drawing Zobs
ax.axvline(x = zobs, color = 'r', label = 'P-Value')
ax.text(0.85*zobs, 0.2, 'P-value = '+str(round(Pvalue*100,2)), fontsize=14)
print('P-value',Pvalue)
P-value 0.012570655321079371
The complete code above is available at the following link:
https://colab.research.google.com/drive/1P7DRHjbNfrVRrJqe2RL4w4uuvfgW5Osm?usp=sharing