1. Concepts & Definitions
1.1. Defining statistical test of hypothesis
1.2. Numerical example of test of hypothesis for mean
1.3. Code for test of hypothesis for mean
1.4. Code for right tailed test of hypothesis for mean
1.5. Code for left tailed test of hypothesis for mean
1.6. Code for small sample hypothesis for mean
1.7. P-Value and test of hypothesis
1.8. Statistical power and power analysis
1.9. Shapiro Wilk for normality test
2. Problem & Solution
2.1. Shapiro Wilk to verify CLT Simulator
The average service time of a company in 2018 was 12.44 minutes. Management wants to know whether the current arithmetic mean is different from 12.44 minutes. A sample of 150 values had an arithmetic mean of 11.71 minutes and a standard deviation of 2.65 minutes. Using α = 5%, can you conclude that the service time is now lower?
Let's recall the steps used to solve the previously presented numerical example.
Null hypothesis (Ho): The mean is greater than or equal to the 2018 value, i.e., μ ≥ 12.44.
Alternate hypothesis (Ha): The mean is lower, i.e., μ < 12.44.
According to the table described in the step for choosing a statistical test, the signs in the hypotheses Ho and Ha indicate that a left-tailed test should be carried out.
Since the sample size (n = 150) is larger than 30, a normal distribution can be employed.
The following table relates the confidence level, alpha (α), and the critical value z α (remember to use the negative value of z α for a left-tailed test).
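The critical values in such a table can be reproduced in a few lines. The following is a minimal sketch, assuming one-tailed tests and the confidence levels 90%, 95%, and 99% (chosen here for illustration, not taken from the original table). It uses `statistics.NormalDist` from the Python standard library rather than SciPy, and the helper name `left_tail_critical_value` is hypothetical.

```python
from statistics import NormalDist


def left_tail_critical_value(alpha):
    """Return the critical value -z_alpha of a left-tailed z test."""
    # inv_cdf(alpha) is the point of the standard normal distribution
    # that has probability alpha to its left, so it is already negative
    return NormalDist(mu=0, sigma=1).inv_cdf(alpha)


# Illustrative confidence levels (assumed, not taken from the source table)
for confidence in (0.90, 0.95, 0.99):
    alpha = 1 - confidence
    z_crit = left_tail_critical_value(alpha)
    print(f"CL = {confidence:.0%}, alpha = {alpha:.2f}, Zcrit = {z_crit:.3f}")
```

For a right-tailed test the same values would be used with a positive sign.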
To carry out the test, it is necessary to convert the observed sample mean (x̄) to the scale of the standard normal distribution (Zobs). This can be done using the following equation:
zobs = (x̄ - μ)/(s/(n^0.5))
This equation will result in the following numbers:
zobs = (11.71 - 12.44)/(2.65/(150^0.5)) = (11.71-12.44)/0.2164 = -3.37
Since Zobs = -3.37 is lower than the critical value -z α = -1.645, it falls in the critical (rejection) region, so we can reject the null hypothesis.
The previous solution can be summarized by the following Python code. First, let's compute the critical value Zcrit for a specified α.
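from scipy.stats import norm
# parameters of the standard normal distribution
muz = 0
sigmaz = 1
p = 0.95 # confidence level
alfa = 1 - p # significance level
pr = 1 - alfa
# left-tailed test: Zcrit is the negative of the (1 - alfa) quantile
z = -norm.ppf(pr,muz,sigmaz)
print("Given alpha = ",str(round(alfa,2)),", Zcrit = ",round(z,2))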
Given alpha = 0.05 , Zcrit = -1.64
The next step is to compute Zobs, compare it with Zcrit, and make a decision.
mi = 12.44 # reference mean (2018 value)
H0 = "The value is equal to " + str(mi)
n = 150 # sample size
xb = 11.71 # sample mean
s = 2.65 # sample standard deviation
sx = s/(n**(0.5)) # standard error of the mean
zobs = (xb - mi)/sx
print("Zobs = ",zobs,"and Zcrit = ",z)
if (zobs < z): # Zobs belongs to the critical region
    print("Reject H0: ",H0)
else:
    print("Do not reject H0: ",H0)
Zobs = -3.373825494776824 and Zcrit = -1.6448536269514722
Reject H0: The value is equal to 12.44
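The two steps above can also be wrapped in a single function, which makes it easy to rerun the test with other numbers. This is a minimal sketch, not the author's code: the names `left_tailed_z_test` and `ZTestResult` are hypothetical, and it uses `statistics.NormalDist` from the standard library in place of `scipy.stats.norm`. As a cross-check it also returns the p-value (the area to the left of Zobs), which the solution above does not compute.

```python
from math import sqrt
from statistics import NormalDist
from typing import NamedTuple


class ZTestResult(NamedTuple):
    z_obs: float
    z_crit: float
    p_value: float
    reject_h0: bool


def left_tailed_z_test(sample_mean, mu0, s, n, alpha=0.05):
    """Left-tailed z test for a mean (H0: mu >= mu0, Ha: mu < mu0)."""
    if n <= 0 or s <= 0:
        raise ValueError("n and s must be positive")
    std_normal = NormalDist()              # mean 0, standard deviation 1
    standard_error = s / sqrt(n)
    z_obs = (sample_mean - mu0) / standard_error
    z_crit = std_normal.inv_cdf(alpha)     # negative for a left-tailed test
    p_value = std_normal.cdf(z_obs)        # area to the left of z_obs
    return ZTestResult(z_obs, z_crit, p_value, z_obs < z_crit)


# Numbers from the service-time example
result = left_tailed_z_test(sample_mean=11.71, mu0=12.44, s=2.65, n=150)
print(f"Zobs = {result.z_obs:.2f}, Zcrit = {result.z_crit:.2f}")
print("Reject H0" if result.reject_h0 else "Do not reject H0")
```

For the service-time example this reports Zobs = -3.37 and Zcrit = -1.64 and rejects H0, in agreement with the output above. Rejecting when Zobs < Zcrit is equivalent to rejecting when the p-value is below α.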
Finally, the following code helps to visualize the critical region of the hypothesis test together with Zcrit and Zobs, which makes it easier to compare their values and reach a decision.
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm
plt.style.use('fivethirtyeight') # set the style before creating the figure
x1 = np.arange(-10, z, 0.001) # critical (rejection) region: x below Zcrit
x_all = np.arange(-10, 10, 0.001) # entire range of x
# mean = 0, stddev = 1, since the Z-transform was applied
y1 = norm.pdf(x1,0,1)
y_all = norm.pdf(x_all,0,1)
# build the plot
fig, ax = plt.subplots(figsize=(9,6))
ax.plot(x_all,y_all)
ax.text(muz, 0.4, 'Ho', fontsize=14)
ax.text(muz+z, 0.1, 'Ha', fontsize=14) # left tail only, since the test is left-tailed
ax.fill_between(x1,y1,0, alpha=0.7, color='r')
ax.fill_between(x_all,y_all,0, alpha=0.1, color='g')
ax.set_xlim([-6,6])
ax.set_xlabel('# of Standard Deviations Outside the Mean')
ax.set_yticklabels([])
ax.set_title('Normal Gaussian Curve CL = '+str(round(p*100,0))+' %')
# drawing Zobs
ax.axvline(x = zobs, color = 'r', label = 'Zobs')
ax.text(0.85*zobs, 0.1, 'Zobs', fontsize=14)
plt.show()
The complete code is available at the following link:
https://colab.research.google.com/drive/1c3Ntev6skZ0ZCWbv3FUjZ9zZEzam4tKO?usp=sharing