1. Concepts & Definitions
1.1. Defining statistical test of hypothesis
1.2. Numerical example of test of hypothesis for mean
1.3. Code for test of hypothesis for mean
1.4. Code for right tailed test of hypothesis for mean
1.5. Code for left tailed test of hypothesis for mean
1.6. Code for small sample hypothesis for mean
1.7. P-Value and test of hypothesis
1.8. Statistical power and power analysis
1.9. Shapiro-Wilk test for normality
2. Problem & Solution
2.1. Shapiro-Wilk to verify CLT Simulator
The next code helps to determine from which sample size the CLT holds. For this purpose, the code created in Track 07 - section 1.3 is used again.
First, let's reload the CLT Simulator code, available at the following link:
https://colab.research.google.com/drive/1xlPjna5F0L2hBNJjyapA-zF3Pu5vF6N7?usp=sharing
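If the notebook is not at hand, the two lists that the code in this section relies on can be approximated as follows. This is only a sketch: the uniform population, the sample sizes 1, 2, 5 and 10, the 1,000 repetitions, and the helper `simulate_sample_means` are assumptions made for illustration, not the contents of the original simulator. Only the names `list_sample_size` and `list_means_samples` are taken from the code in this section.

```python
import numpy as np

def simulate_sample_means(sample_size, n_means=1000, seed=0):
    """Return n_means sample means, each computed from sample_size draws.

    Assumption: the population is Uniform(0, 1); the original simulator
    may use a different population.
    """
    rng = np.random.default_rng(seed)
    # One row per simulated sample, one column per observation in that sample
    samples = rng.uniform(low=0.0, high=1.0, size=(n_means, sample_size))
    return samples.mean(axis=1)

# Hypothetical sample sizes; the original notebook may use different values
list_sample_size = [1, 2, 5, 10]

# One array of simulated sample means per sample size
list_means_samples = [simulate_sample_means(n) for n in list_sample_size]
```

With these two lists defined, a cell that indexes `list_means_samples[k]` and `list_sample_size[k]` for k from 0 to 3 can run on its own, although its numbers will generally differ from those of the original notebook.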
Additionally, the code showing how to apply the Shapiro-Wilk test is available at the following link:
https://colab.research.google.com/drive/1dLk4yeCrSnqs9ok2CoGnhJGAAb9QVzfz?usp=sharing
Based on the commands used in the last code, a new code cell can be written that adds the Shapiro-Wilk test to the histogram plots:
import matplotlib.pyplot as plt
from scipy.stats import shapiro

# plotting all the means in one figure
k = 0
fig, ax = plt.subplots(2, 2, figsize=(8, 8))
for i in range(0, 2):
    for j in range(0, 2):
        # Histogram for each x stored in means
        ax[i, j].hist(list_means_samples[k], 20, density=True)
        ax[i, j].set_title(label='Sample size = ' + str(list_sample_size[k]))
        stat, p = shapiro(list_means_samples[k])
        print("The Test-Statistic and p-value are as follows:\nTest-Statistic = %.3f , p-value = %.3f" % (stat, p))
        k = k + 1
plt.show()
The Test-Statistic and p-value are as follows:
Test-Statistic = 0.957 , p-value = 0.000
The Test-Statistic and p-value are as follows:
Test-Statistic = 0.996 , p-value = 0.006
The Test-Statistic and p-value are as follows:
Test-Statistic = 0.996 , p-value = 0.015
The Test-Statistic and p-value are as follows:
Test-Statistic = 0.998 , p-value = 0.200
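According to the Shapiro-Wilk test results, only the last case (p-value = 0.200) fails to reject normality at the usual 0.05 significance level; in the other three cases the p-value is below 0.05 and normality is rejected. In this simulation, therefore, the sample size should be at least 10 for the Central Limit Theorem result about normally distributed sample means to hold.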
The complete code is available at the following link:
https://colab.research.google.com/drive/1y6uuaB8kZVaEmEl6nTcQ6HCFJL-4sTNI?usp=sharing
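The decision rule behind the conclusion of this section can be written out explicitly. The sketch below applies it to the four p-values printed earlier in the section; the 0.05 significance level and the helper name `normality_decision` are assumptions for illustration and are not part of the linked notebooks.

```python
ALPHA = 0.05  # conventional significance level, assumed here

# p-values as printed in this section, in the order they were reported
reported_p_values = [0.000, 0.006, 0.015, 0.200]

def normality_decision(p_value, alpha=ALPHA):
    """Shapiro-Wilk decision: H0 states that the data are normally distributed."""
    if p_value < alpha:
        return "reject normality"
    return "fail to reject normality"

decisions = [normality_decision(p) for p in reported_p_values]

for run, (p, decision) in enumerate(zip(reported_p_values, decisions), start=1):
    print(f"Run {run}: p-value = {p:.3f} -> {decision}")
```

Failing to reject normality does not prove that the sample means are normal; it only indicates that the test found no significant departure from normality for that sample size.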