1. Concepts & Definitions
1.1. A Review on Parametric Statistics
1.2. Parametric tests for Hypothesis Testing
1.3. Parametric vs. Non-Parametric Test
1.4. One-sample z-test and its relation with the two-sample z-test
1.5. One-sample t-test and its relation with the two-sample t-test
1.6. Welch's two-sample t-test: two populations with different variances
1.7. Non-Parametric test for Hypothesis Testing: Mann-Whitney U Test
1.8. Non-Parametric test for Hypothesis Testing: Wilcoxon Sign-Rank Test
1.9. Non-Parametric test for Hypothesis Testing: Wilcoxon Sign Test
1.10. Non-Parametric test for Hypothesis Testing: Chi-Square Goodness-of-Fit
1.11. Non-Parametric test for Hypothesis Testing: Kolmogorov-Smirnov
1.12. Non-Parametric tests for comparing machine learning models
2. Problem & Solution
2.1. Using Wilcoxon Sign Test to compare clustering methods
2.2. Using Wilcoxon Sign-Rank Test to compare clustering methods
2.3. What is A/B testing and how to combine it with hypothesis testing?
2.4. Using the Chi-Square Goodness-of-Fit test to check if Benford's Law holds or not
2.5. Using the Kolmogorov-Smirnov test to check if the Pareto principle holds or not
What is the Chi-Square Test?
The Chi-Square test is a statistical procedure for determining whether observed data differ significantly from expected data. It can also be used to assess whether two categorical variables in our data are associated, i.e. whether a difference between them is due to chance or reflects a real relationship between the variables [1].
There are two main types of Chi-Square tests [2]:
Chi-Square Goodness-of-Fit: use the goodness-of-fit test to decide whether a population with an unknown distribution “fits” a known distribution.
Chi-Square Test of Independence: use the test for independence to decide whether two variables (factors) are independent or dependent, i.e. whether the two variables have a significant association between them or not.
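As a quick orientation (a minimal sketch; the toy counts below are made up purely for illustration), SciPy provides one routine for each of these two tests: stats.chisquare for goodness-of-fit and stats.chi2_contingency for independence. Both are used in the detailed examples later in this section.
import scipy.stats as stats

# Goodness-of-fit: do the observed counts fit a uniform distribution?
observed = [18, 22, 20, 20]  # toy counts, for illustration only
expected = [sum(observed) / 4] * 4  # uniform expectation
stat_gof, p_gof = stats.chisquare(observed, expected)
print(f"Goodness-of-fit: chi2 = {stat_gof:.3f}, p = {p_gof:.3f}")

# Independence: are the row and column factors of a contingency table related?
table = [[10, 20], [20, 10]]  # toy 2x2 contingency table
stat_ind, p_ind, dof, expected_table = stats.chi2_contingency(table)
print(f"Independence: chi2 = {stat_ind:.3f}, p = {p_ind:.3f}, dof = {dof}")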
Chi-Square Goodness-of-Fit Test - Step-by-Step Manual Example
A factory tracks the number of machinery breakdowns each day of the week. The observed counts for the seven days are as follows:
Day           1    2    3    4    5    6    7
Breakdowns   14   22   16   18   12   19   11
We want to determine if the breakdowns are uniformly distributed across the week using the following steps:
1. State the Hypotheses:
Null Hypothesis (H0): Breakdowns are uniformly distributed across the week.
Alternative Hypothesis (Ha): Breakdowns are not uniformly distributed across the week.
2. Calculate the Expected Frequency:
To find the expected frequency if the breakdowns are uniformly distributed, we sum the total number of breakdowns and divide by the number of days.
Total Breakdowns = 14 + 22 + 16 + 18 + 12 + 19 + 11 = 112
Expected Frequency per day = 112/7 = 16
3. Calculate the Chi-Square Statistic:
χ²_obs = Σ (Oᵢ − Eᵢ)² / Eᵢ
where Oᵢ is the observed frequency and Eᵢ is the expected frequency for each category (day).
χ²_obs = 0.25 + 2.25 + 0 + 0.25 + 1 + 0.5625 + 1.5625 = 5.875
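As a quick check of the arithmetic above (a minimal verification sketch, not part of the original example), the individual contributions (Oᵢ − Eᵢ)²/Eᵢ and their sum can be reproduced in a few lines of Python:
observed = [14, 22, 16, 18, 12, 19, 11]
expected = 16  # 112 breakdowns / 7 days
contributions = [(o - expected) ** 2 / expected for o in observed]
print(contributions)  # [0.25, 2.25, 0.0, 0.25, 1.0, 0.5625, 1.5625]
print(sum(contributions))  # 5.875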
4. Determine the Degrees of Freedom:
Degrees of freedom (df) = 𝑛−1, where: 𝑛 is the number of categories (days).
df = 7 − 1 = 6
5. Find the Critical Value:
At a 5% level of significance (α = 0.05) and 6 degrees of freedom, the critical value from the Chi-Square distribution table is approximately χ²_crit = 12.592.
6. Compare the Chi-Square Statistic to the Critical Value:
The observed statistic χ²_obs is compared with the critical value χ²_crit to decide whether to reject the null hypothesis:
Reject H0: χ²_obs >= χ²_crit
Do not reject H0: χ²_obs < χ²_crit
For the data of this problem:
χ²_obs = 5.875 < χ²_crit = 12.592
7. Conclusion:
Since the calculated Chi-Square value is less than the critical value, we fail to reject the null hypothesis. The data are consistent with breakdowns being uniformly distributed across the week.
Chi-Square Goodness-of-Fit Test - Step-by-Step Manual Example - Python Code
The next Python code automates the manual calculations above and shows how to use the command stats.chi2.ppf from the scipy.stats library to compute the Chi-Square critical value directly, so that no distribution table is needed.
import scipy.stats as stats
# Observed frequencies
observed = [14, 22, 16, 18, 12, 19, 11]
# Expected frequencies (uniform distribution)
total_breakdowns = sum(observed)
expected = [total_breakdowns / 7] * 7
# Calculate Chi-Square statistic
chi_square_statistic, p_value = stats.chisquare(observed, expected)
# Degrees of freedom
df = len(observed) - 1
# Critical value at 5% significance level
critical_value = stats.chi2.ppf(0.95, df)
# Print the results
print(f"Chi-Square Statistic: {chi_square_statistic}")
print(f"Degrees of Freedom: {df}")
print(f"Critical Value: {critical_value}")
print(f"P-Value: {p_value}")
if chi_square_statistic < critical_value:
    print("Fail to reject the null hypothesis: breakdowns are consistent with a uniform distribution.")
else:
    print("Reject the null hypothesis: breakdowns are not uniformly distributed.")
Chi-Square Statistic: 5.875
Degrees of Freedom: 6
Critical Value: 12.591587243743977
P-Value: 0.43733749366050445
Fail to reject the null hypothesis: breakdowns are consistent with a uniform distribution.
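Since stats.chisquare already returns the p-value (0.4373 above), the same decision can equivalently be made by comparing the p-value with the significance level, with no critical value needed. A minimal sketch of this alternative decision rule, using the same data and α = 0.05:
import scipy.stats as stats

observed = [14, 22, 16, 18, 12, 19, 11]
expected = [sum(observed) / 7] * 7
alpha = 0.05  # significance level

# stats.chisquare returns both the statistic and the p-value
statistic, p_value = stats.chisquare(observed, expected)
if p_value < alpha:
    print("Reject the null hypothesis: breakdowns are not uniformly distributed.")
else:
    print("Fail to reject the null hypothesis: breakdowns are consistent with a uniform distribution.")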
The Python code with the data and the detailed computation to apply the Goodness-of-Fit test is given at:
https://colab.research.google.com/drive/19WdSUSXTuuVL1bB1NhtG6xvqJrC2DPSS?usp=sharing
Chi-Square Test of Independence - Step-by-Step Manual Example
The test for independence decides whether two variables (factors) are independent or dependent, i.e. whether the two variables have a significant association between them or not. In this case, there are two qualitative survey questions or experiments, and a contingency table is constructed. The goal is to see if the two variables are unrelated (independent) or related (dependent). The null and alternative hypotheses are:
Null Hypothesis (H0): The two variables (factors) are independent.
Alternative Hypothesis (Ha): The two variables (factors) are dependent.
Let’s take an example. Suppose we want to investigate whether gender and preferred shirt color are independent, i.e. whether a person’s gender influences their color choice. We conducted a survey and organized the data in the following table of observed values [2]:
         Black  White  Red  Blue
Male        48     12   33    57
Female      34     46   42    26
1. State the Hypotheses:
Null Hypothesis (H0): Gender and preferred shirt color are independent.
Alternative Hypothesis (Ha): Gender and preferred shirt color are not independent.
2. Calculate the Expected Frequency:
To calculate the Chi-Square test statistic we first need the expected values. So, add up the rows and columns to obtain the row totals, column totals and overall total:
           Black  White  Red  Blue  Row Total
Male          48     12   33    57        150
Female        34     46   42    26        148
Col Total     82     58   75    83        298
From these totals we can calculate the expected value of each entry using Equation (1) to obtain the Expected Value Table:
Expected value = (row total * column total)/overall total (1)
For example, the expected value for Male and Black is computed as (150 x 82)/298 = 41.27 ≈ 41.3. Repeating this for every cell gives:
           Black  White    Red   Blue
Male       41.28  29.19  37.75  41.78
Female     40.72  28.81  37.25  41.22
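As an illustration of Equation (1) in code (a small sketch, not part of the original example), the whole expected value table can be computed at once from the row and column totals with numpy:
import numpy as np

observed = np.array([[48, 12, 33, 57],
                     [34, 46, 42, 26]])
row_totals = observed.sum(axis=1)  # [150, 148]
col_totals = observed.sum(axis=0)  # [82, 58, 75, 83]
grand_total = observed.sum()  # 298

# Equation (1) applied to every cell: E_ij = (row i total * column j total) / overall total
expected = np.outer(row_totals, col_totals) / grand_total
print(expected.round(2))  # [[41.28 29.19 37.75 41.78] [40.72 28.81 37.25 41.22]]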
3. Calculate the Chi-Square Statistic:
χ²_obs = Σ (Oᵢ − Eᵢ)² / Eᵢ
where Oᵢ is the observed frequency and Eᵢ is the expected frequency of each cell, and the sum runs over all cells of the table.
χ²_obs = 34.9572 (using the rounded expected values; with the full-precision expected values the statistic is ≈ 34.97, which is the value the Python code below reports)
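As a quick numerical check (again a minimal sketch, not part of the original example), the statistic is obtained by summing (O − E)²/E over every cell of the table:
import numpy as np

observed = np.array([[48, 12, 33, 57],
                     [34, 46, 42, 26]])
expected = np.outer(observed.sum(axis=1), observed.sum(axis=0)) / observed.sum()

# Sum the contribution of every cell (all eight cells: both rows and all four columns)
chi_square_statistic = ((observed - expected) ** 2 / expected).sum()
print(round(chi_square_statistic, 4))  # 34.9677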
4. Determine the Degrees of Freedom:
Degrees of freedom (df) = (number of rows − 1)*(number of columns − 1),
df = (2-1) * (4-1) = 3.
5. Find the Critical Value:
At a 5% level of significance (α = 0.05) and 3 degrees of freedom, the critical value from the Chi-Square distribution table is approximately χ²_crit = 7.815.
6. Compare the Chi-Square Statistic to the Critical Value:
The observed statistic χ²_obs is compared with the critical value χ²_crit to decide whether to reject the null hypothesis:
Reject H0: χ²_obs >= χ²_crit
Do not reject H0: χ²_obs < χ²_crit
For the data of this problem:
χ²_obs = 34.9572 >= χ²_crit = 7.815
7. Conclusion:
Since the calculated Chi-Square value is greater than the critical value, we reject the null hypothesis. We conclude that gender and preferred shirt color are not independent.
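The whole manual analysis above can also be reproduced with a single call to scipy.stats.chi2_contingency, which returns the statistic, the p-value, the degrees of freedom and the expected table at once. A minimal sketch (the approximate values in the comment follow from the exact expected frequencies):
from scipy.stats import chi2_contingency

observed = [[48, 12, 33, 57],
            [34, 46, 42, 26]]
statistic, p_value, dof, expected = chi2_contingency(observed)
print(f"chi2 = {statistic:.4f}, p-value = {p_value:.2e}, dof = {dof}")
# chi2 ≈ 34.9677, p-value ≈ 1.24e-07, dof = 3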
Chi-Square Test of Independence - Step-by-Step Manual Example - Python Using the Critical Value
The next Python code automates the manual calculations above and shows how to use the command stats.chi2.ppf from the scipy.stats library to compute the Chi-Square critical value directly, so that no distribution table is needed.
import pandas as pd
from scipy.stats import chi2_contingency
from scipy.stats import chi2
# Given dataset
df_dict = {
'Black': [48, 34],
'White': [12, 46],
'Red': [33, 42],
'Blue': [57, 26]
}
dataset_table = pd.DataFrame(df_dict, index=['Male', 'Female'])
print("Dataset Table:")
print(dataset_table)
print()
# Observed Values
Observed_Values = dataset_table.values
print("Observed Values:")
print(Observed_Values)
print()
# Perform chi-square test
val = chi2_contingency(dataset_table)
Expected_Values = val[3]
print("Expected Values:")
print(Expected_Values)
print()
# Degrees of freedom: (rows - 1) * (columns - 1)
no_of_rows = len(dataset_table.index)
no_of_columns = len(dataset_table.columns)
ddof = (no_of_rows - 1) * (no_of_columns - 1)
print("Degree of Freedom:", ddof)
print()
# Chi-square statistic: sum (O - E)^2 / E over every cell of the table
chi_square = sum([(o - e) ** 2. / e for o, e in zip(Observed_Values, Expected_Values)])
chi_square_statistic = chi_square.sum()
print(f"Chi-square statistic: {chi_square_statistic:.4f}")
print()
# Critical value
alpha = 0.05
critical_value = chi2.ppf(q=1-alpha, df=ddof)
print('Critical value:', critical_value)
print()
# Significance level
print('Significance level:', alpha)
print('Degree of Freedom:', ddof)
print()
# Hypothesis testing
if chi_square_statistic >= critical_value:
    print("Reject H0: gender and preferred shirt color are not independent (they are dependent)")
else:
    print("Fail to reject H0: gender and preferred shirt color are independent")
print()
Dataset Table:
Black White Red Blue
Male 48 12 33 57
Female 34 46 42 26
Observed Values:
[[48 12 33 57]
[34 46 42 26]]
Expected Values:
[[41.27516779 29.19463087 37.75167785 41.77852349]
[40.72483221 28.80536913 37.24832215 41.22147651]]
Degree of Freedom: 3
Chi-square statistic: 34.9677
Critical value: 7.814727903251179
Significance level: 0.05
Degree of Freedom: 3
Reject H0: gender and preferred shirt color are not independent (they are dependent)
The Python code with the data and the detailed computation to apply the Test of Independence is given at:
https://colab.research.google.com/drive/19WdSUSXTuuVL1bB1NhtG6xvqJrC2DPSS?usp=sharing
Chi-Square Test of Independence - Step-by-Step Manual Example - Python Using the P-Value
The next Python code automates the manual calculations above and shows how to use chi2.cdf from the scipy.stats library to compute the p-value directly, so that the decision is made by comparing the p-value with the significance level instead of using a critical value.
import pandas as pd
from scipy.stats import chi2_contingency
from scipy.stats import chi2
# Given dataset
df_dict = {
'Black': [48, 34],
'White': [12, 46],
'Red': [33, 42],
'Blue': [57, 26]
}
dataset_table = pd.DataFrame(df_dict, index=['Male', 'Female'])
print("Dataset Table:")
print(dataset_table)
print()
# Observed Values
Observed_Values = dataset_table.values
print("Observed Values:")
print(Observed_Values)
print()
# Perform chi-square test
val = chi2_contingency(dataset_table)
Expected_Values = val[3]
print("Expected Values:")
print(Expected_Values)
print()
# Degrees of freedom: (rows - 1) * (columns - 1)
no_of_rows = len(dataset_table.index)
no_of_columns = len(dataset_table.columns)
ddof = (no_of_rows - 1) * (no_of_columns - 1)
print("Degree of Freedom:", ddof)
print()
# Chi-square statistic: sum (O - E)^2 / E over every cell of the table
chi_square = sum([(o - e) ** 2. / e for o, e in zip(Observed_Values, Expected_Values)])
chi_square_statistic = chi_square.sum()
print(f"Chi-square statistic: {chi_square_statistic:.4f}")
print()
# Significance level
alpha = 0.05
# p-value: probability of a statistic at least this large under H0
p_value = 1 - chi2.cdf(x=chi_square_statistic, df=ddof)
print(f'p-value: {p_value:.2e}')
print()
# Decision inputs: compare the p-value with the significance level
print('Significance level:', alpha)
print(f'p-value: {p_value:.2e}')
if p_value <= alpha:
    print("Reject H0: gender and preferred shirt color are not independent (they are dependent)")
else:
    print("Fail to reject H0: gender and preferred shirt color are independent")
Dataset Table:
Black White Red Blue
Male 48 12 33 57
Female 34 46 42 26
Observed Values:
[[48 12 33 57]
[34 46 42 26]]
Expected Values:
[[41.27516779 29.19463087 37.75167785 41.77852349]
[40.72483221 28.80536913 37.24832215 41.22147651]]
Degree of Freedom: 3
Chi-square statistic: 34.9677
p-value: 1.24e-07
Significance level: 0.05
p-value: 1.24e-07
Reject H0: gender and preferred shirt color are not independent (they are dependent)
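A small side note on the p-value computation: for very small p-values, 1 - chi2.cdf(...) can lose precision to floating-point cancellation, and chi2.sf (the survival function) is the numerically safer equivalent. A minimal sketch using the statistic and degrees of freedom obtained above:
from scipy.stats import chi2

chi_square_statistic = 34.9677  # value computed above
ddof = 3
# sf(x) = 1 - cdf(x), evaluated directly without the subtraction
p_value = chi2.sf(chi_square_statistic, df=ddof)
print(f"p-value: {p_value:.2e}")  # ≈ 1.24e-07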
The Python code with the data and the detailed computation to apply the Test of Independence is given at:
https://colab.research.google.com/drive/19WdSUSXTuuVL1bB1NhtG6xvqJrC2DPSS?usp=sharing