1. Concepts & Definitions
1.1. A Review on Parametric Statistics
1.2. Parametric tests for Hypothesis Testing
1.3. Parametric vs. Non-Parametric Test
1.4. One sample z-test and its relation with the two-sample z-test
1.5. One sample t-test and its relation with the two-sample t-test
1.6. Welch's two-sample t-test: two populations with different variances
1.7. Non-Parametric test for Hypothesis Testing: Mann-Whitney U Test
1.8. Non-Parametric test for Hypothesis Testing: Wilcoxon Sign-Rank Test
1.9. Non-Parametric test for Hypothesis Testing: Wilcoxon Sign Test
1.10. Non-Parametric test for Hypothesis Testing: Chi-Square Goodness-of-Fit
1.11. Non-Parametric test for Hypothesis Testing: Kolmogorov-Smirnov
1.12. Non-Parametric tests for comparing machine learning models
2. Problem & Solution
2.1. Using Wilcoxon Sign Test to compare clustering methods
2.2. Using Wilcoxon Sign-Rank Test to compare clustering methods
2.3. What is A/B testing and how to combine with hypothesis testing?
2.4. Using Chi-Square fit to check if Benford-Law holds or not
2.5. Using Kolmogorov-Smirnov fit to check if Pareto principle holds or not
What is a T-test?
A T-test is a statistical test used to determine whether the mean of a sample is significantly different from a known population mean when the population standard deviation is unknown. It is particularly useful when the sample size is small (n < 30) [1].
This kind of statistical test, and the proper situation in which to apply it, was described in Track 08, section 1.6.
A numerical example of a T-test
The average service time of a company in 2018 was 12.44 minutes. Management wants to know whether the current arithmetic mean is different from 12.44 minutes. A sample with 25 values had an arithmetic mean of 13.71 minutes and a standard deviation of 2.65 minutes. Using α = 5%, can you conclude whether the time is currently different?
Let's recall the steps used to solve the numerical example presented above.
Null hypothesis (Ho): The mean has not changed, i.e., μ = 12.44.
Alternate hypothesis (Ha): The mean has changed, i.e., μ ≠ 12.44.
From the table described in the step to choose a statistical test, the signs of the hypotheses Ho and Ha indicate that a two-tailed test should be carried out.
Since the sample size is smaller than 30, a Student's t-distribution should be employed.
The next table helps to understand the relation between the confidence level, alpha (α), and the critical value t α/2 for the given degrees of freedom (sample size - 1).
Using α = 5% and n = 25 (df = 25 - 1 = 24) leads to t α/2 = 2.064.
To compute the test statistic, it is necessary to convert the observed sample mean (x̄) to the scale of the Student's t-distribution (tobs). This can be done using the following equation:
tobs = (x̄ - μ)/(s/(n^0.5))
This equation will result in the following numbers:
tobs = (13.71 - 12.44)/(2.65/(25^0.5)) = (13.71-12.44)/0.53 = 2.40
Since tobs = 2.40 is greater than the upper critical value t α/2 = 2.064, we can reject the null hypothesis.
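The computation above can be reproduced with a short Python sketch, using SciPy to obtain the critical value instead of a table:

```python
import numpy as np
from scipy import stats

# Values taken from the service-time example above
mu0 = 12.44   # hypothesized population mean (2018 average, in minutes)
xbar = 13.71  # sample mean
s = 2.65      # sample standard deviation
n = 25        # sample size
alpha = 0.05

# Test statistic: tobs = (x̄ - μ)/(s/√n)
t_obs = (xbar - mu0) / (s / np.sqrt(n))
# Two-tailed critical value for df = n - 1
t_crit = stats.t.ppf(1 - alpha / 2, df=n - 1)

print(f"t_obs = {t_obs:.2f}")     # 2.40
print(f"t_crit = ±{t_crit:.3f}")  # ±2.064
print("Reject Ho" if abs(t_obs) > t_crit else "Fail to reject Ho")
```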
When to use a T-test
The following conditions must hold to apply a T-test:
The sample size should be less than 30. Otherwise, we should use the z-test.
Samples should be drawn at random from the population.
The standard deviation of the population should be unknown (when it is known, the z-test applies).
Samples that are drawn from the population should be independent of each other.
The underlying data should be approximately normally distributed.
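The normality condition can be checked in practice, for instance with a Shapiro-Wilk test. A minimal sketch, using synthetic service-time data (the sample values below are assumed for illustration):

```python
import numpy as np
from scipy import stats

# Hypothetical sample of 25 service times (minutes)
rng = np.random.default_rng(42)
sample = rng.normal(loc=12.5, scale=2.5, size=25)

# Shapiro-Wilk tests Ho: "the sample comes from a normal distribution"
stat, p = stats.shapiro(sample)
print(f"Shapiro-Wilk: W = {stat:.3f}, p = {p:.3f}")
if p > 0.05:
    print("No evidence against normality; a t-test is reasonable")
else:
    print("Normality is doubtful; consider a non-parametric test")
```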
Types of T-test
Assuming that [1]:
Null Hypothesis: The null hypothesis is a statement that the value of a population parameter (such as proportion, mean, or standard deviation) is equal to some claimed value. We either reject or fail to reject the null hypothesis. The null hypothesis is denoted by H0.
Alternate Hypothesis: The alternative hypothesis is the statement that the parameter has a value that is different from the claimed value. It is denoted by HA.
Level of significance: The probability threshold at which we reject the null hypothesis. Since 100% certainty in accepting or rejecting a hypothesis is not possible in most experiments, we select a level of significance. It is denoted by alpha (α).
Then:
Two-tailed T-test: The region of rejection is located at both extremes of the distribution. Here our null hypothesis is that the population mean is equal to the claimed value.
Left-tailed T-Test: The region of rejection is located at the extreme left of the distribution. Here our null hypothesis is that the population mean is greater than or equal to the claimed value.
Right-tailed T-Test: The region of rejection is located at the extreme right of the distribution. Here our null hypothesis is that the population mean is less than or equal to the claimed value.
[Figures: rejection regions for the two-tailed, right-tailed, and left-tailed T-tests]
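The critical values that separate the rejection regions in the three cases can be obtained from the t-distribution. A minimal sketch with SciPy, assuming α = 5% and df = 24 as in the earlier example:

```python
from scipy import stats

df = 24       # degrees of freedom (n - 1)
alpha = 0.05

# Two-tailed: reject Ho when |tobs| > t_{alpha/2}
two_tailed = stats.t.ppf(1 - alpha / 2, df)
# Right-tailed: reject Ho when tobs > t_{alpha}
right_tailed = stats.t.ppf(1 - alpha, df)
# Left-tailed: reject Ho when tobs < -t_{alpha} (ppf returns the negative value directly)
left_tailed = stats.t.ppf(alpha, df)

print(f"two-tailed:   ±{two_tailed:.3f}")   # ±2.064
print(f"right-tailed: {right_tailed:.3f}")  # 1.711
print(f"left-tailed:  {left_tailed:.3f}")   # -1.711
```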
The Python code, with the data and detailed computations to generate the three types of T-test, is given at:
https://colab.research.google.com/drive/1_nR2eMhvXxVY-HCVDjpRiQ2-2SV7TeqT?usp=sharing
Two-sample T-test
Suppose we are given two independent, approximately normally distributed populations, and we have drawn samples at random from both. Here, we consider μ1 and μ2 to be the population means, and x̄1 and x̄2 to be the observed sample means. Our hypotheses could be stated as follows:
Null hypothesis (Ho): There is no difference between the means, i.e., μ1 - μ2 = 0.
Alternate hypothesis (Ha): The means differ, i.e., μ1 - μ2 ≠ 0.
The formula for calculating the t-test statistic tobs is given by [2, 3]:
tobs = (x̄1 - x̄2) / √(s1²/n1 + s2²/n2)
where x̄1 = first sample mean, x̄2 = second sample mean, s1 = first sample standard deviation, s2 = second sample standard deviation, n1 = first sample size, n2 = second sample size, and df (degrees of freedom) = min(n1, n2) - 1.
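This computation is also available in SciPy as `ttest_ind_from_stats`, which works directly from summary statistics. Note that SciPy uses the pooled degrees of freedom (n1 + n2 - 2) rather than the conservative min(n1, n2) - 1 used here, so the p-value differs slightly, although the statistic itself matches. A sketch using the factory numbers from the example below:

```python
from scipy import stats

# Summary statistics of the two factories (from the example below)
t_stat, p_value = stats.ttest_ind_from_stats(
    mean1=10.5, std1=1.0, nobs1=21,  # Factory 1
    mean2=9.5, std2=1.0, nobs2=25,   # Factory 2
    equal_var=True,                  # pooled-variance two-sample t-test
)
print(f"t = {t_stat:.3f}")  # 3.378
print(f"p = {p_value:.4f}")
```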
A numerical example of a two-sample T-test
Let’s consider that the first factory shares 21 samples of ball bearings where the mean diameter of the sample comes out to be 10.5 cm. On the other hand, the second factory shares 25 samples with a mean diameter of 9.5 cm. Both have a standard deviation of 1 cm [3]:
Step 1: Pose the research question and determine the proper statistical test.
The company wants to determine whether the mean diameter of the ball bearings produced by Factory 1 is different from that of Factory 2. To do this, we will use a two-sample t-test for means.
Step 2: Obtain the samples statistics from the two factories (populations).
Factory 1: x̄1 = 10.5, s1 = 1, n1 = 21.
Factory 2: x̄2 = 9.5, s2 = 1, n2 = 25.
Step 3: Formulate the null and alternate hypotheses and set the level of significance for the test.
Null hypothesis (Ho): There is no difference between the mean diameters of the ball bearings from the two factories, i.e., μ1 - μ2 = 0.
Alternate hypothesis (Ha): There is a difference between the mean diameters, i.e., μ1 - μ2 ≠ 0.
We will perform a two-tailed test using significance level α = 5%.
df (degrees of freedom) = min(n1, n2) - 1 = 21 - 1 = 20
Step 4: Use the formula for the two-sample t-test for means to calculate the test statistic tobs.
tobs = (x̄1 – x̄2 ) / √((s1 )²/n1 + (s2)²/n2)
tobs = (1) / √((1)²/21 + (1)²/25))
tobs = 3.378
Step 5: Compare tobs with the critical value t α/2 from the previous table.
Step 6: Since tobs = 3.378 is greater than the critical value t α/2 = 2.086, we can reject the null hypothesis.
A numerical example of a two-sample T-test in Python
The following Python code automates the manual computation made previously [2, 3].
import numpy as np
import scipy.stats as stats
import matplotlib.pyplot as plt
# Step 2: Sample statistics
x1 = 10.5 # Sample mean of Factory 1
s1 = 1 # Sample standard deviation of Factory 1
n1 = 21 # Sample size of Factory 1
x2 = 9.5 # Sample mean of Factory 2
s2 = 1 # Sample standard deviation of Factory 2
n2 = 25 # Sample size of Factory 2
# Degrees of freedom
df = min(n1, n2) - 1
# Step 4: Calculate the t-test statistic
t_obs = (x1 - x2) / np.sqrt((s1 ** 2) / n1 + (s2 ** 2) / n2)
print(f"Tobs: {t_obs:.3f}")
# Step 5: Critical values for a two-tailed test at alpha = 0.05
alpha = 0.05
t_critical = stats.t.ppf(1 - alpha/2, df)
print(f"Critical value (t_alpha/2): ±{t_critical:.3f}")
# Step 6: Conclusion
if abs(t_obs) > t_critical:
    conclusion = "Reject the null hypothesis"
else:
    conclusion = "Fail to reject the null hypothesis"
print(conclusion)
# Plotting the results
x = np.linspace(-4, 4, 1000)
y = stats.t.pdf(x, df)
plt.figure(figsize=(10, 6))
plt.plot(x, y, label=f't-Distribution (df={df})')
# Critical regions
plt.fill_between(x, y, where=(x < -t_critical) | (x > t_critical), color='red', alpha=0.3, label='Critical regions')
# Tobs line
plt.axvline(t_obs, color='blue', linestyle='--', label=f'Tobs = {t_obs:.3f}')
plt.text(t_obs - 0.1, max(y)*0.5, f'Tobs = {t_obs:.3f}', color='blue', ha='right')
# Critical region text
plt.text(-t_critical, max(y)*0.1, f'Critical region: -{t_critical:.3f}', color='red', ha='center')
plt.text(t_critical, max(y)*0.1, f'Critical region: {t_critical:.3f}', color='red', ha='center')
# Formatting the plot
plt.title('Two-Tailed t-Test')
plt.xlabel('t-value')
plt.ylabel('Probability Density')
plt.legend()
plt.grid(True)
plt.show()
Tobs: 3.378
Critical value (t_alpha/2): ±2.086
Reject the null hypothesis
The Python code, with the data and detailed computation to employ the two-sample T-test to verify whether the two classes have the same mean, is given at:
https://colab.research.google.com/drive/1igHfmxU5P7veSXIA0eatBsnuBdvun8jY?usp=sharing