1.6. Code for small sample test of hypothesis for mean

Numerical example and its solution summary

The average service time of a company in 2018 was 12.44 minutes. Management wants to know whether the current arithmetic mean is different from 12.44 minutes. A sample of 25 values had an arithmetic mean of 13.71 minutes and a standard deviation of 2.65 minutes. Using α = 5%, can you conclude that the service time is currently different?

Let's recall the steps used to solve the numerical example presented above.

Null hypothesis (Ho): The mean has not changed, i.e., μ = 12.44.
Alternate hypothesis (Ha): The mean has changed, i.e., μ ≠ 12.44.
From the table described in the step for choosing a statistical test, the signs in the hypotheses Ho and Ha (= and ≠) indicate that a two-tailed test should be carried out.
Since the sample is smaller than 30, a Student's t distribution can be employed.
The next table helps to understand the relation between the confidence level, alpha (α), and the critical value t α/2 for a given number of degrees of freedom (sample size - 1).

The code to create the previous table is given as follows.

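from scipy.stats import t
import pandas as pd

# two-tailed significance levels (one table column per alpha)
alpha_list = [0.2, 0.1, 0.05, 0.02, 0.01, 0.002, 0.001]
dof_range = range(1, 26)  # degrees of freedom 1..25 (one table row each)

rows = []
for dof in dof_range:
    v_list = []
    for alpha in alpha_list:
        # critical value t α/2: leaves alpha/2 in the upper tail
        v = round(t.ppf(1 - alpha/2, dof), 3)
        v_list.append(v)
    rows.append(v_list)

cols = [str(x) for x in alpha_list]
# index the rows by degrees of freedom instead of 0..24
df = pd.DataFrame(rows, columns=cols, index=dof_range)
df.index.name = "df"
print(df)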

The complete code above is available at the following link:

https://colab.research.google.com/drive/1P7DRHjbNfrVRrJqe2RL4w4uuvfgW5Osm?usp=sharing
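As a quick check of the relation between confidence level, α, and t α/2 described above, the following minimal sketch looks up the critical value for α = 5% and 24 degrees of freedom, compares it with the standard normal value, and confirms that the area between -t α/2 and +t α/2 equals the confidence level. The names dof, t_crit, z_crit, and central_area are illustrative and do not come from the original notebook.

```python
from scipy.stats import norm, t

alpha = 0.05
dof = 24  # n - 1 for a sample of 25 values (illustrative name)

# upper critical values that leave alpha/2 in each tail
t_crit = t.ppf(1 - alpha / 2, dof)
z_crit = norm.ppf(1 - alpha / 2)

# area under the t curve between -t_crit and +t_crit
central_area = t.cdf(t_crit, dof) - t.cdf(-t_crit, dof)

print("t critical:", round(t_crit, 3))
print("z critical:", round(z_crit, 3))
print("central area:", round(central_area, 4))
```

The t critical value is larger than the normal one, which reflects the heavier tails of the Student's t distribution for small samples.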

Numerical example and its solution summary - more steps

Using α = 5% and n = 25 (df = 25 - 1 = 24) leads to t α/2 = 2.064.
To carry out the statistical test, it is necessary to convert the observed sample mean (x̄) to the scale of the Student's t distribution (Tobs). This can be done using the following equation:

tobs = (x̄ - μ)/(s/(n^0.5))

This equation will result in the following numbers:

tobs = (13.71 - 12.44)/(2.65/(25^0.5)) = (13.71-12.44)/0.53 = 2.40

Since tobs = 2.40 is greater than the upper critical value t α/2 = 2.06, we can reject the null hypothesis at the 5% significance level.

The corresponding Python code

The previous solution can be summarized by the following Python code. First, let's compute the critical value Tcrit for a specified α.

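from scipy.stats import t

n = 25            # sample size
gl = n - 1        # degrees of freedom
alfa = 0.05       # significance level
p = 1 - alfa/2    # cumulative probability up to the upper critical value (two-tailed)
ts = t.ppf(p, gl) # upper critical value Tcrit
print("Given alpha =", round(alfa, 2), ", Tcrit =", round(ts, 2))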

Given alpha = 0.05 , Tcrit = 2.06

The next step is to compute Tobs, compare it with Tcrit, and make a decision.

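mi = 12.44                 # hypothesized mean under H0
H0 = "The value is equal to " + str(mi)
n = 25                     # sample size
xb = 13.71                 # sample mean
s = 2.65                   # sample standard deviation
sx = s/(n**(0.5))          # standard error of the mean
tobs = (xb - mi)/sx        # observed t statistic
print("Tobs =", tobs, "and Tcrit =", ts)

# two-tailed test: the critical region is tobs > +Tcrit or tobs < -Tcrit
if (tobs > ts) or (tobs < -ts):
    print("Reject H0:", H0)
else:
    print("Do not reject H0:", H0)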

Tobs = 2.396226415094342 and Tcrit = 2.0638985616280205

Reject H0: The value is equal to 12.44
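The same decision can also be reached through a p-value, which is equivalent to comparing |Tobs| with Tcrit in a two-tailed test. The following is a minimal sketch under that assumption; the helper two_tailed_t_test is hypothetical and not part of the original notebook, and it reuses the variable names from the code above.

```python
from scipy.stats import t

def two_tailed_t_test(xb, mi, s, n, alfa=0.05):
    """One-sample, two-tailed t test from summary statistics (illustrative helper)."""
    gl = n - 1                          # degrees of freedom
    sx = s / n ** 0.5                   # standard error of the mean
    tobs = (xb - mi) / sx               # observed t statistic
    tcrit = t.ppf(1 - alfa / 2, gl)     # upper critical value
    p_value = 2 * t.sf(abs(tobs), gl)   # area in both tails beyond |tobs|
    reject = bool(abs(tobs) > tcrit)    # same decision as p_value < alfa
    return tobs, tcrit, p_value, reject

tobs, tcrit, p_value, reject = two_tailed_t_test(13.71, 12.44, 2.65, 25)
print("Tobs =", round(tobs, 2), "and Tcrit =", round(tcrit, 2))
print("p-value =", round(p_value, 4), "-> reject H0:", reject)
```

Using abs(tobs) covers both tails at once, so a large negative Tobs would be handled the same way as a large positive one.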

Finally, the next code helps to visualize the critical regions of the hypothesis test, bounded by -Tcrit and +Tcrit, together with Tobs, so that their values can be compared to make a decision.

import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import t

# set the style before creating the figure so that it takes effect
plt.style.use('fivethirtyeight')

x1 = np.arange(-ts, ts, 0.001)     # non-rejection region, from -Tcrit to +Tcrit
x_all = np.arange(-10, 10, 0.001)  # entire range of t, inside and outside that region

# Student's t density with gl degrees of freedom (centered at 0)
y1 = t.pdf(x1, gl)
y_all = t.pdf(x_all, gl)

# build the plot
fig, ax = plt.subplots(figsize=(9,6))
ax.plot(x_all, y_all)
mut = 0  # center of the t distribution
ax.text(mut, 0.4, 'Ho', fontsize=14)
ax.text(mut-1.2*ts, 0.1, 'Ha', fontsize=14)
ax.text(mut+ts, 0.1, 'Ha', fontsize=14)
ax.fill_between(x1, y1, 0, alpha=0.7, color='g')        # region where Ho is not rejected
ax.fill_between(x_all, y_all, 0, alpha=0.1, color='r')  # the tails left in light red are the critical regions
ax.set_xlim([-6,6])
ax.set_xlabel('# of Standard Errors from the Hypothesized Mean')
ax.set_yticklabels([])
ax.set_title('Student T Curve CL = '+str(round((1-alfa)*100,0))+' %')

# drawing Tobs
ax.axvline(x = tobs, color = 'r', label = 'Tobs')
ax.text(1.1*tobs, 0.1, 'Tobs', fontsize=14)
plt.show()

The complete code above is available at the following link:

https://colab.research.google.com/drive/1Q3V3WvMsjwUngr19SxhAmbalCzay7qMc?usp=sharing

