1. Concepts & Definitions
1.2. Central Limit Theorem (CLT)
1.5. Confidence interval and normal distribution
1.6. Applying normal confidence interval
1.7. Normal versus Student's T distributions
1.8. Confidence interval and Student T distribution
1.9. Applying Student T confidence interval
1.10. Estimating sample size using normal distribution
1.11. Estimating sample size using Student T distribution
1.12. Estimating proportion using samples
2. Problem & Solution
2.1. Confidence interval for weight of HS6 code
Now that we’ve seen both the standard normal distribution and a t-distribution with a single degree of freedom, let’s plot them together to see how they compare.
# Library imports
import numpy as np
import seaborn as sns
import scipy.stats as stats
import matplotlib.pyplot as plt
%matplotlib inline
# Normal distribution
x = np.linspace(-4, 4, 500)
y = stats.norm.pdf(x)
# T distribution
df = 1
y_t = stats.t.pdf(x, df)
# Plotting
plt.ylabel('Probability Density')
plt.xlabel('Standard Deviations')
plt.plot(x, y, color='blue', label='Normal Dist.')
plt.plot(x, y_t, color='green', label=f'T-Dist., df={df}')
plt.legend()
# Styling - optional
sns.set_context('notebook')
sns.despine()
With only a single degree of freedom, the t-distribution is much flatter and has fatter tails than the standard normal distribution. The power of the t-distribution comes from its ability to adjust for smaller sample sizes (and therefore fewer degrees of freedom) by giving a more conservative estimate of probability density. Put another way, the t-distribution accounts for the natural decrease in confidence at lower sample sizes, which the normal distribution does not.
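To make the fatter tails concrete, we can compare the probability of observing a value more than two standard deviations from the mean under each distribution. The snippet below is a small sketch using SciPy's survival function (sf); the cutoff of 2 is just an illustrative choice.
# Two-sided tail probability beyond 2 standard deviations
import scipy.stats as stats

threshold = 2  # illustrative cutoff, in standard deviations
p_normal = 2 * stats.norm.sf(threshold)  # normal tail area
p_t1 = 2 * stats.t.sf(threshold, df=1)   # t-distribution tail area, df=1

print(f"Normal dist.:   P(|X| > {threshold}) = {p_normal:.4f}")  # ~0.0455
print(f"T-dist. (df=1): P(|X| > {threshold}) = {p_t1:.4f}")      # ~0.2952
The t-distribution with one degree of freedom puts several times more probability mass in these tails, which is exactly the conservative behavior described above.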
As the degrees of freedom of the t-distribution change (with the location parameter held fixed), the probability density at the mean changes; in other words, the height of the t-distribution's peak changes. The next code helps to illustrate this point.
import matplotlib.pyplot as plt
import numpy as np
from scipy.stats import t

x = np.linspace(-5, 5, 100)
degrees_of_freedom = [1, 2, 5, 10]  # Varying degrees of freedom

# Plotting T-distribution curves for different degrees of freedom
for df in reversed(degrees_of_freedom):
    y = t.pdf(x, df)  # Using default location and scale parameters (0 and 1)
    plt.plot(x, y, label=f"Degrees of Freedom = {df}")

plt.xlabel('x')
plt.ylabel('PDF')
plt.title('T-Distribution with Varying Degrees of Freedom')
plt.legend()
plt.show()
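To quantify how the peak height changes, we can evaluate each density at the mean (x = 0) and compare it with the normal distribution's peak of 1/sqrt(2*pi) ≈ 0.3989. This is a small sketch that reuses the degrees_of_freedom list and the t import from the block above.
# Density at the mean (x = 0) for each degree of freedom
from scipy.stats import norm

for df in degrees_of_freedom:
    print(f"df = {df:2d}: t.pdf(0) = {t.pdf(0, df):.4f}")
print(f"Normal:   pdf(0) = {norm.pdf(0):.4f}")  # ~0.3989
As the degrees of freedom increase, the peak rises toward the normal distribution's value.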
At higher degrees of freedom, the t-distribution approximates the normal distribution, making it useful at both small and large sample sizes. The next code helps to illustrate this point.
# Library imports
import numpy as np
import seaborn as sns
import scipy.stats as stats
import matplotlib.pyplot as plt
from scipy.stats import t
%matplotlib inline

# Normal distribution
x = np.linspace(-5, 5, 500)
y = stats.norm.pdf(x)
plt.plot(x, y, color='blue', label='Normal Dist.')

# T distribution
# Plotting T-distribution curves for different degrees of freedom
degrees_of_freedom = [1, 2, 5, 10]
for df in reversed(degrees_of_freedom):
    y = t.pdf(x, df)  # Using default location and scale parameters (0 and 1)
    plt.plot(x, y, label=f"Degrees of Freedom = {df}")

# Plotting
plt.ylabel('Probability Density')
plt.xlabel('Standard Deviations')
plt.legend()

# Styling - optional
sns.set_context('notebook')
sns.despine()
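A practical way to see this convergence is to compare the two-sided 95% critical values (the 97.5th percentile) of the t-distribution against the normal distribution. The sketch below extends the degrees-of-freedom list with a few larger values purely for illustration.
# 95% two-sided critical values: t-distribution vs. normal
for df in [1, 2, 5, 10, 30, 100]:
    print(f"df = {df:3d}: t critical = {stats.t.ppf(0.975, df):.3f}")
print(f"Normal:     z critical = {stats.norm.ppf(0.975):.3f}")  # ~1.960
By around 30 degrees of freedom the t critical value is already close to the normal value of about 1.96, which is why the two distributions become practically interchangeable at larger sample sizes.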
The complete code for the plots in this section is available at the following link:
https://colab.research.google.com/drive/1oaJLYH-3HOWi5kRCqF4wszNVOQiaAAUg?usp=sharing
References:
[1] Comparing normal and Student's T distribution:
https://tjkyner.medium.com/the-normal-distribution-vs-students-t-distribution-322aa12ffd15
[2] Student's T distribution and the impact of degrees of freedom:
https://www.geeksforgeeks.org/python-students-t-distribution-in-statistics/