Central limit theorem for sample means

This section covers some important properties of the sampling distribution of the mean introduced in the demonstrations in this chapter and defined in the central limit theorem.

Mean

The mean of the sampling distribution of the mean is the mean of the population from which the scores were sampled. Therefore, if a population has a mean μ, then the mean of the sampling distribution of the mean is also μ. The symbol $\mu_M$ is used to refer to the mean of the sampling distribution of the mean. Therefore, the formula for the mean of the sampling distribution of the mean can be written as:

$$\mu_M = \mu$$

Variance

The variance of the sampling distribution of the mean is computed as follows:

$$\sigma_M^2 = \frac{\sigma^2}{N}$$

That is, the variance of the sampling distribution of the mean is the population variance divided by N, the sample size (the number of scores used to compute a mean). Thus, the larger the sample size, the smaller the variance of the sampling distribution of the mean.

(optional) This expression can be derived from the variance sum law, which applies when the scores are sampled independently. Let's begin by computing the variance of the sampling distribution of the sum of three numbers sampled from a population with variance $\sigma^2$. The variance of the sum would be $\sigma^2 + \sigma^2 + \sigma^2$. For N numbers, the variance would be $N\sigma^2$. Since the mean is 1/N times the sum, the variance of the sampling distribution of the mean would be $1/N^2$ times the variance of the sum, which equals $\sigma^2/N$.
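The derivation can also be written out line by line. This is a sketch that assumes the N scores are sampled independently; the symbols $X_1, \dots, X_N$ (the individual scores) and $\operatorname{Var}$ are notation introduced here for illustration.

```latex
\begin{align*}
\operatorname{Var}\!\left(\sum_{i=1}^{N} X_i\right)
  &= \sum_{i=1}^{N} \operatorname{Var}(X_i) = N\sigma^2 \\
\sigma_M^2
  = \operatorname{Var}\!\left(\frac{1}{N}\sum_{i=1}^{N} X_i\right)
  &= \frac{1}{N^2}\, N\sigma^2 = \frac{\sigma^2}{N}
\end{align*}
```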

Standard Deviation (Standard Error)

The standard deviation of the sampling distribution of the mean is also called the standard error of the mean (SEM). It is the square root of the variance of the sampling distribution of the mean and can be written as:

$$\sigma_M = \frac{\sigma}{\sqrt{N}}$$

The standard error is represented by a σ because it is a standard deviation. The subscript (M) indicates that the standard error in question is the standard error of the mean.
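As a small illustration, the standard error can be computed directly from the population standard deviation and the sample size. The function name `standard_error_of_mean` and the example values (σ = 10 and the three sample sizes) are hypothetical and chosen only to show the pattern.

```python
import math


def standard_error_of_mean(sigma: float, n: int) -> float:
    """Return sigma / sqrt(n), the standard deviation of the sampling
    distribution of the mean for samples of size n."""
    if sigma < 0:
        raise ValueError("sigma must be non-negative")
    if n < 1:
        raise ValueError("n must be at least 1")
    return sigma / math.sqrt(n)


# Hypothetical population standard deviation of 10 with increasing sample sizes.
for n in (4, 25, 100):
    print(n, standard_error_of_mean(10.0, n))
```

With these values the standard errors are 5.0, 2.0, and 1.0: quadrupling the sample size halves the standard error, because N enters through its square root.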

Central Limit Theorem

The central limit theorem describes the sampling distribution of the mean. As with most distributions, we would like to know its mean (location), variance (spread), and shape. The central limit theorem specifies all three: given a population with a finite mean μ and a finite non-zero variance $\sigma^2$, the sampling distribution of the mean

1) approaches a normal distribution as the sample size N increases,

2) has a mean of μ, and

3) has a variance of $\sigma^2/N$.

The expressions for the mean and variance of the sampling distribution of the mean are not new or remarkable. What is remarkable is that regardless of the shape of the parent population, the sampling distribution of the mean approaches a normal distribution as N increases. If you have used the "Central Limit Theorem Demo," you have already seen this for yourself. As a reminder, Figure 1 shows the results of the simulation for N = 2 and N = 10. The parent population was a uniform distribution. You can see that the distribution for N = 2 is far from a normal distribution. Nonetheless, it does show that the scores are denser in the middle than in the tails. For N = 10 the distribution is quite close to a normal distribution. Notice that the means of the two distributions are the same, but that the spread of the distribution for N = 10 is smaller.
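A simulation of this kind can be sketched in a few lines. The code below is not the demo itself: it assumes a Uniform(0, 1) parent population, 20,000 replications, and a fixed seed, all of which are choices made here for illustration. It compares the mean and variance of the simulated sample means with the theoretical values μ = 1/2 and σ²/N = (1/12)/N.

```python
import random
import statistics


def simulate_sample_means(n, replications=20_000, seed=0):
    """Draw `replications` samples of size n from Uniform(0, 1)
    and return the list of their sample means."""
    rng = random.Random(seed)
    return [
        sum(rng.random() for _ in range(n)) / n
        for _ in range(replications)
    ]


for n in (2, 10):
    means = simulate_sample_means(n)
    # Uniform(0, 1) has mean 1/2 and variance 1/12,
    # so the variance of the sample mean should be close to (1/12) / n.
    print(
        f"N={n}: mean of means={statistics.fmean(means):.4f}, "
        f"variance of means={statistics.pvariance(means):.5f}, "
        f"theoretical variance={(1 / 12) / n:.5f}"
    )
```

The two runs should give nearly the same mean, while the variance for N = 10 should be about one fifth of the variance for N = 2.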

Figure 2 shows how closely the sampling distribution of the mean approximates a normal distribution even when the parent population is very non-normal. If you look closely you can see that the sampling distributions do have a slight positive skew. The larger the sample size, the closer the sampling distribution of the mean would be to a normal distribution.
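The shrinking skew can be checked numerically. The parent population used for Figure 2 is not specified here, so this sketch substitutes an exponential distribution with rate 1 as a stand-in for a strongly right-skewed population; the function names, sample sizes, replication count, and seed are likewise assumptions.

```python
import random
import statistics


def skewness(values):
    """Population skewness: the average cubed standardized deviation."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return statistics.fmean([((v - mean) / sd) ** 3 for v in values])


def sample_mean_skewness(n, replications=20_000, seed=1):
    """Skewness of the simulated sampling distribution of the mean
    for samples of size n from an exponential(1) population."""
    rng = random.Random(seed)
    means = [
        sum(rng.expovariate(1.0) for _ in range(n)) / n
        for _ in range(replications)
    ]
    return skewness(means)


for n in (5, 25):
    print(f"N={n}: skewness of sample means = {sample_mean_skewness(n):.3f}")
```

For this parent population the skewness stays positive at both sample sizes but is smaller for the larger N.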

Importance of the Central Limit Theorem: Z-scores for Sample Means

If we have a sample of 10 people, we would want to compare our sample mean to the means of other samples of 10 people, since this is the fairest comparison (we would not want to compare the mean of our sample of 10 to the score of a single individual or to the mean of a sample of 100 people). The central limit theorem tells us the mean, standard deviation, and approximate shape of the sampling distribution of the mean for a given sample size. This lets us compute z-scores and probabilities for our sample means, which is immensely important for inferential statistics.

The z-score for a sample mean M computed from one sample of size N would be:

$$z = \frac{M - \mu_M}{\sigma_M} = \frac{M - \mu}{\sigma / \sqrt{N}}$$
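As a sketch of the calculation, the z-score for a sample mean divides the distance between the sample mean and the population mean by the standard error σ/√N. The function name `z_for_sample_mean` and the numbers used (population mean 100, standard deviation 15, a sample of 25 with mean 106) are hypothetical.

```python
import math


def z_for_sample_mean(sample_mean: float, mu: float, sigma: float, n: int) -> float:
    """Return (sample_mean - mu) / (sigma / sqrt(n))."""
    standard_error = sigma / math.sqrt(n)
    return (sample_mean - mu) / standard_error


# Hypothetical example: the standard error is 15 / sqrt(25) = 3,
# so a sample mean of 106 lies (106 - 100) / 3 = 2 standard errors above mu.
z = z_for_sample_mean(106.0, mu=100.0, sigma=15.0, n=25)
print(z)
```

Note that the same sample mean would have a z-score of only 0.4 if it were treated as a single score (dividing by σ = 15 instead of the standard error), which is why the comparison must use the sampling distribution for the right sample size.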

____________________________________________________________

PRACTICE 1

The mean number of minutes for app engagement by a tablet user is 8.2 minutes. Suppose the standard deviation is one minute. Take a sample of 60.

  1. What are the mean and standard deviation of the sample mean number of minutes of app engagement by a tablet user?

  2. What is the standard error of the mean?

____________________________________________________________

References:

  1. https://courses.lumenlearning.com/introstats1/chapter/the-central-limit-theorem-for-sample-means-averages/

  2. Online Statistics Education: A Multimedia Course of Study (http://onlinestatbook.com/). Project Leader: David M. Lane, Rice University. http://onlinestatbook.com/2/sampling_distributions/samp_dist_mean.html

Answer

Practice 1.

  1. $\mu_M = \mu = 8.2$ minutes and $\sigma_M = \sigma/\sqrt{N} = 1/\sqrt{60} \approx 0.13$ minutes. These values let us calculate the probability that the mean of a sample of size 60 falls within a given distance of the population mean.

  2. The standard error of the mean is the standard deviation of the sampling distribution of the mean, which we just computed to be 0.13.
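The Practice 1 numbers can be checked, and turned into a probability, with Python's standard library. This is a sketch that relies on the normal approximation from the central limit theorem; the helper `prob_within` and the distance of 0.25 minutes are hypothetical and not part of the exercise.

```python
import math
from statistics import NormalDist

mu = 8.2     # population mean in minutes (from Practice 1)
sigma = 1.0  # population standard deviation in minutes (from Practice 1)
n = 60       # sample size (from Practice 1)

sem = sigma / math.sqrt(n)

# Normal approximation to the sampling distribution of the mean.
sampling_dist = NormalDist(mu=mu, sigma=sem)


def prob_within(distance: float) -> float:
    """Approximate probability that a sample mean falls within
    `distance` minutes of the population mean."""
    return sampling_dist.cdf(mu + distance) - sampling_dist.cdf(mu - distance)


print(round(sem, 2))      # standard error rounded to two decimals
print(prob_within(0.25))  # 0.25 minutes is an illustrative distance
```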