Normal Distribution Introduction

Introduction to the Normal Distribution

If you ask enough people about their shoe size, you will find that your graphed data is shaped like a bell curve and can be described as normally distributed. (credit: Ömer Ünlϋ)

The normal, a continuous distribution, is the most important of all the distributions. It is widely used and even more widely abused. Its graph is bell-shaped. You see the bell curve in almost all disciplines. Some of these include psychology, business, economics, the sciences, nursing, and, of course, mathematics. Some of your instructors may use the normal distribution to help determine your grade. Most IQ scores are normally distributed. Often real-estate prices fit a normal distribution. The normal distribution is extremely important, but it cannot be applied to everything in the real world.

Many variables, such as weight, shoe sizes, foot lengths, and other human physical characteristics, exhibit these normal distribution properties. The symmetry indicates that the variable is just as likely to take a value a certain distance below its mean as it is to take a value that same distance above its mean. The bell shape indicates that values closer to the mean are more likely, and it becomes increasingly unlikely to take values far from the mean in either direction.

We use a mathematical model with a smooth bell-shaped curve to describe these bell-shaped data distributions. These models are called normal curves or normal distributions. They were first called “normal” because the pattern occurred in many different types of common measurements.

In this section, you will study the normal distribution, the standard normal distribution, and applications associated with them.

Because normal curves are mathematical models, we use Greek letters to represent the mean and standard deviation of a normal curve. The mean of a normal distribution locates its center. We use the Greek letter μ (pronounced “mu”) to represent the mean. We use the Greek letter σ (pronounced “sigma”) to represent the standard deviation of a normal distribution. The standard deviation determines the spread of the distribution. In fact, the shape of a normal curve is completely determined by specifying its standard deviation.

The normal distribution has two parameters (two numerical descriptive measures), the mean (μ) and the standard deviation (σ). If X is a quantity to be measured that has a normal distribution with mean (μ) and standard deviation (σ), we designate this by writing X ~ N(μ, σ).

The probability density function is a rather complicated function. Do not memorize it. It is not necessary.
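For reference only, the density of a normal distribution with mean μ and standard deviation σ is conventionally written as shown below. The name f for the function is an assumption made here, since the text above does not name it.

```latex
f(x) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}
```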

The cumulative distribution function is P(X < x). It is calculated either by a calculator or a computer, or it is looked up in a table. Technology has made the tables virtually obsolete. For that reason, as well as the fact that there are various table formats, we are not including table instructions.
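As one illustration of the computer route, here is a minimal sketch using the NormalDist class from Python's standard-library statistics module. The values μ = 10 and σ = 2 and the cut-off x = 12 are chosen purely for illustration; any other technology that evaluates the cumulative distribution function would serve equally well.

```python
from statistics import NormalDist

# Illustrative values only: a normal distribution with mu = 10 and sigma = 2
X = NormalDist(mu=10, sigma=2)

# Cumulative distribution function: P(X < x)
p_below_12 = X.cdf(12)    # area to the left of 12
p_below_mean = X.cdf(10)  # area to the left of the mean

print(f"P(X < 12) = {p_below_12:.4f}")
print(f"P(X < 10) = {p_below_mean:.4f}")
```

The second value comes out as one half, which matches the idea that half of the area lies below the mean.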

The curve is symmetrical about a vertical line drawn through the mean, μ. In theory, the mean is the same as the median, because the graph is symmetric about μ. As the notation indicates, the normal distribution depends only on the mean and the standard deviation. Since the area under the curve must equal one, a change in the standard deviation, σ, causes a change in the shape of the curve; the curve becomes fatter or skinnier depending on σ. A change in μ causes the graph to shift to the left or right. This means there are an infinite number of normal probability distributions. One of special interest is called the standard normal distribution.
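The symmetry, shift, and spread claims in the paragraph above can be checked numerically. The sketch below assumes Python's standard-library statistics.NormalDist; the parameter values and variable names are arbitrary illustrations, not values taken from the text.

```python
from statistics import NormalDist

base = NormalDist(mu=10, sigma=2)     # illustrative parameters
shifted = NormalDist(mu=15, sigma=2)  # same sigma, larger mu
wider = NormalDist(mu=10, sigma=4)    # same mu, larger sigma

# Symmetry: the curve has equal height 3 units below and 3 units above the mean
symmetric = abs(base.pdf(10 - 3) - base.pdf(10 + 3)) < 1e-12

# Mean equals median: half of the area lies to the left of mu
half_below_mean = base.cdf(10)

# Changing mu only slides the curve: the height at the new center is unchanged
same_peak = abs(base.pdf(10) - shifted.pdf(15)) < 1e-12

# Changing sigma changes the shape: a larger sigma leaves less area near the center
area_near_center_base = base.cdf(12) - base.cdf(8)
area_near_center_wider = wider.cdf(12) - wider.cdf(8)
more_spread = area_near_center_wider < area_near_center_base

print(symmetric, half_below_mean, same_peak, more_spread)
```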

Observations of Normal Distributions

There are many normal curves. Even though all normal curves have the same bell shape, they vary in their center and spread.

As we will see, if two normal distributions have the same standard deviation, then the shapes of their normal curves will be identical.

Following are some observations we can make as we look at the figure above:

The black and the red normal curves have means or centers at μ = 10. However, the red curve is more spread out and thus has a larger standard deviation. Notice that the red normal curve is also shorter. This makes sense because these curves are probability density curves, so the area under each curve has to be 1.

The black and the green normal curves have the same standard deviation or spread.
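The remark that a more spread-out curve must also be shorter can be checked with a rough numerical integration. This is only a sketch: the standard deviations used below (2 and 4) are assumed for illustration, since the values behind the figure are not stated in the text, and area_under_curve is a hypothetical helper written for this example.

```python
from statistics import NormalDist

def area_under_curve(dist, lo, hi, steps=20_000):
    """Midpoint-rule approximation of the area under dist.pdf on [lo, hi]."""
    width = (hi - lo) / steps
    return sum(dist.pdf(lo + (i + 0.5) * width) for i in range(steps)) * width

narrow = NormalDist(mu=10, sigma=2)  # assumed sigma
wide = NormalDist(mu=10, sigma=4)    # assumed larger sigma

for name, dist in [("narrow", narrow), ("wide", wide)]:
    # Integrate over mu +/- 8 sigma, which captures essentially all of the area
    lo = dist.mean - 8 * dist.stdev
    hi = dist.mean + 8 * dist.stdev
    area = area_under_curve(dist, lo, hi)
    print(f"{name}: peak height = {dist.pdf(dist.mean):.4f}, area = {area:.4f}")
```

Both areas come out essentially equal to 1, while the curve with the larger standard deviation has the lower peak.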

The normal curve has a central role in statistical inference, as we’ll see in Linking Probability to Statistical Inference. Understanding the normal distribution is an important step in the direction of our overall goal, which is to relate sample means or proportions to population means or proportions. The goal of this section is to help you better understand normal random variables and their distributions.

All normal curves share a basic geometry. While the mean locates the center of a normal curve, it is the standard deviation that is in control of the geometry. To see how, let’s examine a few pictures of normal curves to see what they reveal.

___________________________________

EXAMPLE

Let's start with a random variable X that has a normal distribution with a mean of 10 and a standard deviation of 2. Let's practice our new notation. Here we would write μ = 10 and σ = 2.

The normal curve for X is shown below.

As expected, the mean μ = 10 is located at the center of the normal curve. The other two arrows point to values 1 standard deviation on each side of the mean.

The point 1 standard deviation less than the mean is represented by μ - σ, which is 8, as shown.

The point 1 standard deviation more than the mean is represented by the point μ + σ = 12, as shown.

You will notice we have indicated that the area of the green region is 0.68. So we can say that the probability of X being between 8 and 12 is about 0.68.

Or, using our probability notation, we could write

P(8 < X < 12) ≈ 0.68
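The area between 8 and 12 can be reproduced as a difference of two cumulative probabilities. A minimal sketch, assuming Python's standard-library statistics.NormalDist and the example's μ = 10 and σ = 2:

```python
from statistics import NormalDist

X = NormalDist(mu=10, sigma=2)  # mu and sigma from the example

# P(8 < X < 12) = (area to the left of 12) - (area to the left of 8)
p_between = X.cdf(12) - X.cdf(8)

print(f"P(8 < X < 12) = {p_between:.4f}")
```

The result rounds to 0.68, the area marked on the curve.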

Now here is an interesting fact. If we took any normal distribution and drew a similar picture, the probability that a value falls within 1 standard deviation of the mean is always the same. Here are several ways to express this idea.

For any normal curve, the central area within 1 standard deviation of the mean is approximately 0.68.

Roughly 68% of the time we expect X to have a value within 1 standard deviation of the mean.

P(μ - σ < X < μ + σ) ≈ 0.68.

This is a big deal for statistics. It is one of the things that makes normal curves special. In general, probability density curves with other shapes do not have this special property.

Let’s put this idea in context. If the weight of babies at birth follows a normal distribution with a mean of 3500 grams and a standard deviation of 600 grams, then we can conclude that most babies (that is, about 68%) will weigh somewhere between 2900 grams and 4100 grams.
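To see that the 68% figure does not depend on the particular mean and standard deviation, the same calculation can be repeated for several normal distributions. The sketch below assumes Python's standard-library statistics.NormalDist; the first two parameter pairs come from this section, and the third is an arbitrary made-up pair added for contrast.

```python
from statistics import NormalDist

# (mu, sigma) pairs: two from this section plus one arbitrary choice
examples = {
    "X from the example": (10, 2),
    "birth weight (grams)": (3500, 600),
    "arbitrary": (-4.2, 0.37),
}

within_one_sd = {}
for label, (mu, sigma) in examples.items():
    dist = NormalDist(mu, sigma)
    # Probability of landing between mu - sigma and mu + sigma
    within_one_sd[label] = dist.cdf(mu + sigma) - dist.cdf(mu - sigma)
    print(f"{label}: {mu - sigma:g} to {mu + sigma:g} -> {within_one_sd[label]:.4f}")
```

Every row reports the same probability, which rounds to 0.68.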

___________________________________
