Pearson's r
After looking at the scatter plot and seeing that a linear relationship between two variables seems to exist, what should you do? One option is to use the correlation coefficient to obtain a single number that summarizes the direction and strength of the linear relationship between x and y.
The correlation coefficient, r, developed by Karl Pearson in the early 1900s, is a numerical measure of the strength and direction of the linear association between the independent variable x and the dependent variable y.
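The correlation coefficient is calculated as

$$r = \frac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}{\sqrt{\left[n\sum x^{2} - \left(\sum x\right)^{2}\right]\left[n\sum y^{2} - \left(\sum y\right)^{2}\right]}}$$

where n = the number of data points.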
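As a minimal sketch of the calculation, the Python function below evaluates r from running sums, assuming the standard computational form r = (nΣxy − ΣxΣy) / √([nΣx² − (Σx)²][nΣy² − (Σy)²]). The function name `pearson_r` and the `hours`/`scores` data are made up for illustration and do not come from the original text.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r from sums:
    (n*Sxy - Sx*Sy) / sqrt((n*Sxx - Sx^2) * (n*Syy - Sy^2))."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    if denominator == 0:
        # All x values (or all y values) are identical, so r is undefined.
        raise ValueError("r is undefined when x or y is constant")
    return numerator / denominator

# Hypothetical data: hours studied (x) and quiz score (y).
hours = [1, 2, 3, 4, 5]
scores = [2, 4, 5, 4, 5]
r = pearson_r(hours, scores)
print(round(r, 4))  # 0.7746
```

For these five points the sums are Σx = 15, Σy = 20, Σxy = 66, Σx² = 55, and Σy² = 86, so r = 30 / √1500, a fairly strong positive linear association.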
If you suspect a linear relationship between x and y, then r can measure how strong the linear relationship is.
What the VALUE of r tells us: The value of r is always between –1 and +1: –1 ≤ r ≤ 1. The magnitude of r indicates the strength of the linear relationship between x and y. Values of r close to –1 or to +1 indicate a stronger linear relationship between x and y. If r = 0, there is no linear relationship between x and y (no linear correlation), although a nonlinear relationship may still exist. If r = 1, there is perfect positive correlation. If r = –1, there is perfect negative correlation. When we have a perfect positive or negative correlation, all of the data points lie on a straight line. With real-world data, this rarely happens.
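The boundary cases can be checked numerically. The sketch below uses made-up points: one set lying exactly on a rising line, one on a falling line, and one on a symmetric parabola, where y depends entirely on x but not linearly. The helper `pearson_r` is an illustrative name, written here in the equivalent deviations-from-the-mean form.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r using deviations from the sample means."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

xs = [-2, -1, 0, 1, 2]

r_up = pearson_r(xs, [2 * x + 1 for x in xs])      # points on a rising line
r_down = pearson_r(xs, [-3 * x + 4 for x in xs])   # points on a falling line
r_curve = pearson_r(xs, [x * x for x in xs])       # points on a symmetric parabola

print(r_up, r_down, r_curve)
```

Here `r_up` comes out as 1, `r_down` as –1, and `r_curve` as 0. The last case shows that r = 0 rules out a linear relationship only; it does not mean x and y are unrelated.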
What the SIGN of r tells us: A positive value of r means that when x increases, y tends to increase and when x decreases, y tends to decrease (positive correlation). A negative value of r means that when x increases, y tends to decrease and when x decreases, y tends to increase (negative correlation). The sign of r is the same as the sign of the slope, b, of the best-fit line.
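The agreement between the sign of r and the sign of the slope follows from the fact that both share the same numerator, Σ(x − x̄)(y − ȳ), while their denominators are positive. The sketch below illustrates this with hypothetical data (outdoor temperature against heating cost); the function name `slope_and_r` and the numbers are assumptions made for the example, and the slope is taken to be the usual least-squares slope.

```python
import math

def slope_and_r(xs, ys):
    """Return (b, r): the least-squares slope and Pearson's r."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    b = sxy / sxx                   # slope of the best-fit line
    r = sxy / math.sqrt(sxx * syy)  # correlation coefficient
    return b, r

# Hypothetical data: as temperature rises, heating cost falls.
temps = [10, 15, 20, 25, 30]
heating = [50, 42, 30, 22, 11]

b, r = slope_and_r(temps, heating)
print(b, r)  # both values are negative
```

For this data the slope is –1.96 and r is about –0.998: a steeply falling best-fit line and a strong negative correlation.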
Note
It is important to remember that correlation does not indicate causation. Even if two variables are strongly correlated, we cannot conclude that x causes y or that y causes x. We say “correlation does not imply causation.” To establish causation, one generally has to conduct a controlled experiment.
LICENSES AND ATTRIBUTIONS
CC LICENSED CONTENT, SHARED PREVIOUSLY
OpenStax, Statistics, The Regression Equation. Provided by: OpenStax. Located at: http://cnx.org/contents/30189442-6998-4686-ac05-ed152b91b9de@17.41:82/Introductory_Statistics. License: CC BY: Attribution
Introductory Statistics. Authored by: Barbara Illowsky, Susan Dean. Provided by: OpenStax. Located at: http://cnx.org/contents/30189442-6998-4686-ac05-ed152b91b9de@17.44. License: CC BY: Attribution. License Terms: Download for free at http://cnx.org/contents/30189442-6998-4686-ac05-ed152b91b9de@17.44