Pearson's r
After examining a scatter plot and seeing that a linear relationship between two variables appears to exist, what should you do next? One option is to compute the correlation coefficient, a single number that summarizes the direction and strength of the linear relationship between x and y.
The correlation coefficient, r, developed by Karl Pearson in the early 1900s, is a numerical measure of the strength and direction of the linear association between the independent variable x and the dependent variable y.
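The correlation coefficient is calculated as
\[ r = \frac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}{\sqrt{\left[n\sum x^2 - \left(\sum x\right)^2\right]\left[n\sum y^2 - \left(\sum y\right)^2\right]}} \]
where n = the number of data points.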
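As a minimal sketch, the calculation can be written out in Python using the standard computational (sums) form of Pearson's r: n times the sum of the xy products minus the product of the two sums, divided by the square root of the product of the matching expressions for x and y. The function name `pearson_r` and the `hours`/`scores` data are made up for illustration and are not part of the original text.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r from raw sums (hypothetical helper, for illustration only)."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)  # number of data points
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)

    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    if denominator == 0:
        # Happens when every x (or every y) is identical.
        raise ValueError("r is undefined when x or y has no variation")
    return numerator / denominator

# Made-up example data: five (x, y) pairs.
hours = [1, 2, 3, 4, 5]
scores = [2, 4, 5, 4, 5]

r = pearson_r(hours, scores)
print(round(r, 4))  # 0.7746
```

For this small data set the sums are Σx = 15, Σy = 20, Σxy = 66, Σx² = 55 and Σy² = 86, so r = 30 / √1500 ≈ 0.77, a fairly strong positive linear association.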
If you suspect a linear relationship between x and y, then r can measure how strong the linear relationship is.
What the VALUE of r tells us: The value of r is always between –1 and +1: –1 ≤ r ≤ 1. The magnitude of r indicates the strength of the linear relationship between x and y: values of r close to –1 or to +1 indicate a stronger linear relationship, and values close to 0 indicate a weaker one. If r = 0, there is no linear relationship between x and y (no linear correlation). If r = 1, there is perfect positive correlation. If r = –1, there is perfect negative correlation. When we have a perfect positive or negative correlation, all of the data points lie on a straight line. With real-world data, this will generally not happen.
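The boundary cases can be checked numerically. The sketch below uses a small hypothetical helper, `pearson_r`, written in the equivalent deviation-from-the-mean form, and three made-up data sets: points exactly on a rising line, points exactly on a falling line, and points on the symmetric curve y = x². The last case is an added illustration: r comes out as 0 even though y depends entirely on x, because r only measures linear association.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r via deviations from the means (illustrative helper)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

xs = [-2, -1, 0, 1, 2]

r_rising = pearson_r(xs, [2 * x + 1 for x in xs])    # all points on y = 2x + 1
r_falling = pearson_r(xs, [-3 * x + 4 for x in xs])  # all points on y = -3x + 4
r_curve = pearson_r(xs, [x * x for x in xs])         # all points on y = x^2

print(r_rising, r_falling, r_curve)
```

The first two values are +1 and –1 because every point lies on a straight line; the third is 0 because the rising and falling halves of the parabola cancel in the sum of cross-products.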
What the SIGN of r tells us: A positive value of r means that when x increases, y tends to increase and when x decreases, y tends to decrease (positive correlation). A negative value of r means that when x increases, y tends to decrease and when x decreases, y tends to increase (negative correlation). The sign of r is the same as the sign of the slope, b, of the best-fit line.
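The link between the sign of r and the sign of the slope can be seen by computing both from the same quantities. This sketch assumes the usual least-squares slope b = Sxy / Sxx, which is not stated in this section, alongside r = Sxy / √(Sxx · Syy). Since Sxx and the square root are positive, both share the sign of Sxy. The helper name `slope_and_r` and the data are made up for illustration.

```python
import math

def slope_and_r(xs, ys):
    """Return (b, r): least-squares slope and Pearson's r (illustrative helper)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    b = sxy / sxx                    # slope of the best-fit line (assumed formula)
    r = sxy / math.sqrt(sxx * syy)   # correlation coefficient
    return b, r

xs = [1, 2, 3, 4, 5]

# y tends to rise with x: slope and r are both positive.
b_up, r_up = slope_and_r(xs, [2, 4, 5, 4, 5])

# y tends to fall as x rises: slope and r are both negative.
b_down, r_down = slope_and_r(xs, [9, 7, 8, 4, 3])

print(b_up > 0, r_up > 0)      # True True
print(b_down < 0, r_down < 0)  # True True
```

Note that the two numbers agree only in sign: the slope depends on the units of x and y, while r is unitless and always stays between –1 and +1.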
Note
It is important to remember that correlation does not indicate causation. Even if two variables are strongly correlated, we cannot conclude from the correlation alone that x causes y or that y causes x. We say “correlation does not imply causation.” To establish causation, one has to conduct an experiment.
References:
LICENSES AND ATTRIBUTIONS
CC LICENSED CONTENT, SHARED PREVIOUSLY
OpenStax, Statistics, The Regression Equation. Provided by: OpenStax. Located at: http://cnx.org/contents/30189442-6998-4686-ac05-ed152b91b9de@17.41:82/Introductory_Statistics. License: CC BY: Attribution
Introductory Statistics. Authored by: Barbara Illowsky, Susan Dean. Provided by: OpenStax. Located at: http://cnx.org/contents/30189442-6998-4686-ac05-ed152b91b9de@17.44. License: CC BY: Attribution. License Terms: Download for free at http://cnx.org/contents/30189442-6998-4686-ac05-ed152b91b9de@17.44