Correlation
What is correlation?
According to the University of Texas at Austin (2016), correlations are associations between variables: if an association or relationship exists between two variables, the average value of one variable changes as the value of the other variable changes.
Correlation is used to measure and predict the relationships between variables. The strength and direction of the relationship between two variables is summarised by a correlation coefficient, which can vary from -1.00 to +1.00. With sample data, it is highly unlikely to get a correlation of exactly +1 or -1.
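As a minimal sketch of how a correlation coefficient is obtained from sample data, the Python below computes Pearson's r by hand. The function name pearson_r and the data (hours studied and exam scores for six students) are hypothetical and used only for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of the same length (at least 2 points)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations, and sums of squared deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical sample: hours studied and exam score for six students.
hours = [1, 2, 3, 4, 5, 6]
scores = [52, 55, 61, 60, 68, 71]

r = pearson_r(hours, scores)
print(round(r, 3))  # strong positive value, but below +1 because the points are not exactly on a line
```

The result lies between -1 and +1, and for real sample data like this it falls short of exactly +1.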
Why does correlation matter?
Correlation is a measure of association. It is a measure of the degree of relatedness of variables (Black et al., 2015).
Siegel (2012) introduces three basic goals in studying correlation:
1-Describing and understanding the relationship-this is the most general goal. It provides background information for understanding how the world works, because the world is filled with relationships, such as those between government intervention and the state of the economy, or between the quality of products and their costs. In addition, Siegel (2012) advises that knowing which factors interact most strongly with each other helps us gain the perspective necessary for long-range planning and for designing other strategies.
2-Forecasting and predicting a new observation-it is important to be able to use information about one measurement to predict another and to make decisions about future actions.
3-Adjusting and controlling a process-when intervening in a process, understanding the direct relationship between intervention and result helps us make the best possible adjustments to the outcomes.
References
Black, K., Asafu-Adjaye, J., Khan, N., King, G., Perea, N., Sherwood, C., Verma, R., & Wasimi, S. (2013). Australian business statistics. John Wiley & Sons.
Siegel, A.F. (2012). Practical Business Statistics (6th Edition). Oxford, UK: Elsevier Inc.
Several sources of information for understanding correlation are collected below:
References - Sources & Author/s
About Correlation
-Pearson's r is used as a measure of the correlation between two quantitative variables.
Properties of Pearson's r
1-Correlation is unit free; the x and y variables do NOT need to be on the same scale (e.g., it is possible to compute the correlation between height in centimeters and weight in pounds)
2-The value of r lies in the range −1 ≤ r ≤ +1
3-For a positive association r > 0; for a negative association r < 0; if there is no linear relationship r = 0
4-Pearson's r measures the linear association between two quantitative variables; if the relationship between the two variables is not linear, Pearson's r is not an appropriate measure
5-The closer r is to 0 the weaker the relationship, and the closer to +1 or −1 the stronger the relationship (e.g., r = −.88 is a stronger relationship than r = +.60); the sign of the correlation provides direction only
6-Correlation can be affected by outliers
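Two of the properties above, that r is unit free and that r can be affected by outliers, can be checked with a short Python sketch. The height and weight figures are made up for illustration, and pearson_r is a hypothetical helper, not a library function.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length samples."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up data: height in centimeters and weight in pounds.
height_cm = [150, 160, 165, 172, 180, 188]
weight_lb = [110, 128, 140, 150, 172, 190]
r_original = pearson_r(height_cm, weight_lb)

# Property 1: changing the units (cm to inches, lb to kg) leaves r unchanged.
height_in = [h / 2.54 for h in height_cm]
weight_kg = [w * 0.45359237 for w in weight_lb]
r_converted = pearson_r(height_in, weight_kg)

# Property 6: one unusual point (very tall, very light) pulls r down sharply.
r_with_outlier = pearson_r(height_cm + [190], weight_lb + [100])

print(round(r_original, 3), round(r_converted, 3), round(r_with_outlier, 3))
```

The first two printed values agree, while the third is much smaller, even though only one observation was added.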
The following table may serve as a guideline when evaluating correlation coefficients
Correlations are associations between variables.
Variables are things we measure that can differ from one observation to the next.
When an association exists between two variables, it means that the average value of one variable changes as we change the value of the other variable.
A correlation is the simplest type of association - linear.
When a correlation is weak (C), it means that the average value of one variable changes only slightly (only occasionally) in response to changes in the other variable.
In some cases, the correlation may be positive (A, C), or it may be negative (B).
If the points in such a graph pretty much fall inside a circle or horizontal ellipse such that the "trend-line" through them is horizontal, then a correlation does not exist (the same as a zero or no correlation).
When either or both variables cannot be assigned numbers, a correlation may still exist but we no longer apply the terms positive and negative (in D, depending on the nature of the variables).
Since a correlation is an association among variables, a correlation cannot exist with just one variable; this is not the same as a zero correlation or no correlation.
A graph of points with only one variable would have all points on a perfectly horizontal line or a perfectly vertical line (with no scatter around the line).
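The difference between "no correlation can be computed" and "zero correlation" can be sketched in Python. In the sketch below, which uses hypothetical names and made-up data, a variable that never changes has no spread, so the formula for r would divide by zero and the function returns None instead. The second example shows a genuine zero correlation between two variables that both vary.

```python
import math

def pearson_r_or_none(xs, ys):
    """Return Pearson's r, or None when either variable has no variation."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None  # one "variable" is constant: r is undefined, not zero
    return sxy / math.sqrt(sxx * syy)

# All points on a perfectly horizontal line: only one variable actually varies.
undefined_case = pearson_r_or_none([1, 2, 3, 4], [5, 5, 5, 5])
print(undefined_case)  # None

# Both variables vary, but there is no linear trend: a true zero correlation.
zero_case = pearson_r_or_none([-2, -1, 0, 1, 2], [4, 1, 0, 1, 4])
print(zero_case)  # 0.0
```

Note that in the second example y depends on x (y is x squared), yet r is 0 because the relationship is not linear.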
The correlation between the two variables:
The basis of phenomena co-occurring, of two variables, generates a result of independent variation. Theories and ideas about phenomena, regarding the interrelatedness between these two variables, are usually based on assumed correlations.
Correlation addresses the relationship between two different factors or variables.
The statistic is called a correlation coefficient. A correlation coefficient can be calculated when there are two (or more) sets of scores for the same individuals or matched groups.
A correlation coefficient describes direction [positive or negative] and degree [strength] of relationship between two variables.
The higher the correlation coefficient, the stronger the relationship.
The coefficient is also used to obtain a p value indicating whether the degree of relationship is greater than expected by chance.
For correlation, the null hypothesis is that the correlation coefficient = 0.
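One way to see what "greater than expected by chance" means is a permutation test, sketched below in Python. This is an illustrative technique chosen here because it needs only the standard library; it is not necessarily the method the sources above use to obtain a p value. The function names, the data, the number of shuffles and the seed are all assumptions.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length samples."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def permutation_p_value(xs, ys, n_permutations=2000, seed=0):
    """Two-sided p value for the null hypothesis that the correlation is 0.

    Shuffling ys breaks any real pairing with xs, so the shuffled r values
    show what sizes of r arise by chance alone.
    """
    rng = random.Random(seed)
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if abs(pearson_r(xs, shuffled)) >= observed:
            at_least_as_extreme += 1
    # Add 1 to numerator and denominator so the estimate is never exactly 0.
    return (at_least_as_extreme + 1) / (n_permutations + 1)

# Made-up data with a very strong linear relationship.
xs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
ys = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1, 18.0, 20.2]

p = permutation_p_value(xs, ys)
print(p)
```

A small p value means that shuffled (chance) pairings almost never produce a correlation as strong as the observed one, so the null hypothesis of zero correlation is rejected.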
Interpreting correlation coefficients
Correlation can be positive or negative, depending upon the direction of the relationship.
If both factors increase and decrease together, the relationship is positive.
If one factor increases as the other decreases, then the relationship is negative.
It is still a predictable relationship, but inverse, changing in opposite rather than same direction.
Plotting a relationship on a graph [called a scatterplot] provides a picture of the relationship between two factors [variables].
Scatterplot (also known as scattergram or scatter chart)
A correlation coefficient can vary from -1.00 to +1.00.
The closer the coefficient is to zero (from either + or -), the less strong the relationship.
The sign indicates the direction of the relationship: plus (+) = positive, minus (-) = negative.
The common usage of the word correlation refers to a relationship between two or more objects (ideas, variables...).
In statistics, the word correlation refers to the relationship between two variables.
How can you tell by inspection the type of correlation?
If the graph of the variables resembles a line with positive slope, then there is a positive correlation (as x increases, y increases).
If the slope of the line is negative, then there is a negative correlation (as x increases y decreases).
An important aspect of correlation is how strong it is.
The strength of a correlation is measured by the correlation coefficient r.
Another name for r is the Pearson product moment correlation coefficient in honor of Karl Pearson who developed it about 1900.
There are at least three different formulae in common use to calculate this number, and these formulae represent somewhat different approaches to the problem.
1-the raw score formula
2-the deviation score formula
3-the covariance formula
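The three formulae named above can be sketched in Python to show that they give the same number. The versions below are the standard textbook forms as usually stated; the function names and the data are hypothetical.

```python
import math

def r_raw_score(xs, ys):
    """Raw score formula: uses sums of x, y, xy, x^2 and y^2 directly."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

def r_deviation_score(xs, ys):
    """Deviation score formula: uses deviations from the two means."""
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    numerator = sum(a * b for a, b in zip(dx, dy))
    return numerator / math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))

def r_covariance(xs, ys):
    """Covariance formula: sample covariance over the product of sample SDs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    return cov / (sd_x * sd_y)

# Made-up data.
xs = [2, 4, 6, 8, 10]
ys = [1, 3, 7, 9, 15]

results = [f(xs, ys) for f in (r_raw_score, r_deviation_score, r_covariance)]
print([round(r, 6) for r in results])  # three equal values
```

The raw score form is convenient for hand calculation from column totals, while the deviation and covariance forms make it clearer what r measures.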
The closer r is to +1, the stronger the positive correlation is.
The closer r is to -1, the stronger the negative correlation is.
If |r| = 1 exactly, the two variables are perfectly correlated.
Correlation is the statistical concept which describes the amount and type of relationship between two variables.
Do the two variables vary together, or are they unrelated (independent)?
Correlation Coefficient
The correlation coefficient is a statistic (like the mean or the variance). It has a complicated formula. You enter the data in that formula and come out with a single number, which is called the correlation coefficient. There are many kinds of correlation coefficients.
Correlation
When two sets of data are strongly linked together we say they have a High Correlation.
The word Correlation is made of Co- [meaning "together"], and Relation
Correlation can have a value:
1 is a perfect positive correlation
0 is no correlation (the values don't seem linked at all)
-1 is a perfect negative correlation
The value shows how good the correlation is (not how steep the line is), and if it is positive or negative.
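The point that the value reflects how closely the data follow a line, not how steep the line is, can be sketched in Python. The data and the helper name pearson_r are made up for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length samples."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

xs = [1, 2, 3, 4, 5]

# A very shallow line and a very steep line, both followed exactly.
r_shallow = pearson_r(xs, [0.1 * x for x in xs])
r_steep = pearson_r(xs, [10 * x for x in xs])

# A steep trend with scatter around it.
r_steep_noisy = pearson_r(xs, [10, 25, 22, 48, 41])

print(round(r_shallow, 3), round(r_steep, 3), round(r_steep_noisy, 3))
```

Both exact lines give a perfect positive correlation regardless of slope, while the scattered data give a lower value even though the trend is steep.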
https://www.mathsisfun.com/data/correlation.html
Correlation is a statistical technique that can show whether and how strongly pairs of variables are related.
Techniques in Determining Correlation
The most common type is called the Pearson or product-moment correlation.
The module also includes a variation on this type called partial correlation. The latter is useful when you want to look at the relationship between two variables while removing the effect of one or two other variables.
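As a minimal sketch of partial correlation, the Python below uses the standard first-order formula, which removes the effect of a single third variable z from the correlation between x and y. How the module mentioned above computes it is not stated, so this is an assumption, and the names and data are hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length samples."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def partial_r(xs, ys, zs):
    """Correlation between x and y after removing the linear effect of z."""
    r_xy = pearson_r(xs, ys)
    r_xz = pearson_r(xs, zs)
    r_yz = pearson_r(ys, zs)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Made-up data in which z drives both x and y.
z = [1, 2, 3, 4, 5, 6, 7, 8]
x = [1.5, 1.5, 3.5, 3.5, 5.5, 5.5, 7.5, 7.5]      # roughly z plus a small wobble
y = [2.5, 4.5, 5.5, 7.5, 10.5, 12.5, 13.5, 15.5]  # roughly 2*z plus a small wobble

r_xy = pearson_r(x, y)
r_xy_given_z = partial_r(x, y, z)
print(round(r_xy, 3), round(r_xy_given_z, 3))
```

Here x and y look strongly related, but once z is accounted for the remaining relationship is weak, which suggests the original correlation came mostly from their shared dependence on z.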
Like all statistical techniques, correlation is only appropriate for certain kinds of data.
Correlation works for quantifiable data in which numbers are meaningful, usually quantities of some sort. It cannot be used for purely categorical data, such as gender, brands purchased, or favorite colour.
Correlation Coefficient
The main result of a correlation is called the correlation coefficient (or "r"). It ranges from -1.0 to +1.0. The closer r is to +1 or -1, the more closely the two variables are related.
Correlation is a statistical measurement of the relationship between two variables.
Possible correlations range from +1 to –1.
A zero correlation indicates that there is no relationship between the variables.
A correlation of –1 indicates a perfect negative correlation, meaning that as one variable goes up, the other goes down.
A correlation of +1 indicates a perfect positive correlation, meaning that both variables move in the same direction together.
Kahneman, D. (2012). Thinking, Fast and Slow. Great Britain: Penguin Random House UK.
Correlation is a statistical measure (expressed as a number) that describes the size and direction of a relationship between two or more variables. A correlation between variables, however, does not automatically mean that the change in one variable is the cause of the change in the values of the other variable.
Useful links
Scatter diagrams - Types of correlation
Video watching activities
1.1-Correlation - 14'.25
Posted by MathNStats
1.2-Correlation - 9'.49
Posted by MathNStats
2-Correlation and causality | Statistical studies | Probability and Statistics | Khan Academy - 10'44
3-How to calculate a correlation in Excel - by Quantitative Specialists - 5'.57"