RELATIONSHIP/ASSOCIATION/CORRELATION
When talking about a relationship we look at three ideas (unless there is no relationship)
We need to talk about whether there is a positive relationship (increasing), a negative relationship (decreasing) or whether the data is staying the same. Note: in some cases data may both increase and decrease! We also need to explain what this means in context.
If a relationship is...
positive, this means that as the x-variable increases the y-variable tends to increase as well.
negative, this means that as the x-variable increases the y-variable tends to decrease.
absent (no relationship), this means that the y-variable does not appear to depend on the x-variable at all.
[Example scatter graphs, labelled: positive; negative; positive going to negative; positive; no relationship]
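As a rough check on the direction you see by eye, the sign of the slope of a fitted straight line can be worked out. The sketch below is only an illustration: the function names and the data are made up, and a slope of zero does not always mean "no relationship" (the last example goes up and then down, so the line is flat even though there is a clear pattern). Always look at the graph first.

```python
def least_squares_slope(xs, ys):
    """Slope of the least-squares straight line of y on x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def describe_direction(xs, ys, tolerance=1e-9):
    """Label the overall direction using the sign of the fitted slope."""
    slope = least_squares_slope(xs, ys)
    if slope > tolerance:
        return "positive"
    if slope < -tolerance:
        return "negative"
    return "no overall trend"


# Made-up data, for illustration only
x = [1, 2, 3, 4, 5]
print(describe_direction(x, [52, 55, 61, 64, 70]))  # positive
print(describe_direction(x, [10, 8, 7, 4, 2]))      # negative
print(describe_direction(x, [1, 4, 5, 4, 1]))       # no overall trend (up, then down)
```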
A relationship can be linear (straight line) or non-linear (curve).
If a relationship...
goes up or down by a roughly constant amount for each step across, it is linear.
goes up or down by an ever increasing or decreasing amount, it is non-linear.
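The "constant amount for each step across" idea can be checked by looking at the change in y between neighbouring points. This is a minimal sketch with made-up values, assuming the x-values are evenly spaced; real data is noisy, so expect the changes to be roughly constant rather than exactly equal.

```python
def step_changes(ys):
    """Change in y from each point to the next (x assumed evenly spaced)."""
    return [later - earlier for earlier, later in zip(ys, ys[1:])]


# Made-up y-values at x = 1, 2, 3, 4, 5
linear_ys = [3, 5, 7, 9, 11]
curved_ys = [1, 2, 4, 8, 16]

print(step_changes(linear_ys))  # [2, 2, 2, 2]  constant change, so linear
print(step_changes(curved_ys))  # [1, 2, 4, 8]  growing change, so non-linear
```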
A relationship can be weak, moderate or strong. Some people use a mixture, such as "moderate to strong".
Please note: you cannot use the correlation coefficient to explain the strength of a relationship. You must use your eyes and write an explanation based on what the graph shows. Once you have explained your decision, the correlation coefficient can be used to justify it, but only for a linear model.
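For reference, this is how the correlation coefficient (Pearson's r) can be calculated once you have made your visual judgement. It is a sketch with made-up data and a hypothetical function name; your graphing software will normally report r for you.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Made-up data, for illustration only
x = [1, 2, 3, 4, 5]
print(round(pearson_r(x, [2, 4, 6, 8, 10]), 2))  # 1.0  points lie exactly on a line
print(round(pearson_r(x, [2, 1, 4, 3, 5]), 2))   # 0.8  points scattered around a line
```

The value of r only measures how closely the points follow a straight line, which is why it should back up a written description rather than replace it.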
It's important to look out for anything unusual. At times you may have:
two or more distinct groups in the data.
data fanning out
extreme values
It is important to comment on these. Sometimes they can lead to interesting discussions for excellence around improvements. Note: sometimes a weak relationship or data fanning out happens because there is more than one group in the data, or because another variable is responsible. Separating the groups, or removing the effect of that variable, could be a good way to investigate things in more depth.
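The effect of hidden groups can be seen in a small made-up example. In the sketch below, each group on its own follows a perfect straight line, but when the two groups are combined the relationship looks weak. The data and names are invented purely for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Two made-up groups measured over the same x-values
group_a_x, group_a_y = [1, 2, 3, 4], [2, 3, 4, 5]
group_b_x, group_b_y = [1, 2, 3, 4], [12, 13, 14, 15]

# Treating all the points as one data set
all_x = group_a_x + group_b_x
all_y = group_a_y + group_b_y

print(round(pearson_r(group_a_x, group_a_y), 2))  # 1.0   within group A
print(round(pearson_r(group_b_x, group_b_y), 2))  # 1.0   within group B
print(round(pearson_r(all_x, all_y), 2))          # 0.22  groups combined
```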
The scatter graph shows that there is a positive relationship between a car's engine size and its horsepower. As the engine size increases the horsepower tends to increase as well.
There appears to be a moderate positive linear relationship, as when engine size increases the horsepower seems to increase by a fairly constant amount: approximately 51 hp per litre increase. The points are fairly consistently scattered a little away from the trend line for engine sizes between 1 L and 3 L, with more variation for larger engine sizes. The correlation coefficient of 0.79 reinforces the idea that it is a moderate relationship. There is an extreme value shown for an engine size of 8.3 L and 500 hp.
Just because there is a correlation, we cannot say that a change in the independent variable causes the dependent variable to change! There may be some form of lurking variable that is the real cause.
Otherwise we would get some pretty hilarious causes for things!!