R-mode measures for normalised abundance data

Consider using the linear measures below if your variables satisfy linear assumptions (e.g. variables have normally distributed residuals and the relationships between your variables are linear). Also, attempt to minimise the number of double zeros shared between variables, for example by removing variables or objects with many "0" values. Bear in mind that any such modification of your data will affect the resulting measures.
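As a rough illustration of the screening step described above, the sketch below counts zeros per variable and double zeros per pair of variables, then drops variables above a zero-fraction cut-off. The abundance values, the variable names and the 0.5 threshold are hypothetical and chosen only for the example; an appropriate cut-off depends on your data.

```python
def zero_fraction(values):
    """Fraction of entries in one variable that are exactly zero."""
    return sum(1 for v in values if v == 0) / len(values)


def double_zero_count(x, y):
    """Number of objects in which both variables are zero."""
    return sum(1 for a, b in zip(x, y) if a == 0 and b == 0)


# Hypothetical normalised abundances: each variable is measured on 6 objects.
variables = {
    "var_a": [0.0, 0.0, 0.0, 0.5, 0.0, 0.3],
    "var_b": [0.0, 0.0, 0.1, 0.0, 0.0, 0.2],
    "var_c": [0.2, 0.1, 0.3, 0.4, 0.0, 0.5],
}

# Assumed cut-off: drop variables in which more than half the objects are zero.
MAX_ZERO_FRACTION = 0.5

fractions = {name: zero_fraction(vals) for name, vals in variables.items()}
shared_zeros = double_zero_count(variables["var_a"], variables["var_b"])
kept = [name for name, frac in fractions.items() if frac <= MAX_ZERO_FRACTION]

print(fractions)
print("double zeros shared by var_a and var_b:", shared_zeros)
print("variables kept:", kept)
```

Any variable or object removed this way changes the data set that the measures below are computed on, so the filtering rule should be reported alongside the results.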

Pearson's r

This familiar measure of linear correlation between two variables is suitable only for detecting linear relationships. It is the covariance between two variables divided by the product of their standard deviations. If your variables have many zeros, this correlation coefficient will not be reliable, as double zeros will be treated as an "agreement" when, in fact, they simply reflect the absence of an observation. This will inflate the correlation coefficient.

Covariance

Covariance, the unstandardised form of linear correlation between variables, may also be used. Variables should be centred on their means (i.e. every variable should have a mean of "0") before calculating covariances as an R-mode measure.
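The relationship between the two linear measures, and the effect of double zeros on them, can be sketched in a few lines. The example below is a minimal illustration using only the Python standard library: the helper functions and the data are hypothetical, and the sample (n - 1) form of the covariance is an assumption made for the example.

```python
from math import sqrt


def centre(values):
    """Centre a variable on its mean so that the result has a mean of 0."""
    m = sum(values) / len(values)
    return [v - m for v in values]


def covariance(x, y):
    """Sample covariance (n - 1 denominator) computed from centred variables."""
    cx, cy = centre(x), centre(y)
    return sum(a * b for a, b in zip(cx, cy)) / (len(x) - 1)


def pearson_r(x, y):
    """Covariance divided by the product of the two standard deviations."""
    sd_x = sqrt(covariance(x, x))
    sd_y = sqrt(covariance(y, y))
    return covariance(x, y) / (sd_x * sd_y)


# Hypothetical variables observed on 4 objects; they are negatively related.
x = [5, 1, 3, 2]
y = [1, 4, 2, 5]

# The same variables after adding 6 objects in which neither was observed.
x_padded = x + [0] * 6
y_padded = y + [0] * 6

r_without = pearson_r(x, y)
r_with = pearson_r(x_padded, y_padded)

print("covariance:", covariance(x, y))
print("r without double zeros:", round(r_without, 3))
print("r with double zeros:   ", round(r_with, 3))
```

In this made-up case the shared zeros are read as agreement and pull the coefficient from clearly negative to positive, which is the inflation described above.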

Consider using the rank-order correlation coefficients below if linearisation fails or if you have many variables with many "0" values.

Spearman's rho

This is a non-parametric measure of correlation which uses ranks rather than the original variable values. Variables should have monotonic relationships: that is, their ranks should either go up or down across objects, but not necessarily in a linear fashion. Like Pearson's r, Spearman's rho is based on the principle of least squares, but is concerned with how strongly the rankings between two variables disagree. The larger the disagreement, the lower the rho value. This statistic is sensitive to large disagreements. That is, if one variable ranks an object as "1" and another variable ranks the same object as "100", the correlation reported by Spearman's rho will be strongly affected (relative to Kendall's tau, for example), even if these variables agree on all other ranks. This measure is suitable for raw or standardised abundance data and any monotonically related variables.

Kendall's τ

Like Spearman's rho, Kendall's tau uses ranked values to calculate correlation. This measure, however, is not based on the principle of least squares and instead expresses the degree of concordance between two rankings. The tau statistic is the quotient of 1) the difference between the numbers of concordant and discordant pairs (i.e. pairs of objects that the two variables rank in the same order and pairs that they rank in opposite orders) and 2) the total number of pairs compared. This statistic is not sensitive to the scale of the disagreement. As above, variables should have monotonic relationships: that is, their ranks should either go up or down across objects, but not necessarily in a linear fashion. This measure is suitable for raw or standardised abundance data and any monotonically related variables.
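The different sensitivity of the two rank-order coefficients can be seen in a small worked sketch. The code below assumes there are no tied values, so it uses the simple squared-rank-difference form of Spearman's rho and the plain concordant-minus-discordant form of Kendall's tau; the function names and the rankings are hypothetical, and a statistics package would normally be used for real data, particularly when ties are present.

```python
def ranks(values):
    """Rank values from 1 (smallest) upward; assumes no tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for position, index in enumerate(order, start=1):
        result[index] = position
    return result


def spearman_rho(x, y):
    """1 - 6 * sum(d^2) / (n * (n^2 - 1)), valid when there are no ties."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    d_squared = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d_squared / (n * (n * n - 1))


def kendall_tau(x, y):
    """(concordant - discordant) / total number of pairs, assuming no ties."""
    n = len(x)
    concordant = discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            sign = (x[i] - x[j]) * (y[i] - y[j])
            if sign > 0:
                concordant += 1
            elif sign < 0:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)


# Hypothetical rankings of 10 objects by a reference variable.
reference = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
# Small disagreement: the first two objects swap ranks.
small = [2, 1, 3, 4, 5, 6, 7, 8, 9, 10]
# Large disagreement: the first object is ranked last instead of first.
large = [10, 1, 2, 3, 4, 5, 6, 7, 8, 9]

rho_small, tau_small = spearman_rho(reference, small), kendall_tau(reference, small)
rho_large, tau_large = spearman_rho(reference, large), kendall_tau(reference, large)

print("small disagreement: rho =", round(rho_small, 3), "tau =", round(tau_small, 3))
print("large disagreement: rho =", round(rho_large, 3), "tau =", round(tau_large, 3))
```

With these rankings, moving from the small to the large disagreement lowers rho by more than it lowers tau, because rho squares the size of each rank difference while tau only counts how many pairs change order.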