Redundancy Analysis

The main idea...

Redundancy analysis (RDA) is a method to extract and summarise the variation in a set of response variables that can be explained by a set of explanatory variables. More precisely, RDA is a direct gradient analysis technique which summarises the linear relationships between components of the response variables that are "redundant" with (i.e. "explained" by) a set of explanatory variables. To do this, RDA extends multiple linear regression (MLR) by allowing the regression of multiple response variables on multiple explanatory variables (Figure 1). The matrix of fitted values of all response variables generated through MLR is then subjected to principal components analysis (PCA).

RDA can also be considered a constrained version of PCA, wherein the canonical axes, built from linear combinations of the response variables, must also be linear combinations of the explanatory variables (i.e. fitted by MLR). The RDA approach generates one ordination in the space defined by the matrix of response variables and another in the space defined by the matrix of explanatory variables. The residuals generated by the MLR step, which yield non-canonical axes, may also be ordinated. Detailed discussion is available in Legendre and Legendre (1998).

Figure 1: Redundancy analysis regresses multiple response variables (y1...yn) on multiple explanatory variables (x1...xn). This is accomplished by performing an MLR for each response variable in turn. Only the fitted values of the response variables will be used to describe the variation in the data set.
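The two steps described above (MLR of every response variable on the explanatory variables, followed by PCA of the fitted values) can be sketched in a few lines. This is a minimal illustration assuming NumPy and complete, quantitative data; the function name rda, the returned dictionary keys and the simulated data are hypothetical and do not correspond to any particular implementation.

```python
import numpy as np

def rda(Y, X):
    """Minimal RDA sketch: regress Y on X, then run a PCA on the fitted values."""
    Y = np.asarray(Y, dtype=float)
    X = np.asarray(X, dtype=float)
    n = Y.shape[0]

    # Centre both matrices so that no intercept term is needed.
    Yc = Y - Y.mean(axis=0)
    Xc = X - X.mean(axis=0)

    # Step 1: MLR of each response variable on all explanatory variables.
    # lstsq solves the regressions for all response columns at once.
    B, *_ = np.linalg.lstsq(Xc, Yc, rcond=None)
    Y_fit = Xc @ B
    Y_res = Yc - Y_fit

    # Step 2: PCA of the fitted values via singular value decomposition.
    _, s, Vt = np.linalg.svd(Y_fit, full_matrices=False)
    keep = s > s.max() * 1e-10          # drop numerically empty axes
    V = Vt[keep].T                      # canonical axes (unit-length columns)
    eigenvalues = s[keep] ** 2 / (n - 1)

    total_variance = (Yc ** 2).sum() / (n - 1)
    return {
        "eigenvalues": eigenvalues,
        "axes": V,
        # Ordination in the space of the response variables.
        "scores_response_space": Yc @ V,
        # Ordination in the space of the explanatory variables.
        "scores_explanatory_space": Y_fit @ V,
        "fitted": Y_fit,
        "residuals": Y_res,
        "constrained_proportion": eigenvalues.sum() / total_variance,
    }

# Hypothetical data: 20 objects, 2 explanatory and 4 response variables.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 2))
Y = X @ rng.normal(size=(2, 4)) + 0.5 * rng.normal(size=(20, 4))

result = rda(Y, X)
print(result["eigenvalues"])
print(result["constrained_proportion"])
```

In this sketch the number of canonical axes cannot exceed the number of explanatory variables or the number of response variables. A PCA of the returned residuals would give the non-canonical axes.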

Pre-analysis

Results and interpretation

Reading RDA biplots and triplots

RDA ordinations may be presented as a biplot or triplot (Figure 2). The interpretation of these plots depends on which scaling has been chosen. In general, consider type I scaling if the distances between objects are of particular value or if most explanatory variables are binary or nominal. Consider type II scaling if the correlative relationships between variables are of more interest. Further interpretation is discussed below. More detail is available in Legendre and Legendre (1998) and ter Braak (1994).

Figure 2: Schematic representation of a) an RDA biplot and b) an RDA triplot. a) An RDA biplot ordinates objects as points and either response or explanatory variables as vectors (red arrows). Levels of nominal variables are plotted as points (red). b) In a triplot, objects are ordinated as points (blue) while both response and explanatory variables (red and green arrows, respectively) are plotted as vectors. Levels of nominal variables are plotted as points (green). Note that default visualisations vary among implementations. Interpretation of the plots, which depends on scaling, is discussed in the text.
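The difference between the two scalings can be illustrated on any centred data matrix (in RDA, this would be the matrix of fitted values). The sketch below assumes NumPy and one common convention: in type I scaling the axes keep unit length, so distances among objects are preserved, while in type II scaling each axis is stretched by the square root of its eigenvalue, so the variable vectors reproduce the covariances among variables. The function name scaled_scores and the random data are hypothetical, and scaling constants differ among implementations.

```python
import numpy as np

def scaled_scores(M, scaling=1):
    """Return (object scores, variable scores) for a PCA of M under scaling 1 or 2."""
    M = np.asarray(M, dtype=float)
    Mc = M - M.mean(axis=0)                     # centre the columns
    _, s, Vt = np.linalg.svd(Mc, full_matrices=False)
    keep = s > s.max() * 1e-12                  # drop numerically empty axes
    s, V = s[keep], Vt[keep].T
    eigenvalues = s ** 2 / (Mc.shape[0] - 1)

    objects = Mc @ V
    if scaling == 1:
        # Type I (distance plot): unit-length axes, object distances preserved.
        return objects, V
    # Type II (correlation plot): variable vectors stretched by sqrt(eigenvalue),
    # object scores shrunk by the same factor.
    root = np.sqrt(eigenvalues)
    return objects / root, V * root

# Hypothetical data: 12 objects described by 3 variables.
rng = np.random.default_rng(1)
M = rng.normal(size=(12, 3))

objects_1, variables_1 = scaled_scores(M, scaling=1)
objects_2, variables_2 = scaled_scores(M, scaling=2)

# Under type II scaling, inner products of variable vectors equal covariances.
print(variables_2 @ variables_2.T)
print(np.cov(M, rowvar=False))
```

When all axes are kept, both scalings reconstruct the same centred data from the product of object and variable scores; they differ only in which geometric property (distances among objects or relationships among variables) is displayed faithfully.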

Type I Scaling - Distance plots (object focused)

Type II Scaling - Correlation plots (response variable focused)

Figure 3: Schematics highlighting a) the projection of ordinated objects onto a vector and b) the angles between vectors. The right-angled projection of an ordinated point onto a variable vector, as shown for point i in panel a, approximates the value of that variable realised for that object. Hence, visual inspection suggests that object i can be expected to have higher values of variable 1 than most other objects. Object ii, however, can be expected to have lower values of variable 1 than other objects. Note that the dashed line is not typically drawn in a biplot and is shown here for clarity. When using type II scaling, the cosines of the angles between vectors (panel b) approximate the correlations between the variables they represent. In this case, ∠a is approaching 90°, which suggests that variables "1" and "2" show very little correlation (i.e. they are almost orthogonal, just as independent axes are). ∠b is less than 90°, suggesting positive correlation between variables "2" and "3", while ∠c is approaching 180°, suggesting strong negative correlation between variables "2" and "4" (i.e. the directions of increase of variables "2" and "4" oppose one another). Variable 5 is non-quantitative and is represented by a centroid. A right-angled projection onto variable 4 suggests the two are positively linked.
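The two readings described in Figure 3 reduce to simple vector arithmetic, sketched below with the Python standard library only. The function names and all coordinates are hypothetical: they are invented to mimic the situations in the caption and are not read from the figure.

```python
import math

def angle_between(u, v):
    """Angle in degrees between two 2-D variable vectors."""
    dot = u[0] * v[0] + u[1] * v[1]
    cosine = dot / (math.hypot(*u) * math.hypot(*v))
    cosine = max(-1.0, min(1.0, cosine))   # guard against rounding error
    return math.degrees(math.acos(cosine))

def projection_length(point, v):
    """Signed length of the right-angled projection of a point onto vector v."""
    return (point[0] * v[0] + point[1] * v[1]) / math.hypot(*v)

# Made-up biplot coordinates for four variable vectors.
var1 = (0.1, 1.0)
var2 = (1.0, 0.0)
var3 = (0.8, 0.6)
var4 = (-1.0, 0.1)

for name, other in (("1", var1), ("3", var3), ("4", var4)):
    angle = angle_between(var2, other)
    # Under type II scaling the cosine approximates the correlation.
    print(f"variables 2 and {name}: angle {angle:.1f} degrees, "
          f"cosine {math.cos(math.radians(angle)):.2f}")

# Made-up object coordinates: a larger projection onto a variable vector
# suggests a higher value of that variable for the object.
object_i = (0.3, 1.2)
object_ii = (0.5, -0.9)
print(projection_length(object_i, var1))
print(projection_length(object_ii, var1))
```

An angle near 90 degrees gives a cosine near zero (little correlation), an acute angle gives a positive cosine, and an angle near 180 degrees gives a cosine near -1. A negative projection length means the object falls on the side of the origin opposite to the direction of increase of the variable.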

Warnings

Walkthroughs featuring RDA

Implementations

MASAME RDA app


References