The main idea...
Linear discriminant analysis (LDA) is a method to evaluate how well a group of variables supports an a priori grouping of objects. It is based on work by Fisher (1936) and is closely related to other linear methods such as MANOVA, multiple linear regression, principal components analysis (PCA), and factor analysis (FA). In LDA, a grouping variable is treated as the response variable and is expected to be categorical. Groupings should reflect ecologically relevant knowledge, such as sampling environment or method, or the results of an exploratory method such as cluster analysis or non-metric multidimensional scaling (NMDS). In the latter case, it is vital that the exploratory method was performed on an independent data set to avoid data dredging.
To evaluate groupings, a typical LDA first performs an ANOVA or MANOVA on the explanatory variable(s) (i.e. any variable other than the grouping variable). If a significant difference among the groups is found, LDA attempts to find linear combinations of the explanatory variables that best discriminate between the defined groups and constructs discriminant functions from these combinations. The resulting functions can be used to classify new objects described by the same explanatory variables used in the LDA; however, LDA itself is not a classification method. Further, the contribution of each explanatory variable to the discriminant functions may be examined, allowing, for example, variables with high discriminatory power to be identified.
Figure 1: Schematic illustrating the discovery of a linear function that maximally discriminates between two groups described by two variables. Examining the original variables (a), there is some overlap between the group distributions. A linear combination of these variables (b) separates the groups more clearly. A decision boundary can then be drawn orthogonal to this linear combination (b, red dashed line). Group centroids are indicated by points and dispersion by coloured circles.
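To make this concrete, the sketch below fits an LDA in R using the lda() function from the MASS package and the built-in iris data set; both are illustrative assumptions of this example rather than requirements, with species acting as the categorical grouping variable and four morphometric measurements as the explanatory variables.

# A minimal sketch, assuming the MASS package is available; lda() is used
# here as one common LDA implementation and iris as example data.
library(MASS)

# Species is the grouping (response) variable; the four measurements are
# the explanatory variables.
fit <- lda(Species ~ ., data = iris)

# Printing the fit shows the prior probabilities, group means, coefficients
# of the linear discriminants (the linear combinations sought by LDA) and
# the proportion of between-group variance captured by each function.
fit
fit$scaling   # just the coefficients of the linear discriminants

Because iris contains three groups, two discriminant functions are returned; this is the multiple discriminant analysis case noted below.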
Results and evaluation
Implementations of LDA will typically report the coefficients of the discriminant functions, a measure of the discriminatory power of each function (e.g. its eigenvalue), group centroids, and a classification (confusion) table with the associated error rate.
Evaluating the results of an LDA is best done empirically, through classification approaches. This can be done by using the discriminant functions to classify a new set of objects, described by the same variables, into the same groups as the original, or 'training', data. Naturally, these objects must have known group membership. The misclassification (or error) rates then indicate how trustworthy the discriminant functions are when faced with new data.
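Continuing the hypothetical MASS::lda()/iris example above, one simple way to obtain such an error rate is to hold out part of the data, classify the held-out objects with predict(), and tabulate predicted against known groups; the 70/30 split used below is an arbitrary illustrative choice.

# A minimal sketch of empirical evaluation via a training/test split.
library(MASS)
set.seed(1)                                     # for a reproducible split
train_rows <- sample(nrow(iris), 0.7 * nrow(iris))
train <- iris[train_rows, ]
test  <- iris[-train_rows, ]

fit  <- lda(Species ~ ., data = train)          # fit on training data only
pred <- predict(fit, newdata = test)$class      # classify held-out objects

confusion <- table(observed = test$Species, predicted = pred)
confusion
1 - sum(diag(confusion)) / sum(confusion)       # misclassification rate

When no independent objects are available, leave-one-out cross-validation (e.g. via the CV = TRUE argument of MASS::lda()) provides a comparable error estimate.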
When more than two groups are described, multiple discriminant analysis (MDA) should be used. Note, however, that many LDA routines will automatically perform MDA when three or more groups are detected.
Key assumptions
The groups can be discriminated between by linear combinations of the explanatory variables.
Explanatory variables are continuous. Categorical explanatory variables should be evaluated by, e.g., discriminant correspondence analysis.
The explanatory variables should be (close to) multivariate normally distributed within each defined group.
The groups should have (near) equal covariance matrices.
There should be at least two groups.
There should be at least two objects per group.
Variables should be homoscedastic. If the mean of a variable is correlated with its variance, significance tests may be invalid.
There should be no linear dependencies among the explanatory variables; a basic check of variable correlations and group covariance matrices is sketched after this list.
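The covariance and correlation structure mentioned in the assumptions above can be inspected informally with base R; the sketch below reuses the hypothetical iris example, and the interpretation thresholds are left to the analyst. Formal tests of covariance equality (e.g. Box's M test) are available in contributed packages.

# A minimal sketch of informal assumption checks using base R only.
vars  <- iris[, 1:4]     # explanatory variables (all continuous)
group <- iris$Species    # grouping variable

# Covariance matrix within each group: roughly similar matrices are
# consistent with the (near) equal covariance assumption.
by(vars, group, cov)

# Pairwise correlations: values close to +/-1 flag near-linear dependencies
# and largely redundant variables.
cor(vars)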
Warnings
LDA is sensitive to outliers. These should be identified and treated accordingly.
LDA is only suitable for evaluating how well the variables linearly discriminate between groups; non-linear group structure will not be detected.
Highly correlated variables will contribute very similarly to an LDA solution and may be redundant. Thus, variables that are uncorrelated are preferable.
While unequal group sizes can be tolerated, very large differences in group sizes can distort results, particularly if there are very few (< 20) objects per group.
If ANOVA/MANOVA tests on a given set of explanatory variables are non-significant, LDA is unlikely to be useful.
When interpreting the coefficients of a discriminant function, carefully distinguish between standardised and unstandardised coefficients; one way of deriving standardised coefficients is sketched after this list.
Heteroscedasticity is likely to lead to invalid significance tests.
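As an illustration of the distinction between coefficient types, the sketch below derives standardised coefficients from the unstandardised ones returned by the hypothetical MASS::lda() fit used earlier, scaling each variable's coefficients by its pooled within-group standard deviation; this is one common convention, assumed here for illustration.

# A minimal sketch; MASS::lda() and the pooled within-group standard
# deviation scaling are assumptions of this example.
library(MASS)
fit   <- lda(Species ~ ., data = iris)
vars  <- iris[, 1:4]
group <- iris$Species

# Pooled within-group variance of each explanatory variable.
n <- nrow(vars)
k <- nlevels(group)
within_var <- sapply(vars, function(v)
  sum(tapply(v, group, function(x) (length(x) - 1) * var(x))) / (n - k))

# Unstandardised coefficients apply to the raw variables; multiplying each
# variable's row by its pooled within-group standard deviation yields
# standardised coefficients that are comparable across variables.
unstandardised <- fit$scaling
standardised   <- unstandardised * sqrt(within_var)
standardised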
Implementations
R
The DiscriMiner package hosts a range of functions for discriminant analyses, including LDA.
The generic predict() function (from the stats package) can be used to classify unknown objects into the classes of an LDA R object.
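As an illustration, the sketch below applies the generic predict() to an lda object from the MASS package, used here as an example of an LDA R object; the new observations are invented for demonstration, and other implementations may supply their own classification routines.

# A minimal sketch of classifying unknown objects with predict().
library(MASS)
fit <- lda(Species ~ ., data = iris)

# Hypothetical new objects described by the same explanatory variables.
new_obs <- data.frame(Sepal.Length = c(5.0, 6.7),
                      Sepal.Width  = c(3.4, 3.1),
                      Petal.Length = c(1.5, 5.6),
                      Petal.Width  = c(0.2, 2.4))

predict(fit, newdata = new_obs)$class       # predicted group membership
predict(fit, newdata = new_obs)$posterior   # posterior group probabilities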
References
Fisher RA (1936) The use of multiple measurements in taxonomic problems. Ann Eugen. 7(2):179-188.
Correa-Metrio A, Cabrera KR, Bush MB (2010) Quantifying ecological change through discriminant analysis: a paleoecological example from the Peruvian Amazon. J Veg Sci. 21(4):695-704.