Numerical analysis

Conducting a quantitative classification of vegetation involves several steps. Most of them are explained in detail in specific sub-pages of this VCM section, but we provide a brief summary here. We restrict ourselves to classification of vegetation based on plot data, and we group the steps into three main stages: sampling, definition of types, and classification of new observations.

Sampling vegetation

The most important step of vegetation classification is data gathering. Any classification refers to the vegetation of a given geographical area (i.e. its extent). The location of plots within this area is crucial to ensure that no environments (e.g. climatic conditions) or parts of the target geography are undersampled. Once the locations are decided, the next important decisions are which vegetation attributes to measure and how to measure them. These decisions depend on the vegetation attributes on which the classification should be based (see below).
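One common way to avoid undersampling some environments is to stratify the area and draw plots at random within each stratum. The following is only a minimal sketch of that idea, not a recommended sampling design: the function name `stratified_sample`, the grid of candidate locations and the three elevation strata are all hypothetical and chosen purely for illustration.

```python
import random
from collections import defaultdict


def stratified_sample(candidates, plots_per_stratum, seed=0):
    """Draw up to `plots_per_stratum` random locations from each stratum.

    `candidates` is a list of (location, stratum) pairs. Both the data
    layout and the equal allocation per stratum are assumptions made
    for this sketch.
    """
    rng = random.Random(seed)  # fixed seed so the draw is repeatable
    by_stratum = defaultdict(list)
    for location, stratum in candidates:
        by_stratum[stratum].append(location)

    selected = {}
    for stratum in sorted(by_stratum):
        locations = by_stratum[stratum]
        k = min(plots_per_stratum, len(locations))
        selected[stratum] = rng.sample(locations, k)
    return selected


def stratum_of(y):
    """Hypothetical rule assigning a grid row to an elevation belt."""
    if y < 3:
        return "lowland"
    if y < 5:
        return "montane"
    return "alpine"


# Hypothetical 10 x 6 grid of candidate plot locations.
candidates = [((x, y), stratum_of(y)) for x in range(10) for y in range(6)]
selected = stratified_sample(candidates, plots_per_stratum=5)
```

With equal allocation, the small alpine belt receives as many plots as the much larger lowland belt, which is the point of stratifying: rare environments are not left to chance.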

Definition of vegetation types from vegetation observations

Clustering (or unsupervised classification) generates a set of classes from observations. Intuitively, similar plot records should belong to the same class, whereas dissimilar plot records should belong to different classes. Decisions that frequently have to be made for clustering include:

    • What are the vegetation attributes on which classification should be based (e.g. species composition vs. physiognomy)?
    • How should resemblance between plot records be assessed? This decision, whether made explicitly or implicitly, includes applying data transformations and choosing a similarity or dissimilarity index.
    • Do we need to classify all plot records unequivocally into vegetation units, or may some of them be considered transitional, or even left unclassified?
    • Do we want to specify properties of vegetation units (e.g. amount of variation, shape), or should this be left open?
    • Do we need prototypes for our vegetation units? Should these prototypes be real plots, or can they be abstract entities?
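To make the resemblance decision concrete, the sketch below applies a square-root transformation to cover values and then computes the Bray-Curtis dissimilarity between plot records. Both choices are assumptions made for illustration only; other transformations and indices are equally legitimate. The plot names and cover values are hypothetical.

```python
import math


def sqrt_transform(plot):
    """Square-root transformation, which down-weights dominant species."""
    return [math.sqrt(v) for v in plot]


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two abundance vectors (0 to 1)."""
    num = sum(abs(x - y) for x, y in zip(a, b))
    den = sum(x + y for x, y in zip(a, b))
    return num / den if den > 0 else 0.0


# Hypothetical cover values (%) of four species in three plots.
plots = {
    "p1": [40.0, 10.0, 0.0, 0.0],
    "p2": [35.0, 15.0, 0.0, 0.0],
    "p3": [0.0, 0.0, 25.0, 25.0],
}

transformed = {name: sqrt_transform(values) for name, values in plots.items()}
names = sorted(transformed)

# Full dissimilarity matrix, stored as a dict keyed by plot-name pairs.
dissim = {
    (i, j): bray_curtis(transformed[i], transformed[j])
    for i in names
    for j in names
}
```

Plots p1 and p2 share the same species in similar amounts and end up close together, whereas p3 shares no species with either and sits at the maximum dissimilarity. A clustering method would then operate on a matrix like this one.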

Running a clustering method is not the end of the process. We need to determine whether the resulting units are valid for the practical use we want to make of them. That is, we need to evaluate and validate our units. Are the units sufficiently distinct? Can we derive simple diagnostic methods? Do they represent the known variation in vegetation across our landscape? Is the level of abstraction of vegetation appropriate?
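One possible way to address the question of whether units are sufficiently distinct is the silhouette width, which compares each plot's average dissimilarity to its own unit with its average dissimilarity to the nearest other unit. This is only one of many evaluation criteria, and it is used here as an assumed example. The dissimilarity matrix and the unit labels "A" and "B" are hypothetical.

```python
def silhouette_widths(d, labels):
    """Silhouette width of each plot, given a square dissimilarity
    matrix `d` (list of lists) and one unit label per plot.

    Values near 1 indicate a well-placed plot, values near 0 a
    transitional plot, and negative values a likely misplaced plot.
    """
    clusters = sorted(set(labels))
    if len(clusters) < 2:
        raise ValueError("silhouette widths need at least two units")

    n = len(labels)
    widths = []
    for i in range(n):
        own = [j for j in range(n) if labels[j] == labels[i] and j != i]
        if not own:
            widths.append(0.0)  # convention for single-plot units
            continue
        # Mean dissimilarity to the other members of the plot's own unit.
        a = sum(d[i][j] for j in own) / len(own)
        # Mean dissimilarity to the closest other unit.
        b = min(
            sum(d[i][j] for j in range(n) if labels[j] == c) / labels.count(c)
            for c in clusters
            if c != labels[i]
        )
        widths.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return widths


# Hypothetical dissimilarities among four plots.
d = [
    [0.00, 0.10, 0.80, 0.90],
    [0.10, 0.00, 0.85, 0.80],
    [0.80, 0.85, 0.00, 0.20],
    [0.90, 0.80, 0.20, 0.00],
]

good = silhouette_widths(d, ["A", "A", "B", "B"])
bad = silhouette_widths(d, ["A", "B", "A", "B"])
mean_good = sum(good) / len(good)
```

In this toy example the first labelling yields high widths for every plot, while the second, which splits the two natural groups, produces negative widths.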

Classification of new vegetation observations to pre-existing vegetation types

Once vegetation types have been defined, they are ready to be used to describe and summarize vegetation patterns. When surveying or monitoring the vegetation of a given territory, new vegetation observations (i.e. relevés or plot records) will be made. Classifying these observations into a predefined vegetation classification scheme may be useful for many applied purposes, such as vegetation mapping or assessing the conservation status of plant communities. We refer to this activity as assignment.
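A simple assignment rule can be sketched as follows: compare the new plot record with a prototype of each vegetation type and assign it to the closest one, leaving it unclassified if even the closest type is too dissimilar. The use of Bray-Curtis dissimilarity, the function name `assign`, the type names, the prototype values and the 0.6 threshold are all assumptions made for illustration, not part of any particular classification scheme.

```python
def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two abundance vectors (0 to 1)."""
    num = sum(abs(x - y) for x, y in zip(a, b))
    den = sum(x + y for x, y in zip(a, b))
    return num / den if den > 0 else 0.0


def assign(plot, prototypes, max_dissimilarity=0.6):
    """Return (type_name, dissimilarity) for the closest prototype.

    The type name is None when the closest prototype is further away
    than `max_dissimilarity`, i.e. the plot is left unclassified.
    """
    best_type, best_d = None, float("inf")
    for name, prototype in prototypes.items():
        dist = bray_curtis(plot, prototype)
        if dist < best_d:
            best_type, best_d = name, dist
    if best_d > max_dissimilarity:
        return None, best_d
    return best_type, best_d


# Hypothetical prototypes: mean cover (%) of four species per type.
prototypes = {
    "grassland": [40.0, 10.0, 0.0, 0.0],
    "shrubland": [0.0, 5.0, 30.0, 20.0],
}

typical = assign([30.0, 20.0, 0.0, 0.0], prototypes)
atypical = assign([0.0, 0.0, 0.0, 60.0], prototypes)
```

Here the first plot is assigned to the hypothetical grassland type, whereas the second is too far from both prototypes and is left unclassified. The threshold is what makes it possible to treat some observations as unclassified rather than forcing every plot into a type.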