Cluster analysis

The main idea...

Cluster analysis describes a class of techniques that place objects in groups, called clusters, such that dissimilarities between objects within a cluster are smaller than those between objects in different clusters. The definition of a cluster varies, and different cluster analysis techniques may approach the problem very differently. Two widely used approaches are outlined below.

Hierarchical cluster analysis

Hierarchical cluster analysis may be performed using an "object x object" matrix of (dis)similarities or distances. It builds the grouping hierarchically from the distances supplied, first grouping the objects with the lowest dissimilarities and then proceeding to progressively less similar objects and groups. The result is a good, although perhaps not the best, grouping of the objects. A diverse range of algorithms and clustering criteria is available to detect different groupings in data sets.
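The merging procedure described above can be illustrated with a minimal sketch of agglomerative clustering. This is only one of many possible variants and is not a reference implementation: the function name agglomerate, its parameters, and the small dissimilarity matrix are hypothetical and chosen for illustration. The linkage argument is an assumed clustering criterion, where min corresponds to single linkage and max to complete linkage.

```python
def agglomerate(dist, n_clusters, linkage=min):
    """Merge the closest pair of clusters until n_clusters remain.

    dist is a symmetric "object x object" dissimilarity matrix (list of lists).
    linkage=min gives single linkage; linkage=max gives complete linkage.
    """
    clusters = [[i] for i in range(len(dist))]  # start with every object alone
    merges = []  # record of (cluster_a, cluster_b, dissimilarity) per step
    while len(clusters) > max(n_clusters, 1):
        best = None
        # Find the pair of clusters with the lowest between-cluster dissimilarity.
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = linkage(dist[i][j] for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((clusters[a], clusters[b], d))
        clusters[a] = clusters[a] + clusters[b]  # merge b into a
        del clusters[b]
    return clusters, merges


# Hypothetical dissimilarities between four objects (indices 0 to 3).
dist = [
    [0, 1, 6, 7],
    [1, 0, 5, 8],
    [6, 5, 0, 2],
    [7, 8, 2, 0],
]
clusters, merges = agglomerate(dist, n_clusters=2)
print(clusters)  # [[0, 1], [2, 3]]
print(merges)    # [([0], [1], 1), ([2], [3], 2)]
```

The list of merges records the hierarchy: objects 0 and 1 are joined first because they have the lowest dissimilarity, followed by objects 2 and 3. Because each merge is chosen greedily and never revisited, the final grouping is good but not guaranteed to be the best possible.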

Non-hierarchical cluster analysis

A popular method of non-hierarchical cluster analysis, K-means clustering, may use a (dis)similarity matrix as input, but does not require one. This method attempts to find a grouping of objects that optimises some evaluating criterion (which may be a (dis)similarity measure) by iteratively reassigning objects to groups in such a way as to improve the criterion value.
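The iterative reassignment described above can be sketched as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: it works on raw coordinates rather than a (dis)similarity matrix, uses the within-cluster sum of squared Euclidean distances as the evaluating criterion, and naively takes the first k objects as starting centroids. The names kmeans, sq_dist, and within_ss, and the example points, are hypothetical.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def within_ss(points, labels, centroids):
    """Evaluating criterion: within-cluster sum of squared distances."""
    return sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))


def kmeans(points, k, max_iter=100):
    """Assign each point to one of k groups by iterative reassignment."""
    centroids = [tuple(p) for p in points[:k]]  # assumption: naive starting centroids
    labels = None
    for _ in range(max_iter):
        # Assignment step: move each object to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:  # no object changed group, so stop
            break
        labels = new_labels
        # Update step: recompute each centroid as the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels, centroids


# Hypothetical two-dimensional objects forming two loose groups.
points = [(1.0, 1.0), (1.5, 2.0), (8.0, 8.0), (9.0, 9.5), (1.0, 0.5), (8.5, 9.0)]
labels, centroids = kmeans(points, k=2)
print(labels)  # [0, 0, 1, 1, 0, 1]
print(within_ss(points, labels, centroids))
```

Neither step can increase the criterion, which is why the procedure settles on a stable grouping. The result depends on the starting centroids, so in practice the procedure is commonly run from several different starts and the grouping with the best criterion value is kept.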