Head/tail breaks is a classification method for visualizing and analyzing data with a heavy-tailed distribution, which is common in geographic and complex-systems data. Developed by Bin Jiang, it is particularly suited to data sets in which a small number of high values (the "head") coexist with a large number of low values (the "tail").
How the head/tail breaks process works:
· Initial Division: Start by calculating the mean of the entire data set.
· Separation: Divide the data into two groups based on whether they are above or below this mean. The values above the mean form the "head," and the values below form the "tail."
· Recursive Application: Repeat the process for the head group. Calculate the mean of the head group, and again split it into new head and tail groups.
· Stopping Condition: Continue this recursive splitting until the head is no longer a clear minority of the values being split (a commonly used threshold is a head share of about 40%), or until some other predefined stopping criterion is met.
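The steps above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names `head_tail_breaks` and `classify`, the `head_limit` parameter, and its default of 0.4 are assumptions made for this example, and values exactly equal to the mean are placed in the tail.

```python
from bisect import bisect_left


def head_tail_breaks(values, head_limit=0.4):
    """Return the class breaks (successive means) for a data set.

    head_limit is an assumed stopping threshold: splitting stops once
    the head holds more than this share of the values being split.
    """
    data = list(values)
    breaks = []
    while len(data) > 1:
        m = sum(data) / len(data)            # mean of the current group
        head = [v for v in data if v > m]    # values above the mean
        # Stop if nothing lies above the mean, or if the head is no
        # longer a minority of the current group.
        if not head or len(head) / len(data) > head_limit:
            break
        breaks.append(m)
        data = head                          # recurse into the head only
    return breaks


def classify(value, breaks):
    """Return the class index of a value: 0 is the lowest (tail) class."""
    # Counts how many breaks the value strictly exceeds.
    return bisect_left(breaks, value)


if __name__ == "__main__":
    sample = [1] * 8 + [2] * 4 + [4] * 2 + [8, 16]
    breaks = head_tail_breaks(sample)
    print(breaks)                                  # [3.0, 8.0]
    print([classify(v, breaks) for v in sample])
```

For the sample above, the first mean is 3 and the second is 8, giving three classes that contain 12, 3, and 1 values respectively. Data without a heavy tail, such as `[1, 2, 3, 4, 5, 6]`, produce no breaks at all under this sketch, because the first head already holds half of the values.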
The head/tail breaks method is particularly useful for visualizing and understanding data that follow a power-law distribution or are otherwise strongly right-skewed. Traditional classification methods often fail to capture the structure of such data: equal intervals leave most classes nearly empty, while quantiles group very different values into the same class. Because both the number of classes and the class boundaries are derived from the data themselves, head/tail breaks highlights the few largest values and the hierarchy beneath them, making highly uneven data sets easier to interpret.
Head/tail breaks. Source: Jiang, Bin & Huang, Ju Tzu (2021). A new approach to detecting and designing living structure of urban environments.