Natural Clustering

This is an unorthodox and effective non-iterative procedure for spherical clusters. It uses natural Bézier functions to determine initial cluster locations from the content of the data. The natural Bernstein-Bézier functions are very robust at representing data through continuous functions in functional data analysis. This paper demonstrates that they are equally robust at resolving data clusters in classification problems. The original data is scaled and segmented. A natural Bézier function is fitted to each segment, and the initial clusters are centered at the function extrema that are distinctly located. A self-selection process based on least distance assigns the data to these initial cluster centers. A minimum membership count is imposed, and nearby clusters are combined based on visual clues to reduce the number of initial cluster centers. Centroid recalculation and data reassignment can then be used for centroid convergence. The approach itself requires no iteration. This method is new and differs from the other data clustering methods available in the literature. Its advantage over the standard k-means clustering method is that it does not require the number of clusters or the cluster membership counts in advance. It also does not require random initialization or distance optimization.
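The steps that follow the Bézier fit can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the initial centers are taken as given (the natural Bézier fit and its extrema are not reproduced here), and the names `refine`, `min_members`, and `merge_radius` are hypothetical. In particular, a numeric `merge_radius` stands in for the visual clues the method uses to combine nearby clusters.

```python
import math
from collections import Counter


def assign(points, centers):
    """Least-distance assignment: index of the nearest center for each point."""
    return [min(range(len(centers)), key=lambda k: math.dist(p, centers[k]))
            for p in points]


def mean(vectors):
    """Coordinate-wise mean of a non-empty list of equal-length tuples."""
    n = len(vectors)
    return tuple(sum(v[d] for v in vectors) / n for d in range(len(vectors[0])))


def merge_close(centers, merge_radius):
    """Greedy merge: a center joins the first group whose mean lies within
    merge_radius, otherwise it starts a new group (assumed stand-in rule)."""
    groups = []
    for c in centers:
        for g in groups:
            if math.dist(c, mean(g)) <= merge_radius:
                g.append(c)
                break
        else:
            groups.append([c])
    return [mean(g) for g in groups]


def refine(points, initial_centers, min_members, merge_radius):
    """Single pass: assign, drop small clusters, merge, reassign, recompute."""
    labels = assign(points, initial_centers)
    counts = Counter(labels)
    # Impose the minimum membership count on the initial clusters.
    kept = [c for k, c in enumerate(initial_centers) if counts[k] >= min_members]
    # Combine nearby centers, then reassign the data once.
    kept = merge_close(kept, merge_radius)
    labels = assign(points, kept)
    # Recompute each centroid from its members; keep the old center if empty.
    final = []
    for k, c in enumerate(kept):
        members = [p for p, lab in zip(points, labels) if lab == k]
        final.append(mean(members) if members else c)
    return final, labels
```

Each stage runs once, which mirrors the non-iterative character described above; the final centroid recalculation could be repeated if convergence of the centroids is wanted.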

Data from: University of Eastern Finland, “Clustering Benchmark Datasets”, https://cs.joensuu.fi/sipu/datasets/

Data: s1.txt

Natural Clustering: 0.528 seconds

K-means++ (Python Scikit-learn): 1.524 seconds