Clustering

Cluster ensemble and multi-objective optimization

Selecting the best clustering algorithm for a given dataset is one of the main difficulties in cluster analysis. There are many clustering algorithms, each looking for clusters according to a different cluster definition (or clustering criterion). Moreover, each algorithm searches for a homogeneous structure (all clusters conforming to the same cluster definition), whereas data can present a heterogeneous structure (each cluster conforming to a different cluster definition).

A classical way to address these problems is to use a clustering validation technique. However, most of these techniques are biased toward a given clustering criterion (e.g., cluster compactness). Alternatively, both algorithm selection and data with heterogeneously structured clusters can be addressed with cluster ensemble and multi-objective clustering approaches. In this context, we have introduced a multi-objective cluster ensemble algorithm (MOCLE), which simultaneously employs concepts from both cluster ensemble and multi-objective clustering algorithms. The idea is to mitigate not only the intrinsic problems of “traditional” clustering algorithms, but also the limitations that cluster ensemble and multi-objective clustering methods show when they are used separately.
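To make the multi-objective idea concrete, the following is a minimal sketch of Pareto-based selection among candidate partitions. It is not the MOCLE implementation: the two objectives (a compactness measure and a neighbourhood-based connectivity measure), the toy data, the function names, and the neighbourhood size k are all assumptions chosen for illustration, and no ensemble-based crossover is shown.

```python
from math import dist

def overall_deviation(points, labels):
    """Compactness (assumed objective): summed distance of each point
    to the centroid of its cluster. Lower is better."""
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)
    total = 0.0
    for members in clusters.values():
        centroid = tuple(sum(c) / len(members) for c in zip(*members))
        total += sum(dist(p, centroid) for p in members)
    return total

def connectivity(points, labels, k=2):
    """Connectedness (assumed objective): penalty 1/rank whenever one of a
    point's k nearest neighbours lies in another cluster. Lower is better."""
    penalty = 0.0
    for i, p in enumerate(points):
        others = [j for j in range(len(points)) if j != i]
        nearest = sorted(others, key=lambda j: dist(p, points[j]))[:k]
        for rank, j in enumerate(nearest, start=1):
            if labels[j] != labels[i]:
                penalty += 1.0 / rank
    return penalty

def pareto_front(candidates, objectives):
    """Keep the candidates that no other candidate dominates
    (every objective is minimized)."""
    scored = [(c, tuple(f(c) for f in objectives)) for c in candidates]
    front = []
    for c, s in scored:
        dominated = any(
            all(a <= b for a, b in zip(t, s)) and t != s for _, t in scored
        )
        if not dominated:
            front.append((c, s))
    return front

# Toy data: two well-separated groups of three points each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]

# Hypothetical candidate partitions, e.g. produced by different algorithms.
candidates = [
    [0, 0, 0, 1, 1, 1],  # the two natural groups
    [0, 0, 0, 0, 0, 0],  # everything in one cluster
    [0, 1, 2, 3, 4, 5],  # every point in its own cluster
    [0, 1, 0, 1, 0, 1],  # an arbitrary split
]

objectives = [
    lambda labels: overall_deviation(points, labels),
    lambda labels: connectivity(points, labels),
]

front = pareto_front(candidates, objectives)
for labels, scores in front:
    print(labels, scores)
```

In this toy setting the single-cluster and arbitrary partitions are dominated by the two-group partition, while the all-singletons partition survives because it is optimal for compactness alone. This shows why a multi-objective method returns a set of trade-off solutions rather than a single partition.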

MOCLE was initially developed to address important problems in bioinformatics. We are currently working on its extension, considering in particular the use of techniques other than cluster ensemble as the crossover operator and other types of metaheuristics for the optimization problem. The goal is an algorithm that (1) adapts to large datasets (e.g., the extraction of knowledge from texts) and (2) always produces a concise and diverse set of solutions.

Main publications:

  • Jane Piantoni, Katti Faceli, Tiemi C. Sakata, Julio C. Pereira, and Marcílio C. P. de Souto. Impact of base partitions on multi-objective and traditional ensemble clustering algorithms. In ICONIP, volume 9489 of LNCS, pages 696–704. Springer, 2015. doi: 10.1007/978-3-319-26532-2_77
  • Katti Faceli, Marcílio C. P. de Souto, and André de Carvalho. Multi-objective clustering ensemble: A framework for cluster analysis. International Journal of Soft Computing and Bioinformatics, 1:9–17, 2010.
  • Katti Faceli, Marcílio C. P. de Souto, Daniel de Araujo, and André de Carvalho. Multi-objective clustering ensemble for gene expression data analysis. Neurocomputing, 72(13-15):2763–2774, 2009. doi: 10.1016/j.neucom.2008.09.025
  • Katti Faceli, André de Carvalho, and Marcílio C. P. de Souto. Multi-objective clustering ensemble. International Journal of Hybrid Intelligent Systems, 4(3):145–156, 2007.