Ensemble clustering methods for the analysis of patterns in bio-molecular data

Building on results obtained with stability-based methods for discovering patterns and structures in complex biomolecular data, we developed unsupervised ensemble methods based on random projections to analyze high-dimensional data (Bertoni and Valentini, 2006), such as gene expression data (Bertoni and Valentini, 2007).
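The general idea of a random-projection cluster ensemble can be illustrated with a minimal sketch. It is not the implementation from the cited papers. The Gaussian projection matrix, the plain k-means base clusterer, the co-association (pairwise similarity) matrix used to combine the clusterings, and all function names and parameter values are assumptions chosen for illustration.

```python
import numpy as np

def random_projection(X, d, rng):
    """Project the rows of X (n x D) onto d dimensions with a Gaussian random matrix."""
    D = X.shape[1]
    # Entries drawn from N(0, 1/d) so squared norms are preserved in expectation.
    R = rng.normal(0.0, 1.0 / np.sqrt(d), size=(D, d))
    return X @ R

def kmeans(X, k, rng, n_iter=50):
    """Plain Lloyd's k-means (illustrative base clusterer); returns one label per row."""
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    labels = np.zeros(len(X), dtype=int)
    for _ in range(n_iter):
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:  # keep the old center if a cluster becomes empty
                centers[j] = members.mean(axis=0)
    return labels

def coassociation(X, k, d, n_projections, seed=0):
    """Fraction of projected clusterings in which each pair of samples shares a cluster."""
    rng = np.random.default_rng(seed)
    n = len(X)
    M = np.zeros((n, n))
    for _ in range(n_projections):
        labels = kmeans(random_projection(X, d, rng), k, rng)
        M += labels[:, None] == labels[None, :]
    return M / n_projections

# Toy data (hypothetical): 20 samples in 1000 dimensions, two shifted groups.
data_rng = np.random.default_rng(42)
X = np.vstack([data_rng.normal(0.0, 1.0, (10, 1000)),
               data_rng.normal(3.0, 1.0, (10, 1000))])
M = coassociation(X, k=2, d=50, n_projections=20)
```

A final partition could then be obtained by clustering the samples with `M` as a similarity matrix, for example with a hierarchical algorithm.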

As a next step, we developed a fuzzy variant of the ensemble clustering approach to model the uncertainty underlying biomolecular data. We introduced a fuzzy approach both for the base clusterings of the ensemble and for combining the clusterings obtained from multiple instances of the projected data (Avogadri and Valentini, 2007). From this we developed a more general algorithmic scheme, from which different fuzzy ensemble clustering algorithms can be derived (Avogadri and Valentini, 2008). Some of these fuzzy ensemble algorithms have been applied to the analysis of gene expression data to discover subclasses of pathologies at the biomolecular level (Avogadri and Valentini, 2009).
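One way to picture a fuzzy ensemble of this kind is sketched below. It is a sketch under stated assumptions, not the algorithmic scheme of the cited papers. Fuzzy c-means as the base clusterer, the product of memberships as the pairwise agreement measure, the averaging over projections, and every name and parameter value are illustrative choices.

```python
import numpy as np

def fuzzy_cmeans(X, k, rng, m=2.0, n_iter=50, eps=1e-9):
    """Fuzzy c-means; returns a membership matrix U (n x k) whose rows sum to 1."""
    U = rng.random((len(X), k))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        W = U ** m
        # Cluster centers are membership-weighted means of the samples.
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2) + eps
        # Membership is proportional to squared distance raised to -1/(m-1).
        inv = d2 ** (-1.0 / (m - 1.0))
        U = inv / inv.sum(axis=1, keepdims=True)
    return U

def fuzzy_coassociation(X, k, d, n_projections, seed=0):
    """Average over random projections of the pairwise agreement U @ U.T."""
    rng = np.random.default_rng(seed)
    n, D = X.shape
    S = np.zeros((n, n))
    for _ in range(n_projections):
        # Gaussian random projection to d dimensions (assumed projection type).
        R = rng.normal(0.0, 1.0 / np.sqrt(d), size=(D, d))
        U = fuzzy_cmeans(X @ R, k, rng)
        # Entry (i, j) sums, over clusters, the product of the two memberships.
        S += U @ U.T
    return S / n_projections

# Toy data (hypothetical): 20 samples in 1000 dimensions, two shifted groups.
data_rng = np.random.default_rng(7)
X = np.vstack([data_rng.normal(0.0, 1.0, (10, 1000)),
               data_rng.normal(3.0, 1.0, (10, 1000))])
S = fuzzy_coassociation(X, k=2, d=50, n_projections=20)
```

Unlike a crisp co-association matrix, the diagonal of `S` is not forced to 1: a sample whose membership is spread across clusters has low agreement even with itself, which is one way the uncertainty shows up in the combined result.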


R. Avogadri, G. Valentini, Fuzzy ensemble clustering based on random projections for DNA microarray data analysis, Artificial Intelligence in Medicine 45(2), pp. 173-183, 2009.

R. Avogadri, G. Valentini, Ensemble Clustering with a Fuzzy Approach, in: "Supervised and Unsupervised Ensemble Methods and their Applications", Studies in Computational Intelligence, vol. 126, Springer, 2008.

R. Avogadri, G. Valentini, Fuzzy ensemble clustering for DNA microarray data analysis, CIBB 2007, The Fourth International Conference on Bioinformatics and Biostatistics, Lecture Notes in Computer Science, vol. 4578, pp. 537-543, 2007.

A. Bertoni, G. Valentini, Randomized Embedding Cluster Ensembles for gene expression data analysis, SETIT 2007 - IEEE International Conf. on Sciences of Electronic, Technologies of Information and Telecommunications, Hammamet, Tunisia, 2007.

A. Bertoni, G. Valentini, Ensembles Based on Random Projections to Improve the Accuracy of Clustering Algorithms, Neural Nets, WIRN 2005, Lecture Notes in Computer Science, vol. 3931, pp. 31-37, 2006.