Ensemble Methods for clustering and co-clustering (EMClust)
Workshop of the International Conference on Data Mining (ICDM'13)
December 7, 2013
Cluster analysis is an important tool in a variety of scientific areas, including pattern recognition, document clustering and information retrieval. Many clustering procedures, such as hierarchical clustering and k-means, aim to construct an optimal partition of objects or, sometimes, of variables. Other methods, known as co-clustering methods or latent block models, consider the two sets simultaneously and organize the data into homogeneous blocks. Compared with classical clustering algorithms, co-clustering algorithms have been shown to be more effective in discovering hidden clustering structures in the data matrix.
In recent years, co-clustering has become an important challenge in market-basket analysis, text mining, microarray analysis and recommender systems. For instance, in document clustering it makes use of the clear duality between rows (documents) and columns (words). In the analysis of microarray data, where data are often presented as matrices of expression levels of genes under different conditions, co-clustering of genes and conditions overcomes a problem encountered in conventional clustering methods, namely the choice of a similarity measure on each of the two sets. Exploiting the underlying two-sided data structure thus helps the simultaneous clustering, leading to meaningful gene and experimental-condition clusters.
In the last ten years, several works have shown that the ensemble approach can be useful for unsupervised learning. Clustering ensembles are now recognized as an effective way to improve the robustness and stability of cluster analysis. These methods are based on two crucial steps: (1) the construction of an accurate and diverse ensemble of clustering solutions; (2) the combination of all of these clustering solutions into a single one using a consensus function. A wide variety of procedures, based on diverse theories, have been proposed for these two steps. This workshop intends to provide a forum for researchers in Machine Learning and Data Mining to discuss the above and other related topics regarding ensemble clustering and its applications.
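As an illustration of the two steps above, here is a minimal sketch in plain Python. It is only one possible instance, not a method prescribed by the workshop: the ensemble is built from repeated k-means runs with different random initializations, and the consensus function is a thresholded co-association matrix whose connected components give the final clusters. All function names, the threshold of 0.5 and the number of runs are assumptions chosen for the example.

```python
import random


def kmeans(points, k, rng, iters=20):
    """Plain Lloyd's k-means on tuples of numbers; returns one label per point."""
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(pt, centers[c])))
            for pt in points
        ]
        # Update step: move each non-empty center to the mean of its members.
        for c in range(k):
            members = [pt for pt, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


def co_association(partitions, n):
    """Fraction of partitions in which each pair of objects shares a cluster."""
    co = [[0.0] * n for _ in range(n)]
    for labels in partitions:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    co[i][j] += 1.0 / len(partitions)
    return co


def consensus(co, threshold=0.5):
    """Link pairs whose co-association exceeds the threshold (an assumed value);
    the connected components of that graph are the consensus clusters."""
    n = len(co)
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and co[i][j] > threshold:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


def ensemble_cluster(points, k=2, runs=15, seed=0):
    """Step 1: build the ensemble. Step 2: combine it with the consensus function."""
    rng = random.Random(seed)
    partitions = [kmeans(points, k, rng) for _ in range(runs)]
    return consensus(co_association(partitions, len(points)))


# Two well-separated groups of four points each.
pts = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
print(ensemble_cluster(pts))
```

In this sketch diversity comes only from the random initialization; resampling the data or varying the number of clusters between runs are other common ways to diversify the ensemble.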
Topics of interest include, but are not limited to:
· Consensus functions
· Resampling procedures
· Collaborative filtering
· Boolean matrix factorization
· Nonnegative matrix factorization
· Latent block models
· Model selection
· Sub-space clustering
· Constraint co-clustering
· Neighbor-based approaches
· Low rank decomposition
· Microarray data analysis
· Frequent pattern mining
· Multi-view co-clustering
· Large and high-dimensional data
August 14, 2013 (extended deadline): Due date for full workshop papers
September 24, 2013: Notification of paper acceptance to authors
December 7, 2013: Day of workshop
Paper submissions should be limited to a maximum of 8 pages and follow the IEEE ICDM format requirements.
ICDM website submission site: https://wi-lab.com/cyberchair/2013/icdm13/scripts/ws_submit.php
Fred (Technical University of Lisbon)
Blaise Hanczar, Associate professor (Paris Descartes University, France).
Mohamed Nadif, Full Professor (Paris Descartes University, France).
Nicoleta Rogovschi, Associate professor (Paris Descartes University, France).
Blaise Hanczar : firstname.lastname@example.org