Regularized Markov Clustering and Variants
This page contains of some of the main variants of Regularized Markov Clustering developed by members of the Data Mining Research Laboratory at the Ohio State University. Markov Clustering (MCL) is an unsupervised clustering algorithm for graphs that relies on the principle of stochastic flows. For details on the original MCL algorithm developed by S-v. Dongen please refer to this page.
Algorithms based on the simulating of stochastic flows are a simple and natural solution for the problem of clustering graphs but as had been noted by several researchers in the data mining community their widespread use has been hampered by their lack of scalability and fragmentation of output. We present regularized variants of the original algorithm that help redress these limitations allowing such algorithms to scale to very large graphs while still retaining the key theoretical properties underpinning the original algorithm. Below we list the key papers, and associated source code (released open source under a BSD license) of the regularized variants of MCL (R-MCL). The work described was supported by various grants (for example from the NSF, DOE and NIH and industrial partners) as noted in the individual papers. A short video presentation on the initial MLR-MCL idea can be found here. A recent broader overview, covering some of the other ideas, can be found here.
- Scalable Graph Clustering Using Stochastic Flows: Applications to Community Discovery, V. Satuluri and S. Parthasarathy, ACM SIGKDD 2009. PDF. (source code here: filename: mlrmcl1.2.tar.gz)
- Markov Clustering of Protein Interaction Networks with Improved Balance and Scalability, V. Satuluri, S. Parthasarathy and D. Ucar, ACM BCB 2010. PDF. (source code here: filename: mlrmcl1.2.tar.gz)
- Identifying Functional Modules in Interaction Networks through Overlapping Markov Clustering, Y-K. Shih and S. Parthasarathy, Bioinformatics (ECCB Supplement), 2012. PDF. (source code here: filename: ECCB2012_SR-MCL.tar.gz)
- A Fast Implementation of the MLR-MCL Algorithm on Multi-core Processors, HiPC, 2014. PDF (source code here: filename: nrmcl.zip)
- Component Detection in Directed Networks, Y-K. Shih et. al. CIKM 2014. PDF (source: awaiting public release, pre-release version available upon request)