Codes

From this page, you can have a look at my research and download some code (written in C++). Each folder contains a ReadMe file with instructions for compiling the code and using the program.

Any feedback is very welcome.

Topic Modeling

All the code we used in our project on networks and text documents.

A not so small collection of clustering methods (+ Cvis + Consensus Clustering)

This is a selected collection of clustering algorithms for networks, along with the consensus clustering method which we proposed in this paper. The program can also find a consensus among partitions computed from different networks.

The package contains the Modularity Optimization Method (Simulated Annealing), Infomap (also hierarchical), OSLOM, the Louvain Method, and the Label Propagation Method. They all support the same file format (edge list). Moreover, the package provides a relatively simple algorithm for visualization (cvis), whose output is ready to be opened in Gephi. The ReadMe will tell you more. Also, I would be glad to include new methods if you would like to suggest any.
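
As an illustration of the edge-list input, here is a minimal sketch of a reader. It assumes one link per line in the form `source target [weight]` with integer node ids, and it skips blank lines and lines starting with `#`. These details, and the names `Edge` and `read_edge_list`, are assumptions of this example rather than the package's actual parser, so check the ReadMe for the exact format.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// One link between two integer node ids, with an optional weight.
// (Hypothetical type, introduced only for this sketch.)
struct Edge {
    int source;
    int target;
    double weight;
};

// Parse an edge list: one link per line, "source target [weight]".
// Blank lines and lines starting with '#' are skipped; this is an assumption
// of the sketch, not a documented feature of the package.
std::vector<Edge> read_edge_list(std::istream& in) {
    std::vector<Edge> edges;
    std::string line;
    while (std::getline(in, line)) {
        if (line.empty() || line[0] == '#') continue;
        std::istringstream fields(line);
        Edge e{0, 0, 1.0};
        if (!(fields >> e.source >> e.target)) continue;  // skip malformed lines
        if (!(fields >> e.weight)) e.weight = 1.0;        // no weight given
        edges.push_back(e);
    }
    return edges;
}
```

To read from a file, pass a `std::ifstream` opened on the edge-list file to `read_edge_list`.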

Credits: If you use any of these algorithms, please make sure to cite the proper paper (I am not the author of most of them!). References are here. All the code was written by me, apart from the Infomap variants, which are by Martin Rosvall. All the code is free to use for noncommercial purposes.

OSLOM

OSLOM stands for Order Statistics Local Optimization Method, and it is our new community detection algorithm. The latest version (2.3) was updated on February 3rd, 2011.

We present the method in this paper.

Cluster Visualization (source code)

This program reads a network and a set of hierarchical, overlapping modules and produces a GDF file that can be processed by Gephi. Here you can have a look at some examples.
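
To give an idea of what a GDF file looks like, here is a minimal sketch that writes a node table followed by an edge table. It is deliberately simplified: each node gets a single module id, whereas the program above handles hierarchical, overlapping modules. The function name `to_gdf` and the `module` column are hypothetical choices for this example and do not describe the program's real output.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Build a minimal GDF document: a "nodedef>" table, then an "edgedef>" table.
// module_of[i] is the module id of node i. The "module" column name is an
// assumption of this sketch, not the column layout used by the real program.
std::string to_gdf(const std::vector<int>& module_of,
                   const std::vector<std::pair<int, int>>& edges) {
    const int n = static_cast<int>(module_of.size());
    std::ostringstream out;
    out << "nodedef>name VARCHAR,module INTEGER\n";
    for (int i = 0; i < n; ++i) {
        out << i << ',' << module_of[i] << '\n';
    }
    out << "edgedef>node1 VARCHAR,node2 VARCHAR\n";
    for (const auto& e : edges) {
        // Every edge endpoint must be one of the declared nodes.
        assert(e.first >= 0 && e.first < n && e.second >= 0 && e.second < n);
        out << e.first << ',' << e.second << '\n';
    }
    return out.str();
}
```

Writing the returned string to a file with a `.gdf` extension should give something Gephi can import, with the module id available as a node attribute for coloring.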

Mutual information for groupings (including source code)

This program computes the mutual information of two covers. Given a set of elements, a partition is a collection of non-overlapping subsets whose union is the whole set. A cover, instead, is just a collection of subsets, which are allowed to overlap. The program computes a measure that quantifies how similar two covers are. The measure is described in Appendix B of this paper.
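
The difference between a partition and a cover can be made concrete with a small check. The sketch below assumes elements are labeled `0` to `n-1` and that each group is a list of distinct element ids. The names `Cover` and `is_partition` are introduced only for this example and are not part of the program.

```cpp
#include <cassert>
#include <vector>

// A cover is a collection of groups, each group a list of element ids.
using Cover = std::vector<std::vector<int>>;

// Returns true if `groups` is a partition of {0, ..., n-1}, i.e. every
// element belongs to exactly one group. A collection that fails this test
// (overlapping groups, or elements left out) is a cover but not a partition.
bool is_partition(const Cover& groups, int n) {
    assert(n >= 0);
    std::vector<int> membership_count(n, 0);
    for (const auto& group : groups) {
        for (int element : group) {
            if (element < 0 || element >= n) return false;  // unknown element
            ++membership_count[element];
        }
    }
    for (int count : membership_count) {
        if (count != 1) return false;  // missing or shared element
    }
    return true;
}
```

For example, `{{0, 1}, {2, 3}}` is a partition of four elements, while `{{0, 1}, {1, 2, 3}}` is only a cover because element 1 belongs to two groups.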

Benchmarks (source codes)

These programs can be used to generate benchmark graphs. They are all described here. We used them to compare the performance of quite a number of algorithms (go here if you would like to have a look). I also added the code for hierarchical benchmark graphs, which produces binary networks with a three-level hierarchical structure: nodes are organized in micro-modules, which are in turn organized in macro-modules. Finally, you can find the code for multiplexes here, which also includes the code for computing the Normalized Mutual Information.
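
For readers unfamiliar with the Normalized Mutual Information, here is a minimal sketch of its standard form for two partitions, given as one group label per node and normalized as 2 I(A;B) / (H(A) + H(B)). This is the textbook definition for non-overlapping groups, not the cover-based variant described above, and it is not taken from the packages on this page. The function name is hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// Normalized mutual information between two partitions of the same nodes.
// a[i] and b[i] are the group labels of node i in the two partitions.
// Normalization: 2 * I(A;B) / (H(A) + H(B)), which lies between 0 and 1.
double normalized_mutual_information(const std::vector<int>& a,
                                     const std::vector<int>& b) {
    assert(a.size() == b.size() && !a.empty());
    const double n = static_cast<double>(a.size());

    // Count group sizes and the sizes of all pairwise overlaps.
    std::map<int, double> count_a, count_b;
    std::map<std::pair<int, int>, double> count_ab;
    for (std::size_t i = 0; i < a.size(); ++i) {
        count_a[a[i]] += 1.0;
        count_b[b[i]] += 1.0;
        count_ab[{a[i], b[i]}] += 1.0;
    }

    double h_a = 0.0, h_b = 0.0, mutual = 0.0;
    for (const auto& kv : count_a) h_a -= (kv.second / n) * std::log(kv.second / n);
    for (const auto& kv : count_b) h_b -= (kv.second / n) * std::log(kv.second / n);
    for (const auto& kv : count_ab) {
        const double p = kv.second / n;
        const double p_a = count_a[kv.first.first] / n;
        const double p_b = count_b[kv.first.second] / n;
        mutual += p * std::log(p / (p_a * p_b));
    }

    // Both partitions consist of a single group: treat them as identical.
    if (h_a + h_b == 0.0) return 1.0;
    return 2.0 * mutual / (h_a + h_b);
}
```

In a benchmark test, `a` would typically hold the planted groups and `b` the groups found by an algorithm. The score is 1 when the two partitions coincide up to a relabeling of the groups and 0 when they are statistically independent.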

c and b-score (source code)

This program computes what we call the c-score and the b-score, two measures that quantify the statistical significance of a cluster in a network, compared to the configuration model. Here you can find the paper where we describe how they are defined and some of their properties.

OSLOM is based on this score, but the test is slightly different, since we focused on a simpler statistical indicator which is also much faster to compute. If you would like to try the new test, you can download this program, which is basically an updated version of the b-score. The results are close but not equivalent because, although the null model is the same, the test statistic is not.