Projects:

  • The LiMass Project: Linear Algorithms for Massive Real-World Graphs

  • The CODDDE project aims to better understand the evolution of real-world complex networks. It studies three topics: community structure evolution, the spreading of information, and the detection of unexpected changes in network structure.

Software I've developed:

My GitHub profile is here.

C code:

  • Parallel C code for enumerating all k-cliques in a graph. The program scales to real-world networks containing several billion edges. For example, Friendster has exactly 487,090,833,092,739 10-cliques, which is just under 0.5 quadrillion. It also contains C code for computing the k-clique core decomposition of a graph, as well as a k-approximation of the k-clique densest subgraph; this code also scales to real-world networks containing several billion edges.

  • C implementation of algorithms to find the Density-Friendly graph decomposition.

  • C code for computing a Heaviest k-Subgraph (that is, a subgraph containing k nodes such that the sum of its edge weights is maximized) in a weighted graph. The program scales to real-world networks containing several billion edges, for k up to 10, 20 or more depending on the structure of the graph. An approximate version of the problem can be solved for larger k. A Stack Exchange question on the topic.

  • C code to compute all overlapping communities in a network, following a local-to-global approach. The program scales to networks containing several million nodes and several hundred million edges.
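
To make the k-clique counting task mentioned above concrete, here is a minimal serial sketch in C. It is a naive recursive enumeration on an adjacency matrix, meant only to illustrate what is being counted; it is not the algorithm used in the linked program and will not scale beyond tiny graphs. The names `MAX_N`, `extend`, `count_k_cliques` and `count_in_complete_graph` are hypothetical and chosen for this illustration.

```c
#include <assert.h>

#define MAX_N 16  /* hypothetical cap: this sketch only handles tiny graphs */

/* Count the ways to complete the current partial clique with k more vertices.
 * cand[0..ncand-1] lists, in increasing order, the vertices adjacent to every
 * vertex chosen so far. Only later candidates are kept at each step, so each
 * clique is counted exactly once. */
static unsigned long long extend(int adj[MAX_N][MAX_N],
                                 const int *cand, int ncand, int k)
{
    if (k == 0) return 1ULL;      /* the partial clique is complete */
    if (ncand < k) return 0ULL;   /* not enough candidates left */

    unsigned long long total = 0;
    for (int i = 0; i < ncand; i++) {
        int next[MAX_N];
        int nnext = 0;
        /* Keep only later candidates that are adjacent to cand[i]. */
        for (int j = i + 1; j < ncand; j++)
            if (adj[cand[i]][cand[j]]) next[nnext++] = cand[j];
        total += extend(adj, next, nnext, k - 1);
    }
    return total;
}

/* Count the k-cliques of an undirected graph on n vertices. */
unsigned long long count_k_cliques(int adj[MAX_N][MAX_N], int n, int k)
{
    assert(n >= 0 && n <= MAX_N && k >= 1);
    int all[MAX_N];
    for (int v = 0; v < n; v++) all[v] = v;
    return extend(adj, all, n, k);
}

/* Helper for experimentation: build the complete graph K_n and count its
 * k-cliques, which should equal the binomial coefficient C(n, k). */
unsigned long long count_in_complete_graph(int n, int k)
{
    assert(n >= 0 && n <= MAX_N);
    int adj[MAX_N][MAX_N] = {{0}};
    for (int u = 0; u < n; u++)
        for (int v = 0; v < n; v++)
            adj[u][v] = (u != v);
    return count_k_cliques(adj, n, k);
}
```

On the complete graph with 5 vertices, `count_in_complete_graph(5, 3)` counts every triple of vertices, so it returns 10.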

Twitter apps:

  • Application for measuring influence on Twitter taking into account social capitalism: DDPapp.

  • Application for mention recommendation, to help a tweet propagate further: EasyMention.

Software I find useful:

  • The Louvain Method implemented in C++ by Jean-Loup Guillaume and Etienne Lefebvre, as detailed on arXiv.

  • NetworkX, a Python package for the creation, manipulation, and study of the structure, dynamics, and functions of complex networks.

  • sklearn, a Python library for Machine Learning.

  • Commercial LP, QP and SDP solvers (free for academics): MOSEK and GUROBI.

  • CVXPY, a Python-embedded modeling language for convex optimization problems. It can be used with commercial solvers or free solvers like CVXOPT.

  • I use OpenMP to make my C programs parallel by only adding a few lines of code.
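
As an illustration of the "few lines of code" remark about OpenMP above, here is a small sketch that counts triangles in a graph stored as an adjacency matrix. The single `#pragma` line is the only OpenMP-specific addition; without `-fopenmp` the compiler ignores it and the loop runs serially with the same result. The names `N_MAX`, `count_triangles` and `triangles_in_complete_graph` are hypothetical, and the adjacency-matrix layout is an assumption made to keep the example short.

```c
#include <assert.h>

#define N_MAX 16  /* hypothetical cap: this sketch only handles tiny graphs */

/* Count triangles u < v < w in an undirected graph on n vertices. */
unsigned long long count_triangles(int adj[N_MAX][N_MAX], int n)
{
    assert(n >= 0 && n <= N_MAX);
    unsigned long long count = 0;

    /* Split the outer loop across threads. Each thread accumulates a private
     * copy of count, and the copies are summed when the loop ends.
     * schedule(dynamic) helps because small u values carry more work. */
    #pragma omp parallel for reduction(+:count) schedule(dynamic)
    for (int u = 0; u < n; u++)
        for (int v = u + 1; v < n; v++)
            if (adj[u][v])
                for (int w = v + 1; w < n; w++)
                    if (adj[u][w] && adj[v][w])
                        count++;

    return count;
}

/* Helper for experimentation: count the triangles of the complete graph K_n. */
unsigned long long triangles_in_complete_graph(int n)
{
    assert(n >= 0 && n <= N_MAX);
    int adj[N_MAX][N_MAX] = {{0}};
    for (int u = 0; u < n; u++)
        for (int v = 0; v < n; v++)
            adj[u][v] = (u != v);
    return count_triangles(adj, n);
}
```

With GCC or Clang, compiling with `-fopenmp` enables the parallel loop. The reduction clause is what keeps the shared counter free of data races.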

Datasets:

  • Two collections of large real-world graphs: networkrepository and snap.

  • Very large real-world graphs, mostly web graphs: WebGraph.

  • Machine Learning datasets repository: UCI.