Projects:
The LiMass Project: Linear Algorithms for Massive Real-World Graphs
The CODDDE project aims at better understanding the evolution of real-world complex networks. Three topics are studied within this project: community structure evolution, spreading of information and detection of unexpected changes in their structure.
Software I've developed:
My GitHub profile is here.
C code:
Parallel C code for enumerating all k-cliques in a graph. The program scales to real-world networks containing several billion edges. For instance, Friendster has exactly 487,090,833,092,739 10-cliques, that is, a bit less than 0.5 quadrillion. Also contains C code for computing the k-clique core decomposition of a graph as well as a k-approximation of the k-clique densest subgraph.
C implementation of algorithms to find the Density-Friendly graph decomposition.
C code for computing a Heaviest k-Subgraph (that is, a subgraph containing k nodes that maximizes the sum of the weights of its edges) in a weighted graph. The program scales to real-world networks containing several billion edges, for k up to 10, 20 or more depending on the structure of the graph. An approximate version of the problem can be solved for larger k. A stackexchange question on the topic.
C code to compute all overlapping communities in a network following a local-to-global approach. The program scales to networks containing several million nodes and several hundred million edges.
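As a small illustration of what enumerating k-cliques involves, here is a minimal sketch in C. It is not the code of the program listed above: it is a toy version that assumes a dense adjacency matrix, orders nodes by their id, and runs sequentially. The names `Graph`, `MAX_N`, `graph_init`, `graph_add_edge` and `count_k_cliques` are made up for this example.

```c
#include <assert.h>
#include <string.h>

#define MAX_N 64 /* illustrative size limit for this toy sketch */

/* Undirected graph stored as an oriented adjacency matrix:
   adj[u][v] == 1 only when u < v and {u, v} is an edge. */
typedef struct {
    int n;
    unsigned char adj[MAX_N][MAX_N];
} Graph;

void graph_init(Graph *g, int n) {
    assert(n >= 0 && n <= MAX_N);
    g->n = n;
    memset(g->adj, 0, sizeof g->adj);
}

void graph_add_edge(Graph *g, int u, int v) {
    assert(u >= 0 && u < g->n && v >= 0 && v < g->n);
    if (u == v) return;                       /* ignore self-loops */
    if (u > v) { int t = u; u = v; v = t; }   /* orient from low id to high id */
    g->adj[u][v] = 1;
}

/* cand holds, in increasing id order, the nodes adjacent to every node
   already chosen. Returns the number of ways to pick `remaining` more
   nodes from cand that are pairwise adjacent. */
static unsigned long long extend(const Graph *g, const int *cand,
                                 int ncand, int remaining) {
    if (remaining == 0) return 1;
    if (remaining == 1) return (unsigned long long)ncand;
    unsigned long long total = 0;
    int next[MAX_N];
    for (int i = 0; i < ncand; i++) {
        int u = cand[i];
        int nnext = 0;
        /* keep only later candidates that are also neighbors of u */
        for (int j = i + 1; j < ncand; j++)
            if (g->adj[u][cand[j]]) next[nnext++] = cand[j];
        total += extend(g, next, nnext, remaining - 1);
    }
    return total;
}

/* Counts each k-clique exactly once, via its nodes in increasing id order. */
unsigned long long count_k_cliques(const Graph *g, int k) {
    if (k <= 0) return 0;
    int all[MAX_N];
    for (int i = 0; i < g->n; i++) all[i] = i;
    return extend(g, all, g->n, k);
}
```

Because every edge is oriented from the lower id to the higher id, each clique is reached through a single ordered sequence of nodes, so nothing is counted twice. Calling `count_k_cliques(&g, 3)` on a graph built with `graph_add_edge` returns its number of triangles.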
Twitter apps:
Application for measuring influence on Twitter taking into account social capitalism: DDPapp.
Application for mention recommendation in order to better propagate a tweet: EasyMention.
Software I find useful:
The Louvain Method implemented in C++ by Jean-Loup Guillaume and Etienne Lefebvre as detailed in arXiv.
NetworkX, a Python language software package for the creation, manipulation, and study of the structure, dynamics, and functions of complex networks.
sklearn, a Python library for Machine Learning.
Commercial LP, QP and SDP solvers (free for academics): MOSEK and GUROBI.
CVXPY, a Python-embedded modeling language for convex optimization problems. It can be used with commercial solvers or free solvers like CVXOPT.
I use OpenMP to make my C programs parallel by only adding a few lines of code.
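To give an idea of what "a few lines" means for OpenMP, here is a minimal sketch that counts triangles in a small graph. The function `count_triangles` and the row-major adjacency-matrix layout are assumptions made for this example, not code taken from the programs above. The single `#pragma` line is the only OpenMP-specific addition.

```c
#include <assert.h>
#include <stddef.h>

/* Counts triangles in an undirected graph given as a row-major n*n
   matrix of 0/1 entries (adj[u*n + v] != 0 when {u, v} is an edge). */
unsigned long long count_triangles(const unsigned char *adj, int n) {
    unsigned long long total = 0;
    assert(adj != NULL && n >= 0);

    /* Each thread handles a share of the outer loop and keeps a private
       counter; the reduction clause sums the counters at the end. */
    #pragma omp parallel for reduction(+:total) schedule(dynamic)
    for (int u = 0; u < n; u++) {
        for (int v = u + 1; v < n; v++) {
            if (!adj[(size_t)u * n + v]) continue;
            for (int w = v + 1; w < n; w++)
                if (adj[(size_t)u * n + w] && adj[(size_t)v * n + w])
                    total++;
        }
    }
    return total;
}
```

With GCC or Clang, compiling with `-fopenmp` enables the parallel loop. Without that flag the pragma is ignored and the same code runs sequentially with the same result. `schedule(dynamic)` is chosen here because the work per node can be very uneven.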
Datasets:
Two collections of large real-world graphs: networkrepository and snap.
Very large real-world graphs, mostly web graphs: WebGraph.
Machine Learning datasets repository: UCI.