Séance du 23 juin 2014

Séance organisée par Stéphane Gaïffas et Pierre Latouche.

Lieu : IHP, Amphithéâtre Darboux.

14h00 : Nino Shervashidze (projet INRIA SIERRA, INRIA et ENS)

Titre : Representing graphs for machine learning

Résumé : There is a growing need for adapting data analysis and machine learning methods to data sets of graphs, arising in different fields such as chemistry, biology, natural language processing, or image processing. Defining an implicit or explicit feature representation for graphs makes it possible to apply a wide spectrum of machine learning algorithms to such data. These representations have to respect the structure and node/edge labels of graphs and, importantly, they have to be efficient to compute in order to be applicable to large graphs. This talk will give an overview of different graph representation methods, with a focus on the Weisfeiler-Lehman family of kernels for graphs with discrete node labels, particular instances of which scale only linearly in the number of edges of the graphs, and have remained state-of-the-art on graph classification benchmark data sets in terms of accuracy and runtime.

15h00 : Charles Bouveyron (MAP5, Université Paris Descartes)

Titre : The random subgraph model for the analysis of an ecclesiastical network in Merovingian Gaul

Résumé : In the last two decades many random graph models have been proposed to extract knowledge from networks. Most of them look for communities or, more generally, clusters of vertices with homogeneous connection profiles. While the first models focused on networks with binary edges only, extensions now allow to deal with valued networks. Recently, new models were also introduced in order to characterize connection patterns in networks through mixed memberships. This work was motivated by the need of analyzing a historical network where a partition of the vertices is given and where edges are typed. A known partition is seen as a decomposition of a network into subgraphs that we propose to model using a stochastic model with unknown latent clusters. Each subgraph has its own mixing vector and sees its vertices associated to the clusters. The vertices then connect with a probability depending on the subgraphs only, while the types of edges are assumed to be sampled from the latent clusters. A variational Bayes expectation-maximization algorithm is proposed for inference as well as a model selection criterion for the estimation of the cluster number. Experiments are carried out on simulated data to assess the approach. The proposed methodology is then applied to an ecclesiastical network in Merovingian Gaul. An R code, called Rambo, implementing the inference algorithm is available from the authors upon request. This is joint work with L. Jegou, Y. Jernite, S. Lamassé, P. Latouche & P. Rivera.

Références : C. Bouveyron, L. Jegou, Y. Jernite, S. Lamassé, P. Latouche & P. Rivera, The random subgraph model for the analysis of an ecclesiastical network in merovingian Gaul, The Annals of Applied Statistics, vol. 8(1), pp. 377-405, 2014

16h00 : Michalis Vazirgiannis (LIX, Ecole Polytechnique)

Titre : Graph Degeneracy for Social Networks Exploration

Résumé : Graph degeneracy -- an efficient hierarchical decomposition of a graph into subgraphs of monotonous connectivity and coherence properties -- constitutes a powerful tool in the analysis and exploration of real-world networks. In our research, we have explored the potential of graph degeneracy to several frontiers. By extending this concept into networks where the connections bare non-trivia semantics (e.g., a weighted degree of preference or trust relationships among individuals), we create new evaluation metrics for the collaboration capabilities of individuals in social interaction graphs. We also produce a hierarchically nested ordering of the network for management, exploration and clustering analysis through a graph decomposition process based on degeneracy. Furthermore, we have examined the engagement dynamics of social graphs, i.e., the extend that an individual is encouraged to participate in the activities of a community, proposing interesting degeneracy-based measures at both node (local) level as well as at graph (global) level. Finally, we have applied our theoretical models and metrics to large real-world social networks (e.g., snapshots of the Facebook social network, scientific collaboration and citation graphs), producing interesting results.