Homophily within and across groups
Abbas K. Rizi, Riccardo Michielan, Clara Stegehuis, Mikko Kivelä. ArXiv:2412.07901 [physics.soc-ph], May 2025.
Homophily-the tendency to connect with similar others-plays a central role in shaping network structure and function. It is often treated as a uniform, global parameter, independent from other structural features. Here, we propose a maximum-entropy framework that decomposes homophily into contributions within and across groups, with the stochastic block model emerging as a special case. Our exponential-family formulation, parameterized by group size, fits real-world social networks well and allows homophily to be captured with a single parameter per group size. This decomposition shows that aggregate metrics often obscure group-level variation. We also find that the scale-dependent distribution of homophily has a significant impact on network percolation, influencing epidemic thresholds, the spread of ideas or behaviors, and the effectiveness of intervention strategies. Ignoring this heterogeneity can lead to distorted conclusions about connectivity and dynamics in complex systems.
Optimal network geometry detection for weak geometry
Riccardo Michielan, Clara Stegehuis. ArXiv:2502.08407 [physics.soc-ph], Feb 2025.
Network geometry, characterized by nodes with associated latent variables, is a fundamental feature of real-world networks. Still, when only the network edges are given, it may be difficult to assess whether the network contains an underlying source of geometry. This paper investigates the limits of geometry detection in networks in a wide class of models that contain geometry and power-law degrees, which include the popular hyperbolic random graph model. We specifically focus on the regime in which the geometric signal is weak, characterized by the inverse temperature 1/T < 1. We show that the dependencies between edges can be tackled through Mixed-Integer Linear Problems, which lift the non-linear nature of network analysis into an exponential space in which simple linear optimization techniques can be employed. This approach allows us to investigate which subgraph and degree-based statistic is most effective at detecting the presence of an underlying geometric space. Interestingly, we show that even when the geometric effect is extremely weak, our Mixed-Integer programming identifies a network statistic that efficiently distinguishes geometric and non-geometric networks.
Geometric random intersection graphs with general connection probabilities
Maria Deijfen, Riccardo Michielan. Journal of Applied Probability, Volume 61, Issue 4, December 2024.
Let V and U be the point sets of two independent homogeneous Poisson processes on R^d. A graph G_V with vertex set V is constructed by first connecting pairs of points ( v , u ) with v ∈ V and u ∈ U independently with probability g( v − u ), where g is a non-increasing radial function, and then connecting two points v1, v2 ∈ V if and only if they have a joint neighbor u∈U. This gives rise to a random intersection graph on R^d. Local properties of the graph, including the degree distribution, are investigated and quantified in terms of the intensities of the underlying Poisson processes and the function g. Furthermore, the percolation properties of the graph are characterized and shown to differ depending on whether g has bounded or unbounded support.
Optimal subgraphs in geometric scale-free random graphs
Riccardo Michielan, Clara Stegehuis, Matthias Walter. ArXiv:2404.14972 [math.pr], April 2024.
Geometric scale-free random graphs are popular models for networks that exhibit as heavy-tailed degree distributions, small-worldness and high clustering. In these models, vertices have weights that cause the heavy-tailed degrees and are embedded in a metric space so that close-by groups of vertices tend to cluster. The interplay between the vertex weights and positions heavily affects the local structure of the random graph, in particular the occurrence of subgraph patterns, but the dependencies in these structures and weights make them difficult to analyze. In this paper we investigate subgraph counts using a \textit{divide et impera} strategy: first counting the number of subgraphs in specific classes of vertices; then computing which class yields maximum contribution. Interestingly, the scaling behavior of induced and general subgraphs in such geometric heavy-tailed random graphs is closely related to the solution of a mixed-integer linear program which also shows that subgraphs appear predominantly on vertices with some prescribed degrees and inter-distances. Finally, we derive precise asymptotics for trees and Hamiltonian subgraphs.
Localized geometry detection in scale-free random graphs
Gianmarco Bet, Riccardo Michielan, Clara Stegehuis. ArXiv:2206.01553 [physics.soc-ph], June 2022.
We consider the problem of detecting whether a power-law inhomogeneous random graph contains a geometric community, and we frame this as an hypothesis testing problem. More precisely, we assume that we are given a sample from an unknown distribution on the space of graphs on n vertices. Under the null hypothesis, the sample originates from the inhomogeneous random graph with a heavy-tailed degree sequence. Under the alternative hypothesis, k=o(n) vertices are given spatial locations and connect between each other following the geometric inhomogeneous random graph connection rule. The remaining n−k vertices follow the inhomogeneous random graph connection rule. We propose a simple and efficient test, which is based on counting normalized triangles, to differentiate between the two hypotheses. We prove that our test correctly detects the presence of the community with high probability as n→∞, and identifies large-degree vertices of the community with high probability.
Detecting hyperbolic geometry in networks: why triangles are not enough
Riccardo Michielan, Nelly Litvak, Clara Stegehuis. Physical Review E 106, November 2022.
In the past decade, geometric network models have received vast attention in the literature. These models formalize the natural idea that similar vertices are likely to connect. Because of that, these models are able to adequately capture many common structural properties of real-world networks, such as self-invariance and high clustering. Indeed, many real-world networks can be accurately modeled by positioning vertices of a network graph in hyperbolic spaces. Nevertheless, if one observes only the network connections, the presence of geometry is not always evident. Currently, triangle counts and clustering coefficients are the standard statistics to signal the presence of geometry. In this paper we show that triangle counts or clustering coefficients are insufficient because they fail to detect geometry induced by hyperbolic spaces. We therefore introduce a novel triangle-based statistic, which weighs triangles based on their strength of evidence for geometry. We show analytically, as well as on synthetic and real-world data, that this is a powerful statistic to detect hyperbolic geometry in networks.
Cliques in geometric inhomogeneous random graphs
Riccardo Michielan, Clara Stegehuis. Journal of Complex Networks, Volume 10, Issue 1, February 2022.
Many real-world networks were found to be highly clustered and contain a large amount of small cliques. We here investigate the number of cliques of any size k contained in a geometric inhomogeneous random graph: a scale-free network model containing geometry. The interplay between scale-freeness and geometry ensures that connections are likely to form between either high-degree vertices, or between close by vertices. At the same time, it is rare for a vertex to have a high degree, and most vertices are not close to one another. This trade-off makes cliques more likely to appear between specific vertices. In this article, we formalize this trade-off and prove that there exists a typical type of clique in terms of the degrees and the positions of the vertices that span the clique. Moreover, we show that the asymptotic number of cliques as well as the typical clique type undergoes a phase transition, in which only k and the degree-exponent τ are involved. Interestingly, this phase transition shows that for small values of τ, the underlying geometry of the model is irrelevant: the number of cliques scales the same as in a non-geometric network model.
The influence of geometry on scale-free random graphs
Riccardo Michielan, December 2024.
PhD Thesis. Supervisor: Marc Uetz, Clara Stegehuis
This thesis investigates the influence of geometry on scale-free networks, focusing on Geometric Inhomogeneous Random Graphs (GIRGs) and related models.
Part I examines scaling and concentration of subgraphs in GIRGs. Our results reveal a phase transition in clique counts driven by the interaction between geometry and scale-free properties. For large values of the scale-free parameter $\tau$, we identify a geometric regime, where cliques primarily form between nearby vertices and grow linearly with graph size. When $\tau$ is small, a non-geometric regime emerges, dominated by cliques among high-degree vertices. This geometric/non-geometric distinction holds across various subgraph types, with broader results obtained through a mixed-integer linear programming approach that balances entropy and energy for vertex subsets with specific degrees and distances.
Part II addresses the challenge of identifying latent geometry in scale-free networks. We introduce weighted triangles as a novel network statistic that effectively detects geometric structure by discounting triangles among high-degree nodes. Unlike pure triangle counts, which fail in detecting geometry when $\tau$ is small, weighted triangles grow linearly in GIRGs but remain bounded in non-geometric models, providing a reliable geometric test. Empirical studies demonstrate its effectiveness on real-world networks, where we also apply weighted triangles to a community detection problem, achieving partial recovery of geometric communities and developing an estimator for community size.
Part III defines Spatial Random Intersection Graphs (SRIGs), an AB model with spatially embedded individuals and groups connected by radially decreasing probabilities. We derive connection probabilities, bounds on the degree distribution (showing it is not heavy-tailed), and analyze percolation. Our results reveal a percolation phase transition that depends on whether the connection function's support is bounded or unbounded, highlighting the importance of individual and group intensities in network connectivity.
An algorithmic approach to the random spanning forests
Riccardo Michielan, October 2020.
MSc Thesis. Supervisor: Alessandra Bianchi.
Given a graph it is possible to extract a subgraph without cycles, that is a tree, from it. Iterating this procedure we obtain a forest spanning the graph. How can we sample uniform or, more generally, weighted spanning forests? The answer follows from the works of D. Wilson [14] who provided in the 1990’s an algorithm based on loop-erased random walks. This method turned out to be amazingly easy to perform, but at the same time extremely powerful, since it brings out the links between the study of random spanning forests and the theory of Markov chains. In this thesis, we collect the basic results of this approach to the random spanning forests and we focus on some interesting consequences in terms of network analysis.
You can also find my works on