For phylogenetic networks, unlike for trees, we still lack a well-developed toolkit of shape statistics for summarizing, comparing, and statistically analyzing their structure. Yet such statistics are essential both for gaining intuition about networks and their geometry and for developing rigorous tests of evolutionary models.
In the first part of this talk, I will introduce a new family of balance indices for phylogenetic networks, called the $H_\alpha$ indices. This family contains the $B_2$ index as a special case, and provides a natural extension of the Sackin index to networks, while preserving its intuitive interpretation as a balance index. I will discuss the combinatorial properties of these indices and present results on their distributions under various models of random trees and networks. Additional results on these indices will be presented in a companion talk by H. Maffioli.
The second part of the talk will focus on a different statistic, inspired by the so-called “theory of the adjacent possible”. After introducing a general class of “exchangeable” random phylogenetic networks (which includes, among others, the Yule model, uniform ranked tree-child networks and other birth–hybridization processes), I will introduce a statistic that can be used to rigorously test whether an observed network is compatible with this class of models.
This presentation focuses on some properties of a recently introduced family of balance indices: the $H_\alpha$ indices. We present results on the distribution of the $H_\alpha$ indices of Galton-Watson trees, including closed formulas for the expectation and variance. We then state asymptotic results for the limiting distribution under the uniform (PDA) model, obtained as a consequence of a theorem characterizing the limiting distribution and moments of (the indices of) “blowups” of Galton-Watson trees.
The Sackin index is an important measure of the balance of phylogenetic trees and has been extended to level-k networks. Recently, Fuchs and Gittenberger computed explicit constants for the mean of the Sackin index of level-1 networks (also known as galled trees). In level-2 networks, a biconnected component may contain up to two reticulation nodes. We investigate the Sackin index for leaf-labeled level-2 networks and determine the mean, including the explicit numerical value of the multiplicative factor, which was not known before. The method is based on generating functions and analytic combinatorics. Joint work with Bernhard Gittenberger.
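For readers unfamiliar with the Sackin index, the following minimal sketch computes it in the classical tree case only, as the sum of the depths of all leaves. The nested-tuple encoding and the function name `sackin_index` are assumptions made for illustration; the extension to level-k networks discussed in the talk is not reproduced here.

```python
def sackin_index(tree, depth=0):
    """Sum of leaf depths of a rooted tree.

    Assumed encoding (illustrative): a leaf is a string label,
    an internal vertex is a tuple of its child subtrees.
    """
    if isinstance(tree, str):
        # A leaf contributes its own depth (the root has depth 0).
        return depth
    # An internal vertex contributes the sum over its children.
    return sum(sackin_index(child, depth + 1) for child in tree)


# Two binary trees on four leaves: a caterpillar and a fully balanced tree.
caterpillar = ((("a", "b"), "c"), "d")
balanced = (("a", "b"), ("c", "d"))

print(sackin_index(caterpillar))  # leaf depths 3 + 3 + 2 + 1
print(sackin_index(balanced))     # leaf depths 2 + 2 + 2 + 2
```

The caterpillar yields the larger value, which matches the usual reading of the Sackin index: higher values indicate a less balanced tree.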
10:20-10:50 | ☕ Coffee break
The total cophenetic index is a measure of tree imbalance, introduced by Mir et al. in 2013. In their paper, they calculate the minimal value of the total cophenetic index on $n$-leaf binary trees. We show that the total cophenetic index is related to the notion of $p$-adic valuation from number theory, and generalize their result to find the minimal value of this index on trees that are at most $p$-branching, for any prime number $p$.
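As a reminder of the definition, the total cophenetic index is the sum, over all pairs of distinct leaves, of the depth of their lowest common ancestor. The sketch below computes it for small trees; the nested-tuple encoding and the name `total_cophenetic_index` are assumptions for illustration, and the link to $p$-adic valuations announced in the talk is not reproduced.

```python
from math import comb


def total_cophenetic_index(tree):
    """Sum over all leaf pairs of the depth of their lowest common ancestor.

    Assumed encoding (illustrative): a leaf is a string label,
    an internal vertex is a tuple of its child subtrees.
    """

    def visit(node, depth):
        # Returns (number of leaves below node, index contribution below node).
        if isinstance(node, str):
            return 1, 0
        leaves, total, same_child_pairs = 0, 0, 0
        for child in node:
            k, t = visit(child, depth + 1)
            leaves += k
            total += t
            same_child_pairs += comb(k, 2)
        # Pairs split across different children have this vertex as their LCA.
        total += depth * (comb(leaves, 2) - same_child_pairs)
        return leaves, total

    return visit(tree, 0)[1]


caterpillar = ((("a", "b"), "c"), "d")
balanced = (("a", "b"), ("c", "d"))

print(total_cophenetic_index(caterpillar))
print(total_cophenetic_index(balanced))
```

On four leaves the balanced tree gives the smaller value of the two binary shapes, in line with the index being a measure of imbalance.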
Phylogenetic networks are widely used to model evolutionary histories that include hybridization and other reticulate processes. However, their structural complexity makes counting increasingly difficult as the number of reticulation events grows. In this work, we derive an explicit counting formula for binary phylogenetic networks with four reticulation events.
This talk will explore the combinatorial properties of two important subclasses of phylogenetic networks: galled and reticulation-visible networks.
Although these two network classes have different definitions, their structures can be compressed into simpler component graphs using the component graph method introduced by Louxin Zhang and coauthors. By translating component graphs into generating functions and applying analytic combinatorics, we not only derive exact formulas for a small number of reticulation vertices, but also prove a striking consistency in their asymptotic growth rates.
A ranking of a phylogenetic network is a temporal ordering of the internal vertices of the network, corresponding to a sequence of events in the evolutionary history of the lineages represented by the network. Enumeration of the possible rankings of a network can aid in evaluating the complexity of phylogenetic computations and inference algorithms. We prove an equivalence between rankings for a phylogenetic network N and a certain path-counting problem on a lattice associated with the partial order described by N. A network has “support trees”, which are tree-like subgraphs that include all its leaves; for any support tree T, the lattice associated with N corresponds to a “roadblocked” portion of the lattice associated with T. We show that the enumeration of rankings for N corresponds to the enumeration of non-roadblocked paths on the lattice associated with T. We extend our construction to ranked tree-child networks (RTCNs), addressing the problem of enumerating rankings for a given RTCN topology. Our construction introduces a novel algebraic structure into mathematical phylogenetics and provides a conceptual framework for analysis of networks through their displayed trees.
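To make the notion of a ranking concrete, the sketch below counts rankings in the tree case only, where a ranking is an ordering of the internal vertices in which every vertex precedes its descendants. It uses the classical hook-length formula for rooted trees, m! divided by the product over internal vertices v of the number of internal vertices in the subtree rooted at v, with m the total number of internal vertices. The encoding and the name `count_tree_rankings` are assumptions for illustration; this is not the lattice-path construction of the talk, and it does not handle reticulations.

```python
from math import factorial


def count_tree_rankings(tree):
    """Number of orderings of internal vertices compatible with ancestry.

    Assumed encoding (illustrative): a leaf is a string label,
    an internal vertex is a tuple of its child subtrees.
    """
    subtree_sizes = []

    def internal_size(node):
        # Number of internal vertices in the subtree rooted at node.
        if isinstance(node, str):
            return 0
        size = 1 + sum(internal_size(child) for child in node)
        subtree_sizes.append(size)
        return size

    m = internal_size(tree)
    denominator = 1
    for size in subtree_sizes:
        denominator *= size
    # Hook-length formula for linear extensions of a rooted tree order.
    return factorial(m) // denominator


caterpillar = ((("a", "b"), "c"), "d")
balanced = (("a", "b"), ("c", "d"))

print(count_tree_rankings(caterpillar))
print(count_tree_rankings(balanced))
```

The caterpillar admits a single ranking because its internal vertices form a chain, whereas the balanced four-leaf tree admits two, one for each order of its two cherries.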
Departure
* postdoc; ** PhD students.