# THE LONDON–OXFORD TDA SEMINAR

A research seminar gathering researchers and practitioners in topological data analysis based in and around London and Oxford, jointly organized by Omer Bobrowski (QMUL), Anthea Monod (Imperial), Ximena Fernandez (City) and Vidit Nanda (Oxford).

If you would like to be added to the mailing list, please send an email to the London-Oxford TDA mailbox (londoxtda@gmail.com).

## THE SEMINAR LONDON 11 NOVEMBER 2024

## PROGRAM

9:30 - 10:00 Welcome Coffee

10:00 - 10:30 Qiquan Wang Imperial College London

A Topological Gaussian Mixture Model for Bone Marrow Morphology in Leukaemia

Acute myeloid leukaemia (AML) is a type of blood and bone marrow cancer characterized by the proliferation of abnormal clonal haematopoietic cells in the bone marrow, leading to bone marrow failure. Over the course of the disease, angiogenic factors released by leukaemic cells drastically alter the bone marrow vascular niches, resulting in observable structural abnormalities. We introduce a novel approach that combines topological data analysis (TDA) with statistical modelling to quantitatively analyse AML progression using proprietary high-resolution biomedical imaging data. Our study pioneers the integration of Signed Distance Persistent Homology (SDPH), a unique variant of persistent homology, with a stage-dependent Gaussian mixture model (GMM), which allows us to interpretably quantify and model morphological changes in bone marrow vasculature across different stages of AML, and which can also be used for prediction.

10:30 - 11:00 Parker Duncan Queen Mary University of London

Maximal k-cycles in Random Geometric Complexes

We study maximally persistent k-cycles for the Cech and Vietoris–Rips filtrations built on a Poisson process with respect to sufficiently nice density functions; maximal means that the cycle has the largest death to birth ratio. The main goal is to show that these cycles scale, with high probability, like a constant multiple of $\left[\log(n)/\log(\log(n))\right]^{1/k}$, and, most importantly, this scaling is independent of the density function. In other words, the result is, in some sense, universal. The constant multiple in the limit is related to the covering number of the unit k-sphere.

11:00 - 11:30 Coffee Break

11:30 - 12:00 Francesca Tombari University of Oxford

Directed Gromov–Hausdorff Distance

The Gromov–Hausdorff distance measures the similarity between two metric spaces by isometrically embedding them into an ambient metric space. This work introduces an analogue of this distance for metric spaces endowed with directed structures. It turns out that the directed Gromov–Hausdorff distance measures the Gromov–Hausdorff distance between two spaces whose metrics are induced by the space of zigzag paths in the directed structure. Motivated by an analogy with the classical setting, we propose two alternative definitions based on distortions of directed maps and directed correspondences, which, however, turn out not to be equivalent to the initial one.

12:00 - 12:30 Abhinav Natarajan University of Oxford

Morse Theory for Chromatic Delaunay Triangulations

The chromatic alpha filtration is a generalization of the alpha filtration that can encode spatial relationships among classes of labelled point cloud data, and has applications in topological data analysis of multi-class data. In recent work with Thomas Chaplin, Adam Brown, and Maria-Jose Jimenez, we introduced the chromatic Delaunay–Cech and chromatic Delaunay–Rips filtrations, which are built on the same underlying simplicial complex but have filtration values that are easier to compute. Our main result is an application of generalised discrete Morse theory to show that the Cech, chromatic Delaunay–Cech, and chromatic alpha filtrations are related by simplicial collapses. This result generalizes a result of Bauer and Edelsbrunner from the non-chromatic to the chromatic setting. We also show that the chromatic Delaunay–Rips filtration is locally stable to perturbations of the underlying point cloud. This local stability, in conjunction with the Morse-theoretic result, means that the chromatic Delaunay–Rips filtration is a viable approximation to the chromatic alpha filtration for persistent homology calculations in low-dimensional data, with the advantage of being much faster to compute. In this talk I will give a sketch of the proofs of the main results, and elaborate on how these results provide theoretical justification for the use of chromatic Delaunay–Cech and chromatic Delaunay–Rips filtrations in practical applications. I will also present data from numerical experiments comparing the computational efficiency of the various constructions.

12:30 - 13:30 Lunch

13:30 - 14:30 Discussion Session

14:30 - 15:00 Coffee Break

15:00 - 16:00 Discussion Session

16:00 Further discussions + Pub for those still around!

## THE SEMINAR OXFORD 23 JULY 2024

## PROGRAM

9:30 - 10:00 Welcome Coffee Mezzanine Andrew Wiles Building

ALL TALKS TAKE PLACE IN L3

10:00 - 10:30 Marc Fersztand University of Oxford

Harder-Narasimhan filtrations of persistence modules: discriminating power and metric stability

The Harder-Narasimhan types are a family of discrete isomorphism invariants for representations of finite quivers. We evaluate their discriminating power in the context of persistence modules over a finite poset, including multiparameter persistence modules (over a finite grid). In particular, we introduce the skyscraper invariant and prove, among other results, that it is strictly finer than the rank invariant. In order to study the stability of the skyscraper invariant, we extend its definition from the finite to the infinite setting and consider multiparameter persistence modules over $\mathbb{Z}^n$ and $\mathbb{R}^n$. We then establish an erosion-type stability result for the skyscraper invariant in this setting. This talk is based on the content of the preprints arXiv:2303.16075 (with E. Jacquard, V. Nanda and U. Tillmann) and arXiv:2406.05069.

10:30 - 11:00 Arne Wolf Imperial College

Spectral Stability of Persistent Sheaf Laplacians

It is well known that the kernel of the graph Laplacian captures the topological properties (number of cycles and connected components) of a graph. In a similar fashion, the kernel of a persistent Laplacian captures the information contained in the persistent homology of a given simplicial complex. Our main goal is to understand what we can deduce from the remaining eigenvalues and eigenvectors in the more general cellular sheaf setting, which theoretically incorporates further information about the faces of a simplicial complex. In this talk, I will discuss work in progress towards this aim and present a recently established theoretical foundation for this goal, where we show that the eigenvalues are stable under small perturbations of the sheaf and simplicial complex. The upshot of this result is that we can reasonably assume that the additional information encoded by the other eigenvalues and eigenvectors is a faithful representation of other geometric or topological properties of the underlying simplicial complex, although precisely what this information represents remains to be investigated (current work in progress proceeds with a machine learning approach). Joint work with Shiv Bhatia, Daniel Ruiz Cifuentes and Anthea Monod.
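As a small illustration of the classical fact the abstract starts from, the following sketch computes kernel dimensions of the vertex and edge Laplacians of a graph. It covers only the ordinary (non-persistent, non-sheaf) case and is not the speaker's method; the function names `graph_laplacians` and `kernel_dim` and the example graph are hypothetical, chosen for illustration. The assumption is that the graph is treated as a 1-dimensional simplicial complex with signed incidence matrix B, so that the vertex Laplacian is B Bᵀ and the edge Laplacian is Bᵀ B.

```python
import numpy as np

def graph_laplacians(n_vertices, edges):
    """Return (L0, L1) for a graph given as a list of (u, v) edges.

    L0 = B @ B.T is the usual graph Laplacian (degree minus adjacency).
    L1 = B.T @ B is the edge Laplacian of the graph seen as a
    1-dimensional simplicial complex (no 2-cells).
    """
    # Signed vertex-edge incidence matrix: rows are vertices, columns are edges.
    B = np.zeros((n_vertices, len(edges)))
    for j, (u, v) in enumerate(edges):
        B[u, j] = -1.0
        B[v, j] = 1.0
    return B @ B.T, B.T @ B

def kernel_dim(M, tol=1e-9):
    """Count the (numerically) zero eigenvalues of a symmetric matrix."""
    eigvals = np.linalg.eigvalsh(M)
    return int(np.sum(np.abs(eigvals) < tol))

# Hypothetical example: a triangle on vertices 0, 1, 2 plus a disjoint edge 3-4.
edges = [(0, 1), (1, 2), (0, 2), (3, 4)]
L0, L1 = graph_laplacians(5, edges)

components = kernel_dim(L0)  # dimension of H_0: connected components
cycles = kernel_dim(L1)      # dimension of H_1: independent cycles
print(components, cycles)
```

The nonzero eigenvalues of `L0` and `L1` coincide (both come from the same matrix B), which is one reason the remaining spectrum is a natural object of study once the kernel has been accounted for.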

11:00 - 11:30 Coffee Break Mezzanine

11:30 - 12:00 Ambrose Yim Cardiff University

Classifying Cycles in Point Clouds on Manifolds via the Universal Covering

Given a simplicial complex and a map to some space, we wish to learn the induced homomorphism on the fundamental group and first homology group. For example, for point clouds in a manifold, this task corresponds to detecting whether non-trivial epsilon loops and 1-cycles of its Rips complex correspond to the underlying loops or homology cycles of the manifold.

If the target space admits a universal covering, then the induced homomorphism is encoded by the pullback bundle of the universal covering on the simplicial complex. We show that for (modified) Rips complexes of point clouds in model manifolds such as tori and projective spaces, the pullback bundle can be easily inferred from the metric data via the monodromy of geodesics. Having such a pullback bundle, the induced homomorphism on the fundamental group and first homology group can be inferred by a simple integration of data attached to edges.

12:00 - 12:30 Vadim Lebovici University of Oxford

Local characterization of block-decomposability for multiparameter persistence modules

I will present a recent algebraic characterization of interval-decomposability of n-parameter persistence modules for a subclass of interval summands called *block modules*, obtained with Jan-Paul Lerch and Steve Oudot. This structure theorem takes the form of a *local characterization*, in the sense that checking block-decomposability can be done by examining block-decomposability of restrictions of the module to elementary axis-aligned cubes. Local conditions have been proposed in the 2-parameter setting, notably for the class of block modules, which plays a prominent role in levelset persistence. Our result generalizes these 2-parameter local conditions and, to the best of our knowledge, it is the first interval-decomposition result in the generality of pointwise finite-dimensional modules over finite products of arbitrary totally ordered sets.

Based on https://arxiv.org/abs/2402.16624.

12:30 - 13:30 Lunch

13:30 - 14:00 Iolo Jones Durham University

Diffusion Geometry: theory and intuition

The heat flow on a manifold encodes a huge amount of its Riemannian geometry. By replacing the heat diffusion operator with a general Markov diffusion operator on a measure space, we can define a 'Riemannian geometry' on a vastly broader class of spaces. In this talk I will discuss the intimate relationship between diffusion and geometry, and how it can form a bridge between the methods, theory, and intuition of 'pure' differential geometry and the statistical, theoretical, and computational challenges of data analysis.

14:00 - 15:15 Discussion Session

15:15 - 15:45 Coffee Break

15:45 - 17:00 Discussion Session

17:00 Pub

## THE SEMINAR LONDON 13 MARCH 2024

## PROGRAM

9:00-10:00 – Gathering & Coffee

10:00-10:30 – Mikael Vejdemo-Johansson, CUNY College of Staten Island

Highly symmetric point clouds

Point clouds with high degrees of symmetry have additional structure that may allow us to significantly speed up topological computations. In this talk I will report on work in progress on studying the persistent (co)homology pipeline on point clouds with a known (and large) group acting on them. I will describe the setup, the amount of information about the group action needed for improvements, a (partially) failed conjecture, and a 10x speedup in generating and traversing 60k simplices in the Vietoris-Rips complex of the 4-dimensional hypercube.

10:30-11:00 – Ingrid Membrillo Solis, University of Westminster

Metric geometry of spaces of persistence diagrams

From materials science to medical imaging data, persistence diagrams have proved to be a powerful tool for the analysis of complex datasets. The characterisation of the topological and geometric properties of spaces of persistence diagrams has led to the development of algorithms capable of extracting meaningful information from data. As the theory of persistent homology evolves, giving rise to more general theories, ideas for constructing and characterising spaces of generalised persistence diagrams emerge. In this talk, we will introduce a framework to construct metric spaces that generalise the spaces of persistence diagrams arising in persistent homology. We will present some results regarding the topological and geometric properties of these spaces. We will see that some of these constructions can have applications in generalised theories of persistence, such as multiparameter persistent homology.

11:00-11:30 – Coffee Break

11:30-12:00 – Luis Scoccola, University of Oxford

Structure, Stability, Computation, and Applications of Multiparameter Persistence Barcodes

The one-parameter persistence barcode gives a combinatorial representation of the rank invariant of one-parameter persistence modules, which can be efficiently computed, and which satisfies optimal transport stability results. Barcodes and their properties have been successfully used in both theoretical and applied disciplines, including symplectic geometry, functional analysis, biology, and supervised learning. When seen as a combinatorial representation of an invariant (such as the rank), barcodes can be generalized to multiparameter persistence, and several such generalizations have been considered, often known as signed barcodes. I will give an overview of joint projects on the algebraic structure, the optimal transport stability, the efficient computation, and the usage in supervised learning of signed barcodes.

12:00-12:30 – Haim Dubossarsky, Queen Mary University of London

Spectral analysis reveals new information hidden in Language Models parameters

NLP vector-based models represent the meaning of words as numeric vectors, based on the words’ co-occurrence usage statistics as reflected in natural texts. These vectors, which are actually the learned parameters of the model, are ubiquitous in everyday language technology applications, and are also the object of scientific inquiry in computational linguistics, social sciences, and other data-driven research domains. Despite significant differences in the architecture of the different neural networks employed, practically all models use the same “vectorial machinery”. As a result, word vectors are typically analyzed as separate units, and their potential interactions, higher-level structures and statistical fingerprints they leave when analyzed as a bulk are often overlooked. This unnecessarily limits the potential that lies in these representations for both scientific research and language technology applications.

In this talk I will present a new framework that analyzes the entire vector space of a language, rather than focusing on individual vectors, and that was employed to analyze 70 languages. When the entire semantic space spanned by these vector representations is analyzed using their eigenvalue spectra, new information and language-related features emerge. I will further introduce several similarity-isomorphism measures between two vector spaces, based on the relevant statistics of their individual spectra. I will empirically show that: (a) similarity scores derived from such spectral isomorphism measures are strongly associated with performance observed in different cross-lingual tasks; (b) these spectral-based measures consistently outperform previous standard isomorphism measures which are computed at the word level, while being computationally more tractable and easier to interpret; (c) these novel similarity-isomorphism measures capture complementary information to traditional linguistic distance measures, and the combination of the two types of measures yields even better results. Overall, these findings provide an inroad to a new type of analysis, and demonstrate that richer and unique information lies beyond simple word-level analysis, calling for additional methods to be employed to analyze the parameters of language models (e.g., Topology).

12:30-13:30 – Lunch

13:30-16:00 – Discussion sessions