Papers

Henri Riihimäki and José Licón-Saláiz, Metrics for Learning in Topological Persistence

Abstract. Persistent homology analysis provides a means to capture the connectivity structure of data sets in various dimensions. On the mathematical level, by defining a metric between the objects that persistence attaches to data sets, we can stabilize invariants characterizing these objects. We outline how so-called contour functions induce relevant metrics for stabilizing the rank invariant. On the practical level, the stable ranks are used as fingerprints for data. Different choices of contour lead to different stable ranks, and topological learning is then the question of finding the optimal contour. We outline our analysis pipeline and show how it can enhance classification of physical activities data. As our main application we study how stable ranks and contours provide robust descriptors of spatial patterns of atmospheric cloud fields.

José Licón-Saláiz and Cedrick Ansorge, Topological descriptors of spatial coherence in a convective boundary layer

Abstract. The interaction between a turbulent convective boundary layer (CBL) and the underlying land surface is an important research problem in the geosciences. In order to model this interaction adequately, it is necessary to develop tools which can describe it quantitatively. Commonly employed methods, such as bulk flow statistics, are known to be insufficient for this task, especially when land surfaces with equal aggregate statistics but different spatial patterns are involved. While geometrical properties of the surface forcing have a strong influence on flow structure, it is precisely those properties that get neglected when computing bulk statistics. Here, we present a set of descriptors based on low-level topological information (i.e. connectivity), and show how these can be used both in the structural analysis of the CBL and in modeling its response to differences in surface forcing. The topological property of connectivity is not only easier to compute than its higher-dimensional homological counterparts, but also has a natural relation to the physical concept of a coherent structure.
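
As a rough illustration of why connectivity is cheap to compute, the following minimal sketch counts the connected components of a thresholded 2D field with a union-find structure. It is not the authors' descriptor set: the function name `count_components`, the toy `field`, the threshold values and the choice of 4-neighbour adjacency are all assumptions made for this example.

```python
def count_components(field, threshold):
    """Count 4-connected components of the cells with value >= threshold.

    `field` is a list of equal-length rows of numbers (a toy stand-in
    for a 2D slice of a flow variable; this layout is an assumption).
    """
    parent = {}

    def find(x):
        # Follow parent pointers to the root, halving the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    for i, row in enumerate(field):
        for j, value in enumerate(row):
            if value >= threshold:
                parent[(i, j)] = (i, j)
                # Merge with the already-visited upper and left neighbours.
                if (i - 1, j) in parent:
                    union((i, j), (i - 1, j))
                if (i, j - 1) in parent:
                    union((i, j), (i, j - 1))

    return len({find(cell) for cell in parent})


# Hypothetical toy field with two separate high-value patches.
field = [
    [0.9, 0.8, 0.1, 0.0],
    [0.7, 0.2, 0.1, 0.6],
    [0.1, 0.1, 0.2, 0.7],
]
n_high = count_components(field, 0.5)   # two disjoint patches
n_low = count_components(field, 0.05)   # patches merge into one
```

Sweeping the threshold and recording how the component count changes gives a simple connectivity profile of the field, which is the kind of low-level topological information the abstract refers to.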

Marco Guerra, Alessandro De Gregorio, Giovanni Petri, and Francesco Vaccarino, Principled Network Skeletonization via Minimal Homology Bases

Abstract. The homological scaffold leverages persistent homology to construct a topologically sound summary of a weighted network. However, its crucial dependency on the choice of representative cycles hinders the ability to trace back global features onto individual network components, unless one provides a principled way to make such a choice. In this paper we apply recent advances in the computation of minimal homology bases to introduce a quasi-canonical version of the scaffold, called minimal, and employ it to analyze data both real and in silico. At the same time we verify that, statistically, the standard scaffold is a good proxy of the minimal one for sufficiently complex networks.

Jeremy Charlier, Francois Petit, Gaston Ormazabal, Radu State and Jean Hilger, Visualization of AE’s Training on Credit Card Transactions with Persistent Homology

Abstract. Auto-encoders are among the most popular neural network architectures for dimension reduction. They are composed of two parts: the encoder, which maps the model distribution to a latent manifold, and the decoder, which maps the latent manifold to a reconstructed distribution. However, auto-encoders are known to produce chaotically scattered data distributions in the latent manifold, resulting in an incomplete reconstructed distribution. Current distance measures fail to detect this problem because they are not able to acknowledge the shape of the data manifolds, i.e. their topological features, and the scale at which the manifolds should be analyzed. We propose Persistent Homology for Wasserstein Auto-Encoders, called PHom-WAE, a new methodology to assess and measure the data distribution of a generative model. PHom-WAE minimizes the Wasserstein distance between the true distribution and the reconstructed distribution and uses persistent homology, the study of the topological features of a space at different spatial resolutions, to compare the nature of the latent manifold and the reconstructed distribution. Our experiments underline the potential of persistent homology for Wasserstein Auto-Encoders in comparison to Variational Auto-Encoders, another type of generative model. The experiments are conducted on a real-world data set particularly challenging for traditional distance measures and auto-encoders. PHom-WAE is the first methodology to propose a topological distance measure, the bottleneck distance, for Wasserstein Auto-Encoders used to compare decoded samples of high quality in the context of credit card transactions.

Naheed Anjum Arafat, Debabrota Basu and Stephane Bressan, ε-net Induced Lazy Witness Complexes on Graphs

Abstract. Computation of persistent homology of simplicial representations such as the Rips and the Čech complexes does not scale efficiently to large point clouds. It is, therefore, meaningful to devise approximate representations and evaluate the trade-off between their efficiency and effectiveness. The lazy witness complex economically defines such a representation using only a few selected points, called landmarks. Topological data analysis traditionally considers a point cloud in a Euclidean space. In many situations, however, data is available in the form of a weighted graph. A graph along with the geodesic distance defines a metric space. This metric space of a graph is amenable to topological data analysis. We discuss the computation of persistent homologies on a weighted graph. We present a lazy witness complex approach leveraging the notion of ε-net that we adapt to weighted graphs and their geodesic distance to select landmarks. We show that the value of the parameter of the ε-net provides control on the trade-off between choice and number of landmarks and the quality of the approximate simplicial representation. We present three algorithms for constructing an ε-net of a graph. We comparatively and empirically evaluate the efficiency and effectiveness of the choice of landmarks that they induce for the topological data analysis of different real-world graphs.
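
To make the idea of an ε-net on a weighted graph concrete, here is a minimal sketch of one generic greedy construction. It is not claimed to be any of the three algorithms from the paper. It assumes an undirected graph stored as a dictionary of neighbour-to-weight dictionaries with non-negative weights, and it treats an ε-net as a landmark set in which every vertex lies within geodesic distance ε of some landmark and any two landmarks are more than ε apart. The names `ball`, `greedy_eps_net` and the toy `graph` are hypothetical.

```python
import heapq


def ball(graph, source, eps):
    """Return {vertex: geodesic distance} for all vertices within eps of source.

    Dijkstra's algorithm, truncated so that paths longer than eps are
    never explored. Assumes non-negative edge weights.
    """
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd <= eps and nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def greedy_eps_net(graph, eps):
    """Pick landmarks greedily: each uncovered vertex becomes a landmark
    and covers every vertex within geodesic distance eps of it."""
    covered = set()
    landmarks = []
    for v in graph:
        if v not in covered:
            landmarks.append(v)
            covered.update(ball(graph, v, eps))
    return landmarks


# Hypothetical toy graph: a path a-b-c-d-e with unit edge weights.
graph = {
    "a": {"b": 1.0},
    "b": {"a": 1.0, "c": 1.0},
    "c": {"b": 1.0, "d": 1.0},
    "d": {"c": 1.0, "e": 1.0},
    "e": {"d": 1.0},
}
coarse = greedy_eps_net(graph, 2.0)  # larger eps, fewer landmarks
fine = greedy_eps_net(graph, 1.0)    # smaller eps, more landmarks
```

On this toy path, increasing ε from 1 to 2 reduces the landmark set from three vertices to two, which mirrors the trade-off between the number of landmarks and the quality of the approximation described in the abstract. The result of a greedy pass depends on the vertex iteration order, so different orders can yield different valid ε-nets.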