Stability analysis of neural networks with persistent homology

What is a stable neural network, and how can we investigate it?

I assume that a stable neural network (NN) does not produce outliers when it maps a dataset of a single class into the hidden space, because the output changes drastically around outliers, i.e., it is unstable there. Here, the hidden space is the layer immediately before the output layer. We can therefore say that an NN is stable if the dataset of a single class mapped by the NN is distributed as one lump in the hidden space. The right-hand example shows an unstable NN. In this work, I propose a method for investigating the shape of the mapped dataset. The key techniques of the proposed method are persistent homology and a method for estimating confidence sets for persistence diagrams. I define the dataset as being distributed on a topologically simple manifold if its shape is one lump, and the NN is considered stable in that case.

Persistent homology

Persistent homology (PH) plays a major role in topological data analysis (TDA), an approach that analyzes data using concepts from topology. The bottom left figure shows a dataset distributed in a D-dimensional space. The dataset can be regarded as a point cloud in that space. A circle is placed at each point, and the radii of the circles are increased. In the middle left figure, a hole appears as the circles connect. The radius at this moment is referred to as the birth. The radii are increased further, and the hole disappears in the middle right figure. The radius at this moment is referred to as the death. Pairs of birth and death can be obtained from the point cloud, and these pairs can be plotted on a two-dimensional plane, as shown in the bottom right figure. This plot is referred to as the persistence diagram (PD). The proposed method mainly uses PDs. Note that D PDs can be obtained, since homology can be defined in dimensions from zero to D - 1.
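As a minimal sketch of the growing-circles idea, the code below computes only the 0th PD (connected components) of a small point cloud with a union-find structure. It is an illustration and not the implementation used in [1]. The function name zeroth_persistence_pairs, the toy point cloud, and the convention that two components merge at radius = distance / 2 (the moment their circles touch) are assumptions made for this example.

```python
import math
from itertools import combinations


def zeroth_persistence_pairs(points):
    """Return the finite (birth, death) pairs of the 0th persistence diagram.

    Every point starts as its own component at radius 0 (birth). Two
    components merge when their circles touch, i.e. at radius = distance / 2
    (death). The single component that never dies is omitted.
    """
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links up to the representative of the component.
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # All pairwise distances, shortest first.
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(len(points)), 2)
    )

    pairs = []
    for distance, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            # Two separate components meet: one of them dies here.
            parent[root_i] = root_j
            pairs.append((0.0, distance / 2.0))
    return pairs


# Two tight clusters that are far apart (toy data, not from the paper).
cloud = [(0.0, 0.0), (0.2, 0.0), (0.0, 0.2), (5.0, 5.0), (5.2, 5.0)]
diagram = zeroth_persistence_pairs(cloud)
```

For this toy cloud, three pairs die at a small radius (points inside a cluster merging), and one pair dies much later (the two clusters merging). Holes and voids (1st and higher PDs) need a full simplicial complex and are outside the scope of this sketch.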

Better topological signals and topological noise

In the PDs, the birth-death pairs are plotted above the diagonal, i.e., y = x, since the death is larger than the birth. As shown in the top figures, a hole is large if its birth-death pair is plotted far from the diagonal, since its lifetime, i.e., death - birth, is long. Hence, such birth-death pairs are classified as better signals that characterize the topology of the dataset (intuitively, the topology represents the number of holes in the geometry of interest). Birth-death pairs close to the diagonal are classified as topological noise, which does not characterize the dataset well.
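The relation between lifetime and distance from the diagonal can be sketched as follows. The diagram values are hypothetical, and the helper names lifetime and distance_to_diagonal are introduced only for this illustration. The perpendicular distance from a point (birth, death) to the line y = x is the lifetime divided by the square root of 2, so ranking by lifetime and ranking by distance from the diagonal give the same order.

```python
import math


def lifetime(pair):
    """Lifetime of a birth-death pair: death minus birth."""
    birth, death = pair
    return death - birth


def distance_to_diagonal(pair):
    """Perpendicular distance from (birth, death) to the line y = x."""
    return lifetime(pair) / math.sqrt(2.0)


# Hypothetical persistence diagram as (birth, death) pairs.
diagram = [(0.10, 0.15), (0.20, 1.40), (0.05, 0.08), (0.30, 0.90)]

# Longest-lived pairs first: these are the candidates for better signals,
# while the pairs at the end of the list are the candidates for noise.
ranked = sorted(diagram, key=lifetime, reverse=True)
```

Ranking alone does not say where signals end and noise begins; that requires a criterion on the lifetime.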

Relation between topologically simple and one lump

Informally, the 0th, 1st, and 2nd homology groups represent the numbers of connected components, holes, and voids, respectively. Here, I focus on the 0th homology group. We can say that there are two or more connected components if better topological signals are found in the 0th PD. In other words, we can say that the dataset is distributed as one lump if there are no better signals, and this case is defined as topologically simple. Hence, the birth-death pairs have to be correctly classified as better signals or noise. However, there is no obvious criterion for this classification. The proposed method uses the confidence sets estimation method for PDs to set the criterion.
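The one-lump decision can be sketched as below, assuming a 0th PD is already available as a list of finite (birth, death) pairs. The threshold min_lifetime is a hand-picked placeholder: in the proposed method the criterion comes from the confidence sets estimation, which is not reproduced here. The function names and the example diagrams are likewise hypothetical.

```python
def better_signals(diagram, min_lifetime):
    """Pairs whose lifetime (death - birth) exceeds the threshold."""
    return [(b, d) for b, d in diagram if d - b > min_lifetime]


def count_components(zeroth_diagram, min_lifetime):
    """Estimated number of lumps from the finite pairs of a 0th PD.

    One component always survives, and each better signal is read as one
    additional component that stayed separate for a long time.
    """
    return 1 + len(better_signals(zeroth_diagram, min_lifetime))


def is_topologically_simple(zeroth_diagram, min_lifetime):
    """True if the 0th PD contains no better signals, i.e. one lump."""
    return count_components(zeroth_diagram, min_lifetime) == 1


# Hypothetical 0th PDs: only short-lived pairs vs. one long-lived pair.
one_lump = [(0.0, 0.10), (0.0, 0.12), (0.0, 0.09)]
two_lumps = one_lump + [(0.0, 3.50)]

MIN_LIFETIME = 0.5  # placeholder value, not derived from confidence sets

simple = is_topologically_simple(one_lump, MIN_LIFETIME)
not_simple = is_topologically_simple(two_lumps, MIN_LIFETIME)
```

The result depends entirely on the threshold, which is why a principled way of choosing it matters.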

Methods

I referred works by


Publication

The method presented on this page was proposed in [1].

[1] Naoki Akai, Takatsugu Hirayama, and Hiroshi Murase. "Experimental stability analysis of neural networks in classification problems with confidence sets for persistence diagrams." Neural Networks, 2021 (accepted). (ResearchGate)