Infra-data analysis: blog #2
Main concepts and definitions

Main concepts and references


To learn the topology of data, we will need some additional concepts, which we build on fundamental definitions from manifold theory and algebraic geometry.
The definition of the Bernstein algebra A of a graph G can be found in the work of Grishkov and Costa. This concept will be especially useful to us because we want to extend it to hypergraphs H, which are more general than graphs.
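
For orientation, the general definition of a Bernstein algebra, as it is usually stated in the literature, is sketched here. It is not quoted from Grishkov and Costa, and the graph-specific construction is left to their paper. The symbols K (the base field) and ω (the weight) are notation introduced for this sketch. A Bernstein algebra is a commutative, not necessarily associative algebra A over K together with a nonzero algebra homomorphism ω: A → K such that:

```latex
\[
  \omega(xy) = \omega(x)\,\omega(y),
  \qquad
  (x^2)^2 = \omega(x)^2\, x^2
  \quad \text{for all } x, y \in A .
\]
```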

The important aspects of Bernstein algebras are as follows: 1. they are not required to be associative; 2. they can be associated with graph (and hypergraph) structures, which gives a new encoding link between algebraic and geometric structures.
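
The first point can be checked by hand on a small case. The sketch below uses a two-dimensional example with basis e, u and products e*e = e, e*u = u*e = u/2, u*u = 0. This is an illustrative algebra chosen for this post, not the graph construction of Grishkov and Costa, and the helper names (mul, weight, is_bernstein_at) are hypothetical. It checks the Bernstein identity (x^2)^2 = w(x)^2 x^2, where w is the weight, and shows that the product is commutative but not associative.

```python
import math

# An element a*e + b*u is stored as the pair (a, b).
# Assumed multiplication table: e*e = e, e*u = u*e = u/2, u*u = 0.

def mul(x, y):
    """Product of two elements in the example algebra."""
    a, b = x
    c, d = y
    return (a * c, (a * d + b * c) / 2)

def weight(x):
    """Weight homomorphism: the coefficient of e."""
    return x[0]

def is_bernstein_at(x):
    """Check (x^2)^2 == weight(x)^2 * x^2 for one element x."""
    sq = mul(x, x)
    lhs = mul(sq, sq)
    rhs = (weight(x) ** 2 * sq[0], weight(x) ** 2 * sq[1])
    return all(math.isclose(p, q, abs_tol=1e-12) for p, q in zip(lhs, rhs))

e, u = (1.0, 0.0), (0.0, 1.0)

print(is_bernstein_at((2.0, 3.0)))   # Bernstein identity at x = 2e + 3u
print(mul(e, u) == mul(u, e))        # commutative
print(mul(mul(e, e), u))             # (e*e)*u
print(mul(e, mul(e, u)))             # e*(e*u), differs from the line above
```

The last two lines give different elements, so associativity fails even in this tiny example, while the Bernstein identity still holds.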


We are specifically interested in how the results of [Grishkov, Costa] on isomorphic graphs having isomorphic Bernstein algebras can be generalized to the hypergraph case.

Methods for computational analysis of data

Many methods for embedding data are currently available, from UMAP to t-SNE. They differ considerably, and some have a large number of hyperparameters. In the analysis we propose here, we aim to find a suitable metric for each dataset, based purely on properties of the data itself. To our knowledge, little research has been done on this so far.

However, the effect of the metric has been explored for UMAP:

The pure red channel correctly sees the data as living on a one dimensional manifold, the hue metric interprets the data as living in a circle, and the HSL metric fattens out the circle according to the saturation and lightness. This provides a reasonable demonstration of the power and flexibility of UMAP in understanding the underlying topology of data, and finding a suitable low dimensional representation of that topology.

More can be found at https://umap-learn.readthedocs.io/en/latest/parameters.html
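
To make the quoted passage concrete, here is a minimal sketch of two of the metrics it mentions, written with the Python standard library only. The function names and the assumption that colors are RGB triples with channels in [0, 1] are ours, not taken from the UMAP documentation. The red-channel distance looks at a single coordinate, which is why the data appears one-dimensional under it. The hue distance wraps around, which is why the data appears to live on a circle.

```python
import colorsys

def red_channel_dist(a, b):
    """Distance that only compares the red channel of two RGB colors."""
    return abs(a[0] - b[0])

def hue_dist(a, b):
    """Circular distance between the hues of two RGB colors.

    colorsys.rgb_to_hls returns the hue in [0, 1), so the largest
    possible distance is 0.5 (opposite sides of the color wheel).
    """
    hue_a = colorsys.rgb_to_hls(*a[:3])[0]
    hue_b = colorsys.rgb_to_hls(*b[:3])[0]
    diff = abs(hue_a - hue_b)
    return min(diff, 1.0 - diff)

red = (1.0, 0.0, 0.0)
magenta = (1.0, 0.0, 1.0)

print(red_channel_dist(red, magenta))  # same red value
print(hue_dist(red, magenta))          # close on the wheel despite hues near 0 and 5/6
```

To use functions like these as the metric of a UMAP embedding, they would have to be adapted to the form umap-learn expects for custom metrics, which is described on the page linked above.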