The main idea of the course is to give an overview of methods used for dealing with high-dimensional data, and of the issues one may encounter when projecting higher-order information from a high-dimensional space to lower dimensions.
Parts of the course:
Part 1. Embeddings theory:
State-of-the-art algorithms for manifold learning in data (based on material from Meraz lecture notes)
Main algorithms used for dimensionality reduction: graph-based algorithms
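As a taste of the graph-based family, here is a minimal Laplacian-eigenmaps-style sketch in NumPy (illustrative only, not one of the course's reference implementations): build a k-nearest-neighbor graph, form its Laplacian, and embed points using the eigenvectors of the smallest nonzero eigenvalues.

```python
import numpy as np

def laplacian_eigenmaps(X, n_neighbors=5, n_components=2):
    """Embed points X of shape (n, d) into n_components dimensions
    via the eigenvectors of the unnormalized graph Laplacian."""
    n = X.shape[0]
    # Pairwise squared Euclidean distances.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    # Symmetric k-nearest-neighbor adjacency matrix.
    W = np.zeros((n, n))
    for i in range(n):
        nbrs = np.argsort(d2[i])[1:n_neighbors + 1]  # skip the point itself
        W[i, nbrs] = 1.0
    W = np.maximum(W, W.T)
    # Unnormalized graph Laplacian L = D - W.
    L = np.diag(W.sum(1)) - W
    # Skip the constant eigenvector (eigenvalue 0); take the next ones.
    vals, vecs = np.linalg.eigh(L)
    return vecs[:, 1:n_components + 1]

# Usage: embed 40 noisy points lying near a circle in 3-D into 2-D.
rng = np.random.default_rng(0)
t = np.linspace(0, 2 * np.pi, 40, endpoint=False)
X = np.c_[np.cos(t), np.sin(t), 0.05 * rng.standard_normal(40)]
Y = laplacian_eigenmaps(X)
print(Y.shape)  # (40, 2)
```

The key design choice shared by this whole family (Isomap, LLE, Laplacian eigenmaps) is that the low-dimensional coordinates come from a spectral decomposition of a neighborhood graph rather than from the raw coordinates.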
Part 2. Hypergraph theory:
From ternary structures to higher-order structures
Hypergraph theory: motif analysis and statistical validation of hypotheses
Hypergraph theory: motif rewriting and computational complexity of processing hypergraphs
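To make the motif-counting idea concrete, a toy sketch (my own illustration, not the course's notation): represent a hypergraph as a list of hyperedges and classify every pair of hyperedges by how they intersect. Real motif analysis then compares such counts against a null model for statistical validation.

```python
from itertools import combinations
from collections import Counter

# A small hypergraph as a list of hyperedges (frozensets of vertices).
edges = [frozenset(e) for e in [{1, 2, 3}, {2, 3, 4}, {4, 5}, {1, 2, 3}, {5, 6, 7}]]

def pairwise_motifs(edges):
    """Classify every pair of hyperedges by overlap type:
    'disjoint', 'equal', 'nested' (one contains the other),
    or 'overlap' (proper partial intersection)."""
    counts = Counter()
    for e, f in combinations(edges, 2):
        inter = e & f
        if not inter:
            counts["disjoint"] += 1
        elif e == f:
            counts["equal"] += 1
        elif inter == e or inter == f:
            counts["nested"] += 1
        else:
            counts["overlap"] += 1
    return counts

motifs = pairwise_motifs(edges)
print(motifs)
```

Even this brute-force pairwise pass is quadratic in the number of hyperedges, which hints at why computational complexity becomes a topic in its own right for larger motifs.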
Part 3. BERT embeddings analysis (related to blog posts here)
Quick introduction to embedding methods from textual information
BERT base is a BERT model consisting of 12 Transformer encoder layers, 12 attention heads, a hidden size of 768, and 110M parameters.
What exactly does a BERT encoder layer do?
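In rough outline (a single-head sketch with random weights, not the actual BERT implementation, which uses 12 heads, GELU, biases, and learned parameters): each encoder layer applies scaled dot-product self-attention followed by a position-wise feed-forward network, each wrapped in a residual connection and layer normalization.

```python
import numpy as np

def layer_norm(x, eps=1e-12):
    # Normalize each token vector to zero mean and unit variance.
    mu = x.mean(-1, keepdims=True)
    var = x.var(-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def softmax(x):
    e = np.exp(x - x.max(-1, keepdims=True))
    return e / e.sum(-1, keepdims=True)

def encoder_layer(x, Wq, Wk, Wv, Wo, W1, W2):
    """One single-head Transformer encoder layer (no biases, ReLU instead of GELU)."""
    d = Wq.shape[1]
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    attn = softmax(q @ k.T / np.sqrt(d)) @ v   # scaled dot-product self-attention
    x = layer_norm(x + attn @ Wo)              # residual connection + layer norm
    ffn = np.maximum(0, x @ W1) @ W2           # position-wise feed-forward network
    return layer_norm(x + ffn)                 # residual connection + layer norm

# Toy run: 4 tokens with hidden size 8 (BERT base uses 768 and stacks 12 such layers).
rng = np.random.default_rng(0)
h = 8
x = rng.standard_normal((4, h))
Ws = [rng.standard_normal(s) * 0.1 for s in [(h, h)] * 4 + [(h, 4 * h), (4 * h, h)]]
out = encoder_layer(x, *Ws)
print(out.shape)  # (4, 8)
```

Note that the output has the same shape as the input, one vector per token, which is what lets BERT stack 12 identical layers and lets us read off contextual embeddings at any depth.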
Part 4.
Designing alternative algorithms for high-dimensional feature spaces (t-SNE and UMAP as examples of non-linear methods)
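For a quick feel of these non-linear methods, a minimal t-SNE usage example (assuming scikit-learn is available; UMAP has an analogous `umap-learn` API):

```python
import numpy as np
from sklearn.manifold import TSNE  # assumes scikit-learn is installed

# Two well-separated Gaussian clusters in 10 dimensions.
rng = np.random.default_rng(0)
X = np.vstack([rng.standard_normal((30, 10)),
               rng.standard_normal((30, 10)) + 8.0])

# Project to 2-D; perplexity must be smaller than the number of samples.
emb = TSNE(n_components=2, perplexity=15, random_state=0).fit_transform(X)
print(emb.shape)  # (60, 2)
```

Unlike the spectral graph methods of Part 1, t-SNE and UMAP optimize a non-convex objective that preserves local neighborhoods, which is exactly where the course's projection pitfalls (cluster sizes, inter-cluster distances) come in.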
Some of the literature:
1. M. Capobianco and J. Molluzzo, Examples and Counterexamples in Graph Theory, North-Holland, Amsterdam (1978).
2. R. Costa and J. H. Guzzo, "A class of exceptional Bernstein algebras associated to graphs," Comm. Alg. 25, No. 7, 2129-2139 (1997).
3. Yu. I. Lyubich, Mathematical Structure in Population Genetics, Biomathematics, 22, Springer, Berlin-Heidelberg-New York (1992).
4. R. Diestel, Graph Theory, Springer.