Singular Learning Theory (1)

Statistical models and learning machines that have hidden variables or hierarchical structures have singularities in their parameter spaces. To study such models and machines, we first go to algebraic geometry, which seems far removed from statistical learning theory.

Second, we make the difficult journey back from algebraic geometry to statistical learning theory. Along the way, a new field called singular learning theory is developed. The path passes through the resolution theorem, zeta functions, Schwartz distributions, state density functions, partition functions, and empirical process theory, and finally reaches the free energy and the generalization loss.


Why algebraic geometry?

In deep learning, a statistical submodel corresponds to an algebraic variety in the parameter space. The statistical properties of such models can be captured by two birational invariants. This is why algebraic geometry is necessary to understand the learning process in deep learning.

What is the problem?

In classical regular models, the KL divergence can be approximated by a quadratic form of the parameter; in modern singular models it cannot, because of singularities.

Although singularities make the generalization error quite small, it has been difficult to analyze their mathematical properties.
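As a small worked illustration, compare the two cases. The toy regression model in the second formula is an assumption chosen for this example and is not taken from this page. Here q is the true distribution, p(x|w) is the model, w_0 is a true parameter, and I(w_0) is the Fisher information matrix.

```latex
% Regular case: a positive definite Fisher information gives a quadratic approximation
K(w) = \int q(x) \log \frac{q(x)}{p(x \mid w)} \, dx
     \approx \frac{1}{2} (w - w_0)^{\top} I(w_0) \, (w - w_0),
\qquad I(w_0) \succ 0 .

% Singular toy case (assumed): y = a b x + noise, noise ~ N(0, 1), true function y = noise
K(a, b) = \frac{1}{2} \, a^{2} b^{2} \, \mathbb{E}[x^{2}],
\qquad \{ K = 0 \} = \{ a = 0 \} \cup \{ b = 0 \} .
```

In the toy case the set of true parameters is a pair of crossing lines, the Hessian of K vanishes at the origin, and the Fisher information matrix is degenerate everywhere on that set, so no positive definite quadratic form approximates K there.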

Basic Answer by Algebraic Geometry

The problem that singularities cause in machine learning theory is solved by a basic theorem of algebraic geometry. For the statement and a concrete example, please see Hironaka resolution theorem. Any singularity can be made normal crossing on an appropriate manifold by a birational transform. This theorem, proved in 1964, is the most basic and important one in algebraic geometry.

The learning process is captured by two birational invariants, the real log canonical threshold and the singular fluctuation, both of which can be defined explicitly by using the resolution theorem. These two mathematical concepts play the central role in machine learning theory.
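A minimal worked example of what "made normal crossing" means, assuming the illustrative function K(x, y) = x^4 + y^4 (chosen here for simplicity, not taken from this page). In a local coordinate u of the resolved manifold, with w = g(u), both the function and the Jacobian determinant become monomials up to strictly positive analytic factors, written here as c(u) and b(u):

```latex
% Local normal crossing form after resolution (c and b are positive analytic factors)
K(g(u)) = c(u) \, u_1^{2 k_1} \cdots u_d^{2 k_d},
\qquad
|g'(u)| = b(u) \, \bigl| u_1^{h_1} \cdots u_d^{h_d} \bigr| .

% Example: one chart of the blow-up of the origin, x = u, y = u v
K(u, u v) = u^{4} \, (1 + v^{4}),
\qquad
|g'(u, v)| = |u| ,
\qquad k = (2, 0), \quad h = (1, 0) .

% Candidate exponent read off from this chart
\min_{j \,:\, k_j > 0} \frac{h_j + 1}{2 k_j} = \frac{1 + 1}{4} = \frac{1}{2} .
```

The other chart, x = u v, y = v, is symmetric and gives the same value. Taking the minimum over all charts yields the real log canonical threshold, which in this example is 1/2, smaller than the value d/2 = 1 that a regular model with two parameters would have.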
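A rough numerical sketch of how the real log canonical threshold λ shows up, assuming the standard fact that the partition function Z(n) = ∫ exp(-n K(w)) dw over a compact parameter set scales like n^(-λ) (log n)^(m-1) for large n. The two functions K, the grid, and the helper names below are hypothetical choices for illustration; the slope of log Z(n) against log n is used as a crude estimate of λ.

```python
import numpy as np


def log_partition(K, n, grid):
    """Log of Z(n) = integral of exp(-n K(a, b)) over the square, by a Riemann sum."""
    a, b = np.meshgrid(grid, grid, indexing="ij")
    h = grid[1] - grid[0]  # grid spacing in each coordinate
    return np.log(np.sum(np.exp(-n * K(a, b))) * h * h)


def scaling_exponent(K, n_values, grid):
    """Fit log Z(n) = -lambda * log n + const and return the fitted lambda."""
    log_z = [log_partition(K, n, grid) for n in n_values]
    slope, _intercept = np.polyfit(np.log(n_values), log_z, 1)
    return -slope


# Assumed setup: uniform prior on [-1, 1]^2 and a few large sample sizes.
grid = np.linspace(-1.0, 1.0, 1001)
n_values = np.array([1e3, 2e3, 5e3, 1e4])

# Regular toy case: nondegenerate quadratic form, so the exponent should be near d/2 = 1.
regular = scaling_exponent(lambda a, b: a**2 + b**2, n_values, grid)

# Singular toy case: degenerate minimum, so the exponent should be near 1/2.
singular = scaling_exponent(lambda a, b: a**4 + b**4, n_values, grid)

print(f"regular exponent  ~ {regular:.3f}")
print(f"singular exponent ~ {singular:.3f}")
```

Both toy functions have multiplicity m = 1, so a straight-line fit is adequate. For a function such as a^2 b^2, where m = 2, the extra log n factor would bias a plain linear fit, and a more careful estimate would be needed.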



If you are interested in this page, please see:

S. Watanabe, Algebraic geometry and statistical learning theory, Cambridge University Press, 2009. 

Singular Learning Theory 01 (slt202301.pdf)

Singular Learning Theory 02 (slt202302.pdf)

Singular Learning Theory 03 (slt202303.pdf)

Singular Learning Theory 04 (slt202304.pdf)

Singular Learning Theory 05 (slt202305.pdf)

Singular Learning Theory 06 (slt202306.pdf)

This page continues in Singular Learning Theory (2).