Here we clarify the road returning from algebraic geometry to statistical learning theory.
(1) By the resolution theorem, Gelfand's zeta function is proved to be a meromorphic function whose largest pole is minus the real log canonical threshold.
(2) The inverse Mellin transform of the zeta function is the state density function.
(3) The Laplace transform of the state density function is the partition function.
(4) Finally, the probabilistic behavior of the marginal likelihood is elucidated by empirical process theory.
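Step (1) can be checked concretely on a toy example. The following sketch (my own illustration, not part of the original argument) takes a Kullback–Leibler function that is already in normal crossing form, K(w) = w1^2 w2^4 with the uniform prior on [0,1]^2, computes the zeta function symbolically, and reads off the real log canonical threshold from its largest pole:

```python
import sympy as sp

# Parameters and exponent variable; positivity lets SymPy evaluate the integral.
z = sp.symbols('z', positive=True)
w1, w2 = sp.symbols('w1 w2', positive=True)

# Zeta function zeta(z) = int_[0,1]^2 K(w)^z dw for K(w) = w1^2 * w2^4,
# written as w1^(2z) * w2^(4z).
zeta = sp.integrate(w1**(2*z) * w2**(4*z), (w1, 0, 1), (w2, 0, 1))
# Analytically, zeta(z) = 1 / ((2z + 1) * (4z + 1)).

# Find the poles using an unconstrained symbol (the positivity assumption on z
# would otherwise make solve() discard the negative roots).
s = sp.symbols('s')
poles = sp.solve(sp.denom(sp.together(zeta.subs(z, s))), s)

# The largest pole is z = -1/4, so the real log canonical threshold is 1/4.
lam = -max(poles)
```

For this K, the largest pole -1/4 beats the regular-model value -d/2 = -1 in absolute size, which is exactly the singular effect the theory describes.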
This is the travel back from algebraic geometry to singular learning theory: Hironaka's resolution theorem gives the general form of the zeta function, its inverse Mellin transform gives the state density function, the Laplace transform of the state density function gives the partition function, and lastly the probabilistic behavior of the marginal likelihood is derived from empirical process theory.
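In symbols, this chain can be sketched as follows (a schematic summary under the standard setting, keeping only leading orders; here K(w) is the Kullback–Leibler divergence, phi(w) the prior, lambda the real log canonical threshold, and m the order of the largest pole):

```latex
\begin{align*}
\zeta(z) &= \int K(w)^{z}\,\varphi(w)\,dw,
  && \text{meromorphic; largest pole } z=-\lambda \text{ of order } m,\\
v(t) &= \frac{1}{2\pi i}\int \zeta(z)\,t^{-z-1}\,dz
  \;\approx\; c\,t^{\lambda-1}(-\log t)^{m-1},
  && \text{(inverse Mellin transform: state density)},\\
Z(n) &= \int_{0}^{\infty} e^{-nt}\,v(t)\,dt
  \;\approx\; \frac{c\,\Gamma(\lambda)\,(\log n)^{m-1}}{n^{\lambda}},
  && \text{(Laplace transform: partition function)},\\
F(n) &= -\log Z(n) \;=\; \lambda\log n - (m-1)\log\log n + O(1).
\end{align*}
```

This deterministic chain gives the leading terms of the free energy; the stochastic fluctuations are supplied by the empirical process step below.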
The log likelihood function can be understood as an empirical process defined on an algebraic variety, which converges in distribution to a Gaussian process.
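A minimal numerical illustration of this convergence (a toy construction of my own, not the one used in the theory): for f(x, w) = x w with X ~ N(0, 1), the empirical process eta_n(w) = n^{-1/2} sum_i (f(X_i, w) - E[f(X, w)]) converges in distribution to a Gaussian process whose covariance kernel is E[f(X, w) f(X, w')] = w w'. We can verify the kernel by simulation:

```python
import numpy as np

rng = np.random.default_rng(0)
n, reps = 500, 4000        # sample size per data set, number of replications
w1, w2 = 1.0, 0.5          # two points of the parameter space

samples = np.empty((reps, 2))
for r in range(reps):
    x = rng.standard_normal(n)          # one data set X_1, ..., X_n
    eta = x.sum() / np.sqrt(n)          # (1/sqrt(n)) * sum_i X_i (E[f] = 0 here)
    samples[r] = (eta * w1, eta * w2)   # eta_n(w) = w * eta for f(x, w) = x w

# Empirical covariance across replications; the limit kernel w * w' predicts
# [[1.0, 0.5], [0.5, 0.25]] for (w1, w2) = (1.0, 0.5).
cov = np.cov(samples.T)
```

In singular learning theory the relevant process lives on the resolved parameter space and its covariance comes from the log likelihood ratio, but the mechanism of convergence is the same central-limit phenomenon.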
The main result is described based on the following definitions.
(1) Let {Xi} be a set of independent random variables.
(2) A model and a prior are denoted by p(x|w) and phi(w), respectively.
(3) The average and empirical minus log likelihoods, L(w) and Ln(w), are defined by these equations.
These are important observables in statistical learning theory: Fn is the free energy, which is equal to the minus log marginal likelihood, and Gn and Cn are the generalization loss and the leave-one-out cross validation, respectively.
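The equations referred to above are the standard definitions of singular learning theory; a sketch, writing q(x) for the true distribution, X^n = (X_1, ..., X_n), and p(x|X^n) for the Bayesian predictive density:

```latex
\begin{align*}
L(w) &= -\,\mathbb{E}_{X}\bigl[\log p(X|w)\bigr], &
L_n(w) &= -\frac{1}{n}\sum_{i=1}^{n}\log p(X_i|w),\\
F_n &= -\log \int \prod_{i=1}^{n} p(X_i|w)\,\varphi(w)\,dw, &
G_n &= -\,\mathbb{E}_{X}\bigl[\log p(X|X^n)\bigr],\\
C_n &= -\frac{1}{n}\sum_{i=1}^{n}\log p\bigl(X_i \,\big|\, X^n\setminus X_i\bigr).
\end{align*}
```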
At last, we arrive at the general formula that clarifies the universal behavior of the free energy, the generalization loss, and the leave-one-out cross validation. We find that the real log canonical threshold determines these most important statistical quantities. These results provide a mathematical foundation for machine learning, artificial intelligence, and data science.
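As a sketch of these universal behaviors under the standard setting (lambda the real log canonical threshold, m its multiplicity, w_0 an optimal parameter):

```latex
\begin{align*}
F_n &= nL_n(w_0) + \lambda\log n - (m-1)\log\log n + O_p(1),\\
\mathbb{E}[G_n] &= L(w_0) + \frac{\lambda}{n} + o\!\Bigl(\frac{1}{n}\Bigr),\qquad
\mathbb{E}[C_n] = L(w_0) + \frac{\lambda}{n} + o\!\Bigl(\frac{1}{n}\Bigr),
\end{align*}
```

so the single birational invariant lambda controls the marginal likelihood, the generalization loss, and the cross validation at once, regardless of whether the model is regular or singular.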