(Note) The book published in 1986,
J. L. McClelland, D. E. Rumelhart, and G. E. Hinton, Parallel Distributed Processing, MIT Press,
describes hierarchical neural networks and backpropagation as a learning method. The question at the time was, when such a model learns data from the external world, how external information is represented in its internal parameters. In fact, the book describes the types of networks that are formed for various tasks. (Professors Rumelhart and McClelland are cognitive psychologists; Professor Hinton won the 2024 Nobel Prize in Physics for his research in this field.) Soon after the book's publication, experiments demonstrated that this model exhibited high performance on a variety of computer-engineering problems, sparking a second neuro-boom. After several twists and turns, modern artificial intelligence was realized. However, even 40 years later, there has been little progress in research on the original challenge of understanding the internal representations of neural networks. It is my hope that singular learning theory will be a first step toward tackling this difficult challenge.
(Note) An analytic set is the set formed by all the zeros of an analytic function. The set formed by all the zeros of a polynomial is called an algebraic set or algebraic variety.
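As a concrete illustration (the specific sets below are added here as examples and are not from the original note), consider
$$ V_1 = \{ (a,b) \in \mathbb{R}^2 : ab = 0 \}, \qquad V_2 = \{ x \in \mathbb{R} : \sin x = 0 \} = \pi \mathbb{Z}. $$
The set $V_1$ is algebraic (and hence also analytic), since $ab$ is a polynomial; it is the union of the two coordinate axes and is singular at the origin. The set $V_2$ is analytic but not algebraic, because a nonzero polynomial in one variable has only finitely many zeros, while $V_2$ is infinite without being all of $\mathbb{R}$.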
(Note) When calculating eigenvalues numerically, you may get something like 1.0 × 10^(-20), and it is unclear whether it should be regarded as 0. In learning theory, one guideline is whether the value becomes of order 1 when multiplied by the sample size n. If you are considering higher-order asymptotics, try multiplying by n^2.
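The following is a minimal numerical sketch of this rule of thumb (the toy model, the threshold, and all names in the code are illustrative assumptions, not from the text). The second coordinate nearly duplicates the first, so the population covariance has one eigenvalue of order 1 and one of order 1/n; multiplying each computed eigenvalue by n separates the two cleanly.

```python
import numpy as np

# Rule of thumb from the note above: an eigenvalue that becomes O(1)
# when multiplied by the sample size n lives at the 1/n scale and may
# be regarded as effectively zero; an eigenvalue with n * lam >> 1 is
# genuinely nonzero. (The toy model and the threshold are assumptions.)

rng = np.random.default_rng(0)
n = 100_000  # sample size

# The second coordinate duplicates the first up to a perturbation of
# size 1/sqrt(n), so the 2x2 covariance matrix has one eigenvalue of
# order 1 and one of order 1/n.
x0 = rng.normal(size=n)
x1 = x0 + rng.normal(size=n) / np.sqrt(n)
cov = np.cov(np.stack([x0, x1]))  # empirical 2x2 covariance

for lam in np.linalg.eigvalsh(cov):
    scaled = n * lam
    verdict = "effectively 0 (order 1/n)" if abs(scaled) < 10.0 else "nonzero"
    print(f"lam = {lam:.3e}   n*lam = {scaled:.3e}   -> {verdict}")
```

For the higher-order variant mentioned in the note, one would inspect n**2 * lam instead of n * lam.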
(Note) The terms learning theory and learning curve were originally coined in psychology to study the learning process in humans and animals. These terms have since been adapted to apply to artificial neural networks and continue to be used today. The term machine learning also has the same origin.
(Note) No knowledge of physics is required here, but the phase transitions explained here have a mathematically equivalent structure to the phenomena of "water turning into ice" and "iron becoming a magnet." (In thermal equilibrium, the phase is determined by minimizing the free energy. As the temperature, magnetic field, or other conditions change, the phase that minimizes the free energy can change, and this change is a phase transition.) Deep learning is realized on computers and is not a natural phenomenon, but its mathematical mechanism is the same as that of these natural phenomena. It is not yet clear whether biological neural circuits exhibit phenomena similar to the phase transitions explained here. When you come up with something new, what phenomenon occurs in the neural circuits of your brain?
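As a schematic formula (standard statistical mechanics, added here only for illustration; the symbols are not from the original note): if each candidate phase $p$ has energy $E_p$ and entropy $S_p$, its free energy at temperature $T$ is
$$ F_p(T) = E_p - T S_p, $$
and the phase realized in equilibrium is the minimizer $\arg\min_p F_p(T)$. A phase transition occurs at a temperature $T_c$ where two phases exchange the role of minimizer, i.e. $F_{p_1}(T_c) = F_{p_2}(T_c)$. In the learning-theory analogue, the sample size $n$ plays a role similar to the inverse temperature.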
(Note) It is an illusion to think that humans can express their values precisely by some particular function.