Neural Network

Here we study supervised learning with a layered neural network. In supervised learning, the conditional probability q(y|x) of an output y given an input x is estimated.

bayes041.mp4

Let the input x be in a 2-dimensional Euclidean space and the output y in {0,1}. The unknown conditional distribution q(y|x) is estimated by a three-layer neural network. In the movie, networks drawn from the posterior distribution by MCMC are displayed.
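
The setting above can be sketched in code. This is only a minimal illustration under stated assumptions, not the method used to produce the movies: the tanh hidden units, the Gaussian prior with standard deviation prior_std, and the random-walk Metropolis sampler are all choices made here for the example, and every function name is hypothetical.

```python
import numpy as np

def unpack(w, hidden):
    """Split a flat parameter vector into the parts of a 2-hidden-1 network."""
    B = w[:2 * hidden].reshape(hidden, 2)   # input-to-hidden weights
    c = w[2 * hidden:3 * hidden]            # hidden biases
    a = w[3 * hidden:4 * hidden]            # hidden-to-output weights
    d = w[4 * hidden]                       # output bias
    return B, c, a, d

def prob_y1(w, X, hidden):
    """Model probability p(y=1 | x, w) for each row x of X (assumed tanh units)."""
    B, c, a, d = unpack(w, hidden)
    h = np.tanh(X @ B.T + c)
    return 1.0 / (1.0 + np.exp(-(h @ a + d)))

def log_posterior(w, X, y, hidden, prior_std=3.0):
    """Unnormalized log posterior: Bernoulli log likelihood plus Gaussian log prior."""
    p = np.clip(prob_y1(w, X, hidden), 1e-12, 1.0 - 1e-12)
    log_lik = np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))
    log_prior = -0.5 * np.sum((w / prior_std) ** 2)
    return log_lik + log_prior

def metropolis(X, y, hidden, n_samples=2000, step=0.1, seed=0):
    """Random-walk Metropolis sampler; returns an (n_samples, dim) array of draws."""
    rng = np.random.default_rng(seed)
    dim = 4 * hidden + 1
    w = rng.normal(scale=0.5, size=dim)
    lp = log_posterior(w, X, y, hidden)
    samples = np.empty((n_samples, dim))
    for t in range(n_samples):
        proposal = w + step * rng.normal(size=dim)
        lp_new = log_posterior(proposal, X, y, hidden)
        # Accept with probability min(1, posterior ratio).
        if np.log(rng.uniform()) < lp_new - lp:
            w, lp = proposal, lp_new
        samples[t] = w
    return samples
```

Each row of the returned array is one network; plotting prob_y1 over a grid of x for successive rows would give a picture comparable in spirit to the movie. In practice the early draws would be discarded as burn-in and the step size tuned.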

bayes042.mp4

The estimated posterior predictive distribution of the layered neural network is displayed for each sample. The generalization error is far smaller than the value estimated by AIC. In a layered neural network, the essential dimension of the learning machine is far smaller than the dimension of the parameter. This is one of the main reasons why deep learning has better predictive performance than classical regular learning machines.
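
The posterior predictive distribution is the average of the model probability over posterior draws. The sketch below shows that averaging and the resulting log loss on held-out data. It assumes the posterior draws are already available as the rows of an array; logistic_prob is a hypothetical stand-in for any function returning p(y=1|x,w).

```python
import numpy as np

def predictive_prob(prob_fn, samples, X):
    """Monte Carlo estimate of p(y=1 | x, data): average over posterior draws."""
    return np.mean([prob_fn(w, X) for w in samples], axis=0)

def log_loss(prob_y1, y):
    """Average negative log predictive probability on held-out pairs (x, y)."""
    p = np.clip(prob_y1, 1e-12, 1.0 - 1e-12)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

def logistic_prob(w, X):
    """Hypothetical stand-in model; any function p(y=1 | x, w) can be used."""
    return 1.0 / (1.0 + np.exp(-(X @ w)))
```

Evaluating log_loss on data not used for sampling gives an empirical estimate of the generalization loss of the predictive distribution.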

A simple model selection procedure is introduced. Five candidate models are compared by their generalization loss, where a smaller loss is better.

The generalization loss can be estimated by WAIC and LOOCV. Note that both estimates are smaller than the value estimated by AIC.

The theoretical performance of layered neural networks (deep learning) has been studied over the past 25 years.

RLCTs (real log canonical thresholds) of Layered Neural Networks.

(1) S. Watanabe. Algebraic analysis for non-regular learning machines. Advances in Neural Information Processing Systems, vol. 12, 1999.

(2) S. Watanabe. Learning efficiency of redundant neural networks in Bayesian estimation. IEEE Transactions on Neural Networks, vol. 12, pp. 1475-1486, 2001.

(3) M. Aoyagi, et al. Resolution of singularities and the generalization error with Bayesian estimation for layered neural network. IEICE Trans., vol. 88, pp. 2112-2124, 2005.

(4) M. Aoyagi, et al. Stochastic complexity of reduced rank regression in Bayesian estimation. Neural Networks, vol. 18, pp. 924-933, 2005.

(5) K. Yamazaki, M. Aoyagi, et al. Asymptotic analysis of Bayesian generalization error with Newton diagram. Neural Networks, vol. 23, pp. 35-43, 2010.

(6) M. Aoyagi. Learning Coefficient of Vandermonde Matrix-Type Singularities in Model Selection. Entropy, vol. 21, 561, 2019.

(7) S. Nagayasu, et al. Bayesian Free Energy of Deep ReLU Neural Network in Overparametrized Cases. arXiv:2303.15739, 2023.

(8) S. Nagayasu, et al. Free Energy of Bayesian Convolutional Neural Network with Skip Connection. ACML 2023, PMLR, pp. 927-942, 2023.

(9) M. Aoyagi. Consideration on the learning efficiency of multiple-layered neural networks with linear units. Neural Networks, vol. 172, 106132, 2024.
