Session 1: History (Feb 19th)
Table of contents
- Brief history of the subject
- What is the perceptron algorithm?
- What is known about the perceptron algorithm?
- How does this relate to modern neural networks?
- What is a neural network?
We discussed the universal approximation theorem, which I presented in its L^1 form. Some additional information: for L^p approximation on unbounded domains (with respect to a finite measure), Hornik's result requires the activation function to be bounded and non-constant; for uniform approximation of continuous functions on bounded (compact) domains, any continuous, bounded, non-constant activation function suffices.
For more information about the proof, see Hornik's paper: https://doi.org/10.1016/0893-6080(91)90009-T.
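To make the uniform-approximation statement concrete, here is a minimal sketch in Python. It is not the argument used in the paper: it uses the elementary "staircase" construction, where each hidden unit is a steep logistic sigmoid (a continuous, bounded, non-constant activation) that adds the increment of the target function over one subinterval. The names `staircase_network` and `sup_error`, the target `sin` on [-pi, pi], and the `sharpness` value are all assumptions chosen for illustration.

```python
import math

def sigmoid(z):
    # Logistic function written via tanh so large |z| cannot overflow.
    return 0.5 * (1.0 + math.tanh(0.5 * z))

def staircase_network(f, a, b, n_hidden, sharpness=4.0):
    """One-hidden-layer network c0 + sum_j beta_j * sigmoid(w * (x - m_j)).

    The weights are written down directly (no training): hidden unit j
    switches on near the midpoint m_j of the j-th subinterval of [a, b]
    and contributes the increment of f across that subinterval.
    """
    h = (b - a) / n_hidden
    knots = [a + i * h for i in range(n_hidden + 1)]
    mids = [0.5 * (knots[j] + knots[j + 1]) for j in range(n_hidden)]
    betas = [f(knots[j + 1]) - f(knots[j]) for j in range(n_hidden)]
    w = sharpness / h          # steeper units as the subintervals shrink
    c0 = f(a)                  # output bias

    def network(x):
        return c0 + sum(bj * sigmoid(w * (x - mj)) for bj, mj in zip(betas, mids))

    return network

def sup_error(f, g, a, b, n_grid=1001):
    """Largest |f - g| over an evenly spaced grid on [a, b]."""
    xs = (a + (b - a) * i / (n_grid - 1) for i in range(n_grid))
    return max(abs(f(x) - g(x)) for x in xs)

a, b = -math.pi, math.pi
errors = {}
for n_hidden in (10, 50, 250):
    net = staircase_network(math.sin, a, b, n_hidden)
    errors[n_hidden] = sup_error(math.sin, net, a, b)
    print(n_hidden, errors[n_hidden])
```

The printed grid error should shrink as the number of hidden units grows, which is the behaviour the theorem guarantees is possible; the theorem itself says nothing about how to find good weights by training.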
We also discussed the conjugate gradient method for optimising a neural network. In 2006, Hager and Zhang published CG_DESCENT, a nonlinear conjugate gradient scheme for nonlinear functions together with its own line-search method; see https://doi.org/10.1145/1132973.1132979. The line search is implemented in TensorFlow (in TensorFlow Probability, as tfp.optimizer.linesearch.hager_zhang).
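As a rough illustration of what a nonlinear conjugate gradient iteration looks like, the following sketch uses the Hager-Zhang formula for the direction update but swaps in a plain Armijo backtracking line search, which is simpler than the line search of the paper. The function names (`nonlinear_cg`, `f`, `grad_f`), the Rosenbrock-type test function, the restart safeguards, and the constants (`1e-4`, `0.5`, `eta=0.01`) are assumptions made for this example, not a reproduction of CG_DESCENT.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def nonlinear_cg(f, grad, x0, gtol=1e-6, max_iter=20000, eta=0.01):
    """Nonlinear CG: Hager-Zhang beta, Armijo backtracking (assumed here)."""
    x = list(x0)
    g = grad(x)
    d = [-gi for gi in g]                      # first direction: steepest descent
    for k in range(max_iter):
        if math.sqrt(dot(g, g)) <= gtol:
            return x, k
        # Backtracking: halve the step until the Armijo condition holds.
        fx, slope, alpha = f(x), dot(g, d), 1.0
        for _ in range(60):
            x_new = [xi + alpha * di for xi, di in zip(x, d)]
            if f(x_new) <= fx + 1e-4 * alpha * slope:
                break
            alpha *= 0.5
        else:
            return x, k                        # line search failed; give up
        g_new = grad(x_new)
        y = [a - b for a, b in zip(g_new, g)]  # change in gradient
        dy = dot(d, y)
        if dy > 1e-12 * math.sqrt(dot(d, d) * dot(y, y)):
            # Hager-Zhang: beta = (y - 2 d |y|^2 / (d.y)) . g_new / (d.y)
            yy = dot(y, y)
            beta = dot([yi - 2.0 * di * yy / dy for yi, di in zip(y, d)], g_new) / dy
            # Truncation from below, using the previous direction and gradient.
            lower = -1.0 / (math.sqrt(dot(d, d)) * min(eta, math.sqrt(dot(g, g))))
            beta = max(beta, lower)
        else:
            beta = 0.0                         # no usable curvature: restart
        d = [-gn + beta * di for gn, di in zip(g_new, d)]
        if dot(g_new, d) >= 0.0:               # safeguard: keep a descent direction
            d = [-gn for gn in g_new]
        x, g = x_new, g_new
    return x, max_iter

# Rosenbrock-type test function with its unique minimiser at (1, 1).
def f(p):
    x, y = p
    return (1.0 - x) ** 2 + 10.0 * (y - x * x) ** 2

def grad_f(p):
    x, y = p
    return [-2.0 * (1.0 - x) - 40.0 * x * (y - x * x), 20.0 * (y - x * x)]

x_min, n_iter = nonlinear_cg(f, grad_f, [-1.2, 1.0])
print(x_min, n_iter)
```

For a neural network, `f` would be the training loss as a function of the flattened weight vector and `grad_f` its gradient (for instance from backpropagation); the iteration itself is unchanged.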