Introduction
Layman's explanation
Technical explanation
Properties
A single-perceptron neural network can be viewed as equivalent to logistic regression with a sigmoid activation function (see the sketch after this list)
The weights of a perceptron are analogous to the parameters in linear regression
As in linear regression, the weights change during the training phase; they do not change during the production (inference) phase
Neural network with
Low number of perceptrons -> underfitting
High number of perceptrons -> overfitting
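A minimal sketch of the logistic-regression equivalence mentioned above, in Python; the input values, weights, and names are illustrative only:

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    # A single perceptron: weighted sum of inputs plus bias, passed through a sigmoid.
    # This is the same function as the logistic-regression hypothesis h(x) = sigmoid(w.x + b).
    def perceptron(x, w, b):
        return sigmoid(np.dot(w, x) + b)

    x = np.array([0.5, -1.2, 3.0])   # input features (hypothetical values)
    w = np.array([0.1, 0.4, -0.3])   # weights: learned during training, fixed in production
    b = 0.2                          # bias term
    print(perceptron(x, w, b))       # a probability between 0 and 1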
Multi class neural networks
Below is a 2-class NN
Below is an example of a 3-class NN
Below is the general multi-class model, where K is the number of classes.
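A minimal forward-pass sketch for the general K-class case, assuming one hidden layer with sigmoid activations; the layer sizes and random weights are hypothetical:

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    # Forward pass for a K-class network with one hidden layer.
    # The output layer has K units, one per class; the largest activation is the predicted class.
    def forward(x, W1, b1, W2, b2):
        a1 = sigmoid(W1 @ x + b1)    # hidden-layer activations
        a2 = sigmoid(W2 @ a1 + b2)   # K output activations
        return a2

    K = 3                            # number of classes (hypothetical)
    n_in, n_hidden = 4, 5            # layer sizes (hypothetical)
    rng = np.random.default_rng(0)
    W1, b1 = rng.normal(size=(n_hidden, n_in)), np.zeros(n_hidden)
    W2, b2 = rng.normal(size=(K, n_hidden)), np.zeros(K)

    x = rng.normal(size=n_in)
    scores = forward(x, W1, b1, W2, b2)
    print("predicted class:", int(np.argmax(scores)))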
Learning by error back-propagation
The error/loss/cost function is given below. Note that the last triple sum is the regularisation term: it sums the squares of all weights in every layer. Here L is the total number of layers (including the input and output layers).
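Reconstructed from the description above (this is the standard regularised cross-entropy form used in the referenced course, so treat the exact notation as an assumption):

    J(\Theta) = -\frac{1}{m}\sum_{i=1}^{m}\sum_{k=1}^{K}
        \Big[ y_k^{(i)}\log\big(h_\Theta(x^{(i)})_k\big)
            + \big(1 - y_k^{(i)}\big)\log\big(1 - h_\Theta(x^{(i)})_k\big) \Big]
        + \frac{\lambda}{2m}\sum_{l=1}^{L-1}\sum_{i=1}^{s_l}\sum_{j=1}^{s_{l+1}}\big(\Theta_{ji}^{(l)}\big)^2

where m is the number of training examples, K the number of classes, s_l the number of units in layer l, and λ the regularisation strength.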
For back-propagation, the cost function and its partial derivatives are needed. Note that the partial derivative gives the slope of the cost with respect to each weight.
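Back-propagation computes these partial derivatives; gradient descent then uses the slope to update every weight, with α as the learning rate:

    \Theta_{ij}^{(l)} := \Theta_{ij}^{(l)} - \alpha\,\frac{\partial J(\Theta)}{\partial \Theta_{ij}^{(l)}}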
Learning convergence
The cost function is non-convex, so convergence to the global minimum is not guaranteed. For detail about its convergence, refer here.
Validation of correct back propagation
Compute a numerical approximation of the gradient using the two-sided (central) difference
Compare it with the gradient computed by back-propagation and check that the two are approximately equal
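A minimal gradient-check sketch in Python, assuming a cost function J that takes a flattened parameter vector; the quadratic cost below is a hypothetical stand-in so the exact gradient is known:

    import numpy as np

    def numerical_gradient(J, theta, eps=1e-4):
        # Two-sided (central) difference approximation of dJ/dtheta.
        grad = np.zeros_like(theta)
        for i in range(theta.size):
            plus, minus = theta.copy(), theta.copy()
            plus[i] += eps
            minus[i] -= eps
            grad[i] = (J(plus) - J(minus)) / (2 * eps)
        return grad

    J = lambda t: np.sum(t ** 2)      # hypothetical stand-in for the network cost
    theta = np.array([1.0, -2.0, 0.5])
    backprop_grad = 2 * theta         # analytic gradient (stand-in for the back-propagated gradient)

    approx = numerical_gradient(J, theta)
    print(np.allclose(approx, backprop_grad, atol=1e-6))  # should print True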
Weight/parameter initialisation
Zero initialisation
It is not appropriate since it creates symmetry: all weights remain identical even after back-propagation, so all perceptrons compute the same features, which makes them redundant.
Random initialisation
It breaks symmetry. Note that each weight should get its own random value; the same random number should not be reused for every weight.
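A minimal Python sketch contrasting the two initialisations; the layer sizes and the epsilon range are hypothetical:

    import numpy as np

    n_in, n_hidden = 4, 5      # layer sizes (hypothetical)
    epsilon_init = 0.12        # small range around zero (hypothetical)

    # Zero initialisation: every hidden unit starts identical and receives identical
    # gradients, so the units stay identical -> redundant features.
    W_zero = np.zeros((n_hidden, n_in))

    # Random initialisation: each weight gets its own value in [-epsilon_init, epsilon_init],
    # which breaks the symmetry between hidden units.
    rng = np.random.default_rng(42)
    W_rand = rng.uniform(-epsilon_init, epsilon_init, size=(n_hidden, n_in))

    print(W_zero)
    print(W_rand)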
Neural network architecture design points
The more hidden units, the better the accuracy tends to be; however, this adds computation cost (refer here). [Verify]
Application
Autonomous driving
Reference
https://coursera.org/share/6cfef2809f215cf21f240a0f2533ca5c
https://coursera.org/share/3ed8c209f2bb48ad6077af8b00a50173
https://images.app.goo.gl/GUoLbpi5SrmS64ja6
https://images.app.goo.gl/qv5261vUsZKsXXFN9
https://images.app.goo.gl/Zw66SKPVeiaWiPyN8
https://images.app.goo.gl/MBjaeaCMceGANUNn7
https://images.app.goo.gl/RqqyAuwEUGkoHR13A
https://coursera.org/share/4f003dafb50f092f226ce4bae0ebb005
https://coursera.org/share/85e466c279be287f8d4bec63e924be02
https://images.app.goo.gl/X83KkNq4bVuN6K3h6
https://images.app.goo.gl/7LGyyQQWJWihQmym7
https://coursera.org/share/7da89a2292fcf269ca2c1de24bf0d7db
https://sites.google.com/site/jbsakabffoi12449ujkn/home/machine-intelligence/knowing-what-makes-ml-training-converge#TOC-What-about-global-minima-for-non-convex-loss-function-
https://coursera.org/share/3b35e736bb4e983a2527d9d961508b02
https://coursera.org/share/0e12ab3a8d6232738d078a4416e43ad8