Introduction
Layman's explanation
Technical explanation
Properties
A single-perceptron neural network can be viewed as equivalent to logistic regression with a sigmoid activation function (see the sketch after this list)
The weights of a perceptron are analogous to the parameters in linear regression
As in linear regression, the weights change during the training phase; they do not change during the production (inference) phase
Neural network with
Low number of perceptrons -> underfitting
High number of perceptrons -> overfitting
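A minimal sketch of the logistic-regression equivalence mentioned above, in Python; the input values, weights, and names are illustrative only:

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    # A single perceptron: weighted sum of inputs plus bias, passed through a sigmoid.
    # This is the same function as the logistic-regression hypothesis h(x) = sigmoid(w.x + b).
    def perceptron(x, w, b):
        return sigmoid(np.dot(w, x) + b)

    x = np.array([0.5, -1.2, 3.0])   # input features (hypothetical values)
    w = np.array([0.1, 0.4, -0.3])   # weights: learned during training, fixed in production
    b = 0.2                          # bias term
    print(perceptron(x, w, b))       # a probability between 0 and 1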
Multi class neural networks
Below is a 2-class NN
Below is an example of a 3-class NN
Below is the general multi-class model, where K is the number of classes.
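A minimal forward-pass sketch for the general K-class case, assuming one hidden layer with sigmoid activations; the layer sizes and random weights are hypothetical:

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    # Forward pass for a K-class network with one hidden layer.
    # The output layer has K units, one per class; the largest activation is the predicted class.
    def forward(x, W1, b1, W2, b2):
        a1 = sigmoid(W1 @ x + b1)    # hidden-layer activations
        a2 = sigmoid(W2 @ a1 + b2)   # K output activations
        return a2

    K = 3                            # number of classes (hypothetical)
    n_in, n_hidden = 4, 5            # layer sizes (hypothetical)
    rng = np.random.default_rng(0)
    W1, b1 = rng.normal(size=(n_hidden, n_in)), np.zeros(n_hidden)
    W2, b2 = rng.normal(size=(K, n_hidden)), np.zeros(K)

    x = rng.normal(size=n_in)
    scores = forward(x, W1, b1, W2, b2)
    print("predicted class:", int(np.argmax(scores)))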
Learning by error back-propagation
The error/loss/cost function is given below. Note that the last triple sum is the regularisation term: it sums the squares of all weights in every layer. Here L is the total number of layers (including the input and output layers).
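Reconstructed from the description above (this is the standard regularised cross-entropy form used in the referenced course, so treat the exact notation as an assumption):

    J(\Theta) = -\frac{1}{m}\sum_{i=1}^{m}\sum_{k=1}^{K}
        \Big[ y_k^{(i)}\log\big(h_\Theta(x^{(i)})_k\big)
            + \big(1 - y_k^{(i)}\big)\log\big(1 - h_\Theta(x^{(i)})_k\big) \Big]
        + \frac{\lambda}{2m}\sum_{l=1}^{L-1}\sum_{i=1}^{s_l}\sum_{j=1}^{s_{l+1}}\big(\Theta_{ji}^{(l)}\big)^2

where m is the number of training examples, K the number of classes, s_l the number of units in layer l, and λ the regularisation strength.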
For back-propagation, the cost function and its partial derivatives are needed. Note that the partial derivative gives the slope of the cost with respect to each weight.
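Back-propagation computes these partial derivatives; gradient descent then uses the slope to update every weight, with α as the learning rate:

    \Theta_{ij}^{(l)} := \Theta_{ij}^{(l)} - \alpha\,\frac{\partial J(\Theta)}{\partial \Theta_{ij}^{(l)}}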
Learning convergence
The cost function is non-convex, so convergence to the global minimum is not guaranteed. For detail about its convergence, refer here.
Validation of correct back propagation
Compute a numerical approximation of the gradient using the two-sided (central) difference
Compare it with the gradient computed by back-propagation and check that the two are approximately equal
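A minimal gradient-check sketch in Python, assuming a cost function J that takes a flattened parameter vector; the quadratic cost below is a hypothetical stand-in so the exact gradient is known:

    import numpy as np

    def numerical_gradient(J, theta, eps=1e-4):
        # Two-sided (central) difference approximation of dJ/dtheta.
        grad = np.zeros_like(theta)
        for i in range(theta.size):
            plus, minus = theta.copy(), theta.copy()
            plus[i] += eps
            minus[i] -= eps
            grad[i] = (J(plus) - J(minus)) / (2 * eps)
        return grad

    J = lambda t: np.sum(t ** 2)      # hypothetical stand-in for the network cost
    theta = np.array([1.0, -2.0, 0.5])
    backprop_grad = 2 * theta         # analytic gradient (stand-in for the back-propagated gradient)

    approx = numerical_gradient(J, theta)
    print(np.allclose(approx, backprop_grad, atol=1e-6))  # should print True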
Weight/parameter initialisation
Zero initialisation
It is not appropriate since it creates symmetry: all weights remain identical even after back-propagation, so all perceptrons compute the same features, which makes them redundant.
Random initialisation
It breaks symmetry. Note that each weight should get its own random value; the same random number should not be reused for every weight.
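A minimal Python sketch contrasting the two initialisations; the layer sizes and the epsilon range are hypothetical:

    import numpy as np

    n_in, n_hidden = 4, 5      # layer sizes (hypothetical)
    epsilon_init = 0.12        # small range around zero (hypothetical)

    # Zero initialisation: every hidden unit starts identical and receives identical
    # gradients, so the units stay identical -> redundant features.
    W_zero = np.zeros((n_hidden, n_in))

    # Random initialisation: each weight gets its own value in [-epsilon_init, epsilon_init],
    # which breaks the symmetry between hidden units.
    rng = np.random.default_rng(42)
    W_rand = rng.uniform(-epsilon_init, epsilon_init, size=(n_hidden, n_in))

    print(W_zero)
    print(W_rand)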
Neural network architecture design points
The more hidden units, the better the accuracy tends to be; however, this adds computation cost (refer here). [Verify]
Application
Autonomous driving
Reference
https://coursera.org/share/6cfef2809f215cf21f240a0f2533ca5c
https://coursera.org/share/3ed8c209f2bb48ad6077af8b00a50173
https://images.app.goo.gl/GUoLbpi5SrmS64ja6
https://images.app.goo.gl/qv5261vUsZKsXXFN9
https://images.app.goo.gl/Zw66SKPVeiaWiPyN8
https://images.app.goo.gl/MBjaeaCMceGANUNn7
https://images.app.goo.gl/RqqyAuwEUGkoHR13A
https://coursera.org/share/4f003dafb50f092f226ce4bae0ebb005
https://coursera.org/share/85e466c279be287f8d4bec63e924be02
https://images.app.goo.gl/X83KkNq4bVuN6K3h6
https://images.app.goo.gl/7LGyyQQWJWihQmym7
https://coursera.org/share/7da89a2292fcf269ca2c1de24bf0d7db
https://sites.google.com/site/jbsakabffoi12449ujkn/home/machine-intelligence/knowing-what-makes-ml-training-converge#TOC-What-about-global-minima-for-non-convex-loss-function-
https://coursera.org/share/3b35e736bb4e983a2527d9d961508b02
https://coursera.org/share/0e12ab3a8d6232738d078a4416e43ad8