A computer program is said to learn from experience E with respect to a task T and some performance measure P if its performance on T, as measured by P, improves with experience E (Tom Mitchell's definition)
Supervised learning
Unsupervised learning
Reinforcement learning
Recommender system
Some questions to strengthen your understanding
y=f(x)
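The supervised-learning idea of learning a mapping y = f(x) from examples can be sketched as follows; this is a minimal illustration with made-up data and a linear hypothesis h(x) = theta0 + theta1*x (function names are my own, not from the course):

```python
# Minimal sketch of supervised learning as y = f(x):
# fit a linear hypothesis h(x) = theta0 + theta1 * x to (x, y) pairs
# using the closed-form least-squares solution.

def fit_linear(xs, ys):
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # slope = covariance(x, y) / variance(x)
    theta1 = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
             sum((x - mean_x) ** 2 for x in xs)
    theta0 = mean_y - theta1 * mean_x
    return theta0, theta1

# Illustrative training data generated from y = 2x + 1
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
theta0, theta1 = fit_linear(xs, ys)   # recovers theta0 = 1, theta1 = 2
```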
Refer: https://www.youtube.com/watch?v=kHwlB_j7Hkc&list=PLLssT5z_DsK-h9vYZkQkYNWcItqhlRJLN&index=4
After each iteration, compute the slope (derivative) of the error curve at the current point
Simultaneously update all parameters
If the variable is to the left of the local minimum, the slope is negative, so the update increases the variable's value, moving it closer to the local minimum
If the variable is to the right of the local minimum, the slope is positive, so the update decreases the variable's value, moving it closer to the local minimum
If the variable is exactly at the local minimum, the slope is zero, so ideally the value does not change
A small learning rate will still converge to a local minimum, but it will take many iterations
A very high learning rate may fail to converge, or may even diverge
There is no need to decrease the learning rate over time, since gradient descent automatically takes smaller steps as the slope shrinks near the minimum
There are many local minima
Depending on the initial point, gradient descent can converge to different local minima
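The notes above can be sketched with gradient descent on a simple one-dimensional curve, f(x) = (x - 3)^2, whose minimum is at x = 3 (the function and step counts below are illustrative):

```python
# Gradient descent on f(x) = (x - 3)^2, minimum at x = 3.
# Left of the minimum the slope is negative, so x increases;
# right of it the slope is positive, so x decreases;
# a too-high learning rate overshoots and diverges.

def gradient_descent(x0, lr, steps):
    x = x0
    for _ in range(steps):
        grad = 2 * (x - 3)   # derivative of (x - 3)^2
        x = x - lr * grad    # move against the slope
    return x

left     = gradient_descent(x0=0.0,  lr=0.1, steps=200)  # starts left, rises toward 3
right    = gradient_descent(x0=10.0, lr=0.1, steps=200)  # starts right, falls toward 3
diverged = gradient_descent(x0=0.0,  lr=1.1, steps=5)    # learning rate too high: moves away
```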
Multivariate linear regression
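A hedged sketch of multivariate linear regression trained with batch gradient descent, using the vectorized hypothesis h(x) = X @ theta with a leading column of ones for the intercept (the data, learning rate, and iteration count below are made up):

```python
# Multivariate linear regression via batch gradient descent, vectorized
# with NumPy. All parameters are updated simultaneously each iteration.
import numpy as np

def fit(X, y, lr=0.1, iters=2000):
    m, n = X.shape
    Xb = np.hstack([np.ones((m, 1)), X])     # add intercept column
    theta = np.zeros(n + 1)
    for _ in range(iters):
        grad = Xb.T @ (Xb @ theta - y) / m   # gradient of the squared-error cost
        theta -= lr * grad                   # simultaneous update of all parameters
    return theta

# Illustrative data generated from y = 1 + 2*x1 + 3*x2
X = np.array([[1.0, 2.0], [2.0, 0.0], [3.0, 1.0], [4.0, 3.0]])
y = 1.0 + 2.0 * X[:, 0] + 3.0 * X[:, 1]
theta = fit(X, y)                            # approaches [1, 2, 3]
```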
Linear regression is not suitable for classification problems
Thresholding a linear-regression fit may happen to predict correctly by luck
But adding a new training example can shift the fitted line and make the predictions wrong
For decision problems the hypothesis value should lie in the range 0 to 1; logistic regression ensures this
With many input variables, the hypothesis function needs many (possibly non-linear) feature terms, or it will not fit the data well
The machine needs to convert raw pixel values into a recognition of the object they represent
Andrew Ng sessions
https://www.gnu.org/software/octave/
https://www.quora.com/Why-does-Andrew-Ng’s-Machine-Learning-course-use-Octave-instead-of-R