It is generally not a good idea to train a very large DNN from scratch: instead, you should always try to find an existing neural network that accomplishes a task similar to the one you are trying to tackle. Below is a quote from Andrew Ng:
“Transfer learning will become a key driver of machine learning success in the industry.”
–Andrew Ng, 2016 Conference on Neural Information Processing Systems
Transfer learning (also known as knowledge transfer) is a method wherein a model developed for one task is used as the starting point for another task. By model here, we mean a neural network trained on data, carrying the knowledge gained while solving one problem. For example, the knowledge gained in learning to recognize crocodiles can be used to recognize alligators, because the two share many features.
Lorien Pratt published the first known paper on transfer learning in 1993. Since then, there has been a lot of research in this space.
A pre-trained model is a model that someone else has already trained to solve a problem similar in nature to ours.
Pre-trained models usually do not solve the new problem perfectly, but they serve as a good starting point and save the time and effort of training from scratch.
The flow below shows which weights are kept fixed and which are trained on the target problem domain.
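As a concrete sketch (assuming TensorFlow/Keras; the Xception base network, the 10-class output, and the data names are illustrative assumptions, not from the original text), the pre-trained layers are frozen and only a new output layer is trained:

```python
# A minimal transfer-learning sketch in Keras.
import tensorflow as tf

# Reuse a network pre-trained on ImageNet, without its original output layer.
base = tf.keras.applications.Xception(weights="imagenet",
                                      include_top=False, pooling="avg")
base.trainable = False  # fixed weights: keep the pre-trained feature detectors

model = tf.keras.Sequential([
    base,
    tf.keras.layers.Dense(10, activation="softmax"),  # trained on the target domain
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(X_train, y_train, epochs=5)  # X_train/y_train: your target-domain data
```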
Suppose you want to tackle a complex task for which you don’t have much labeled training data, but unfortunately you cannot find a model trained on a similar task. Don’t lose all hope!
First, you should of course try to gather more labeled training data, but if this is too hard or too expensive, you may still be able to perform unsupervised pre-training.
If you have plenty of unlabeled training data, you can try to train the layers one by one, starting with the lowest layer and then going up, using an unsupervised feature-detector algorithm. Each layer is trained on the output of the previously trained layers (all layers except the one being trained are frozen).
Examples of such algorithms are restricted Boltzmann machines and autoencoders.
Once all layers have been trained this way, you can fine-tune the network using supervised learning (i.e., with backpropagation); see the sketch below for an example flow.
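Here is a minimal sketch of this greedy layer-wise pre-training with autoencoders (the layer sizes, epoch counts, and placeholder data are illustrative assumptions):

```python
# Greedy layer-wise pre-training: each hidden layer is trained as an
# autoencoder on the output of the previously trained, frozen layers,
# then the whole stack is fine-tuned with supervised learning.
import numpy as np
import tensorflow as tf

X_unlabeled = np.random.rand(1000, 784).astype("float32")  # placeholder unlabeled data

trained_layers = []
inputs = X_unlabeled
for units in (256, 64):                          # train layers one by one, lowest first
    hidden = tf.keras.layers.Dense(units, activation="relu")
    ae = tf.keras.Sequential([hidden, tf.keras.layers.Dense(inputs.shape[1])])
    ae.compile(optimizer="adam", loss="mse")
    ae.fit(inputs, inputs, epochs=5, verbose=0)  # reconstruct the layer's own input
    hidden.trainable = False                     # freeze before training the next layer
    inputs = hidden(inputs).numpy()              # feed its codings to the next layer
    trained_layers.append(hidden)

# Fine-tune the whole stack with labels (unfreeze and add an output layer).
for layer in trained_layers:
    layer.trainable = True
classifier = tf.keras.Sequential(
    trained_layers + [tf.keras.layers.Dense(10, activation="softmax")])
classifier.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
# classifier.fit(X_labeled, y_labeled, epochs=10)  # your (smaller) labeled set
```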
Example problem: using a model trained on a particular language to summarize paragraphs.
Example problem: using a trained model for wake-word recognition; for example, Amazon Alexa wakes up when it hears the word “Alexa”.
Example problem: using a trained model for radiology diagnosis.
Two well-known pre-trained NLP models:
Embeddings from Language Models (ELMo)
Bidirectional Encoder Representations from Transformers (BERT)
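As a quick sketch of reusing such a model (assuming the Hugging Face transformers package and PyTorch are installed; the model name and sentence are illustrative):

```python
# Extract contextual embeddings from a pre-trained BERT model.
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

inputs = tokenizer("Crocodiles and alligators share many features.",
                   return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (1, num_tokens, 768)
```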
Refer to the paper linked in the references below for transfer learning with logistic regression.
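The paper's exact method is not reproduced here; a common, related pattern is to use a frozen pre-trained network as a fixed feature extractor and train a logistic regression classifier on top. A sketch under those assumptions, with placeholder data:

```python
# Frozen pre-trained network as feature extractor + logistic regression on top.
import numpy as np
import tensorflow as tf
from sklearn.linear_model import LogisticRegression

extractor = tf.keras.applications.MobileNetV2(weights="imagenet",
                                              include_top=False, pooling="avg")
extractor.trainable = False

X_images = np.random.rand(100, 224, 224, 3).astype("float32")  # placeholder images
y = np.random.randint(0, 2, size=100)                          # placeholder labels

features = extractor.predict(X_images)                # fixed pre-trained features
clf = LogisticRegression(max_iter=1000).fit(features, y)
print(clf.score(features, y))
```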
Transfer learning works well only if the inputs have similar low-level features. For example, if the input pictures for your new task don’t have the same size as the ones used in the original task, you will have to add a preprocessing step to resize them to the size expected by the original model.
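For instance (a minimal sketch; the 224x224 target size is an illustrative assumption about the original model):

```python
# Resize new-task images to the input size the original model expects.
import tensorflow as tf

def preprocess(images):
    images = tf.image.resize(images, [224, 224])  # match the original model's input
    return images / 255.0                         # scale to the range it was trained on

batch = tf.random.uniform([8, 300, 500, 3], maxval=255.0)  # differently sized images
print(preprocess(batch).shape)  # (8, 224, 224, 3)
```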
https://www.amazon.in/Hands-Machine-Learning-Scikit-Learn-TensorFlow/dp/1491962291
https://images.app.goo.gl/SUM97pC8QH7P4rfM9
https://www.aismartz.com/blog/an-introduction-to-transfer-learning/
https://www.topbots.com/transfer-learning-in-nlp/
https://images.app.goo.gl/g5fQ1tQLMz5iSwu86
https://images.app.goo.gl/nK8DBtWVPRyQ1QcX9
https://youtu.be/yofjFQddwHE?t=455
https://images.app.goo.gl/paKS1XqjD4yTvNRC7
https://youtu.be/_GRvjpJVr5A
https://www.researchgate.net/publication/283184357_TRANSFER_LEARNING_BASED_ON_LOGISTIC_REGRESSION