Deep learning models: DNNs, CNNs, RNNs

Deep Neural Networks (DNNs)

A deep neural network (DNN) is an artificial neural network with multiple hidden layers of units between the input and output layers. DNNs can model complex non-linear associations. On this site, we will discuss two types of DNN architectures: the multi-layer perceptron neural network (MLPNN) and the stacked autoencoder (SAE).






Image taken from www.opennn.net/

MLPNN

MLPNNs are structured and trained in the same way as the networks seen in the Neural Networks 101 section. The only difference is that this model has many hidden layers, making it "deeper" than a standard artificial neural network. This architecture is useful for separating high-dimensional data by repeatedly applying non-linear activation functions [1].
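
To make this concrete, below is a minimal MLPNN sketch in PyTorch. The 100-dimensional input, the layer widths, and the two-class output are illustrative assumptions, not values from any particular dataset.

    import torch
    import torch.nn as nn

    # An MLPNN: several hidden layers, each followed by a non-linear
    # activation (ReLU here). All sizes are illustrative.
    mlpnn = nn.Sequential(
        nn.Linear(100, 64),   # input layer -> hidden layer 1
        nn.ReLU(),
        nn.Linear(64, 32),    # hidden layer 2
        nn.ReLU(),
        nn.Linear(32, 16),    # hidden layer 3
        nn.ReLU(),
        nn.Linear(16, 2),     # output layer (two classes)
    )

    x = torch.randn(8, 100)   # a batch of 8 made-up input vectors
    logits = mlpnn(x)         # forward pass: shape (8, 2)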


SAE

An autoencoder is a neural network that involves an encoding step and a decoding step. Unlike the previous example, it is trained in an unsupervised manner: the outputs are trained to be equal to the inputs. By reducing the number of features in the hidden layer (encoding), the network must learn to represent the input features in a smaller space before accurately decoding them back to the original feature space.
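
As a sketch, an autoencoder and its unsupervised training loop might look like the following in PyTorch. The 100-to-25 compression, the random data, and the hyperparameters are all illustrative assumptions.

    import torch
    import torch.nn as nn

    # Encoder compresses 100 features to 25; decoder reconstructs them.
    encoder = nn.Sequential(nn.Linear(100, 25), nn.Sigmoid())
    decoder = nn.Sequential(nn.Linear(25, 100), nn.Sigmoid())
    autoencoder = nn.Sequential(encoder, decoder)

    loss_fn = nn.MSELoss()    # reconstruction error: output vs. input
    optimizer = torch.optim.Adam(autoencoder.parameters(), lr=1e-3)

    x = torch.rand(32, 100)   # an unlabeled batch of inputs
    for _ in range(100):      # unsupervised: no labels anywhere
        reconstruction = autoencoder(x)
        loss = loss_fn(reconstruction, x)  # the target IS the input
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()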



An autoencoder is trained in an unsupervised manner by encoding the input features into a smaller feature space, and then decoding them back to the original feature space as closely as possible to the original inputs [2].



Image taken from http://ufldl.stanford.edu/tutorial/unsupervised/Autoencoders/

The SAE model is built from autoencoders. Starting with the input vector, each layer is trained individually as an autoencoder, and the learned hidden (encoding) layer is kept and added to the model [3]. In this way, the SAE is built one layer at a time from the bottom up. Once the model is assembled, it is fine-tuned using supervised learning. By deeply encoding and decoding data, SAEs have been shown to work well at removing noise from the input data and at providing a lower-dimensional representation of the input at the bottleneck layer (shown as layer Z in the image).
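
A rough sketch of this greedy layer-wise procedure in PyTorch follows. The layer widths, the random data, the epoch count, and the two-class output head are illustrative assumptions.

    import torch
    import torch.nn as nn

    def train_autoencoder(data, in_dim, code_dim, epochs=100):
        # Train one autoencoder on `data` and return only its encoder.
        encoder = nn.Sequential(nn.Linear(in_dim, code_dim), nn.Sigmoid())
        decoder = nn.Sequential(nn.Linear(code_dim, in_dim), nn.Sigmoid())
        model = nn.Sequential(encoder, decoder)
        opt = torch.optim.Adam(model.parameters(), lr=1e-3)
        for _ in range(epochs):
            loss = nn.functional.mse_loss(model(data), data)
            opt.zero_grad()
            loss.backward()
            opt.step()
        return encoder

    x = torch.rand(32, 100)                # made-up unlabeled data
    sizes = [100, 50, 25]                  # illustrative layer widths
    encoders, features = [], x
    for in_dim, code_dim in zip(sizes, sizes[1:]):
        enc = train_autoencoder(features, in_dim, code_dim)
        encoders.append(enc)
        features = enc(features).detach()  # feed the codes to the next layer

    # Stack the pretrained encoders, add a supervised output layer,
    # and fine-tune the whole network on labeled data.
    sae = nn.Sequential(*encoders, nn.Linear(sizes[-1], 2))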





Image taken from Berniker and Kording [4]

Convolutional Neural Networks (CNNs)

CNNs are often used on images, where the placement of adjacent pixels carries meaningful information. A CNN is a type of feed-forward artificial neural network in which the connectivity pattern between its neurons is inspired by the organization of the visual cortex. The network consists of multiple layers of receptive fields.

Let’s take the example of an octagon. Using a CNN, a recognized pattern can be broken down across multiple layers and reassembled to identify the pattern as an octagon. In the first layer, the model learns the octagon's most basic shapes: diagonal, horizontal, and vertical lines. Then, in each subsequent layer, the model learns how to put these shapes together, using the features from the previous layer, to build up the octagon.



A CNN works by learning different kernel weights [4]. The kernel matrix acts as a sliding window that scans the image, and when it finds a specific feature that corresponds to the kernel, the output of that hidden layer will have a high signal. The goal of the model is to train these kernels so that each one recognizes a different feature relevant to classifying the input.

Since the model trains multiple kernels, the output of each convolutional layer is a set of images. Each image, or feature map, is the result of scanning the input with the respective kernel. Pooling is usually applied afterwards to reduce the feature space. In each subsequent convolutional layer, the feature maps are aggregated and new kernels are trained at that layer.
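
In PyTorch, two such convolution-plus-pooling blocks might be sketched as follows. The 28x28 grayscale input and the kernel counts are illustrative assumptions.

    import torch
    import torch.nn as nn

    # Each Conv2d layer learns a set of kernels; each kernel produces
    # one feature map, and pooling shrinks the maps afterwards.
    cnn = nn.Sequential(
        nn.Conv2d(1, 8, kernel_size=3, padding=1),   # 8 kernels -> 8 feature maps
        nn.ReLU(),
        nn.MaxPool2d(2),                             # 28x28 -> 14x14
        nn.Conv2d(8, 16, kernel_size=3, padding=1),  # new kernels aggregate all 8 maps
        nn.ReLU(),
        nn.MaxPool2d(2),                             # 14x14 -> 7x7
    )

    images = torch.randn(4, 1, 28, 28)  # a batch of 4 fake grayscale images
    feature_maps = cnn(images)          # shape: (4, 16, 7, 7)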


Image from Understanding Convolutional Neural Networks for NLP [5].


Real-World CNN Example

A really cool real-world CNN example is how Google lets you search your own photo library just by description, even if the photos were never tagged that way!



https://medium.com/@ageitgey/machine-learning-is-fun-part-3-deep-learning-and-convolutional-neural-networks-f40359318721

Recurrent Neural Networks (RNNs)

An RNN is a network that excels at sequential data. RNNs exploit ordering information by passing information from the hidden layer of the prior input to the hidden layer of the new input [6, 7]. This gives the model past context. For example, to understand the context of a word in a sentence, one needs to know the words that came before it. To handle this, the inputs are arranged in order of their sequence, and each time an input is passed to the model, the model also receives information from the previous hidden layer. In a biological sense, the same concept applies to DNA and RNA sequences, where we are interested in an ordered sequence of nucleotides or amino acids.
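
A minimal sketch of this recurrence in PyTorch, with illustrative input and hidden sizes:

    import torch
    import torch.nn as nn

    # At each step the cell receives the current input AND the hidden
    # state produced by the prior input, which carries past context.
    rnn_cell = nn.RNNCell(input_size=10, hidden_size=20)

    sequence = torch.randn(5, 10)   # 5 time steps, one input vector each
    hidden = torch.zeros(1, 20)     # initial hidden state
    for step in sequence:
        hidden = rnn_cell(step.unsqueeze(0), hidden)  # pass context along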


Image from Godbout [6].

However, past information is not the only information the model can benefit from. If the model could also receive future information, it could understand the context even better. Looking at words in a sentence again, a model could get a better idea of a word's context if it also knew the words that came after it. This gave rise to the bidirectional recurrent neural network (BRNN), which adds a layer in which information flows from the next input back to the current input. When trained, this allows the model to combine, in its output, a hidden layer that receives information from previous inputs with a hidden layer that receives information from future inputs.
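
In PyTorch this can be sketched with the bidirectional flag (sizes are illustrative assumptions); note how the per-step output concatenates the forward and backward hidden states:

    import torch
    import torch.nn as nn

    # One hidden layer reads the sequence forward, another reads it
    # backward, and their outputs are combined at every time step.
    birnn = nn.RNN(input_size=10, hidden_size=20,
                   bidirectional=True, batch_first=True)

    sequence = torch.randn(1, 5, 10)  # batch of 1, 5 time steps
    outputs, _ = birnn(sequence)
    print(outputs.shape)              # (1, 5, 40): past + future context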





Image from Lipton and Berkowitz [7].


Real-World RNN Example

An RNN model can be trained on a large body of text, and the model can then generate new text word by word. The model can learn how to spell, punctuate, and even write complete sentences.

Large recurrent neural networks are used to learn the relationships between items in sequences of input strings and can then generate new text.
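
Below is a toy sketch of character-by-character generation in the spirit of the post linked below. The tiny vocabulary and the untrained weights are stand-ins; a real model would first be trained on a large corpus.

    import torch
    import torch.nn as nn

    vocab = list("abcdefghijklmnopqrstuvwxyz ")
    rnn = nn.RNN(input_size=len(vocab), hidden_size=64, batch_first=True)
    to_logits = nn.Linear(64, len(vocab))

    def generate(seed, length=20):
        hidden, text = None, seed
        for _ in range(length):
            # one-hot encode the last character and feed it to the RNN
            x = torch.zeros(1, 1, len(vocab))
            x[0, 0, vocab.index(text[-1])] = 1.0
            out, hidden = rnn(x, hidden)
            # sample the next character from the output distribution
            probs = to_logits(out[0, -1]).softmax(dim=0)
            text += vocab[torch.multinomial(probs, 1).item()]
        return text

    print(generate("the "))  # gibberish until the model is trained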




http://karpathy.github.io/2015/05/21/rnn-effectiveness/

References

  1. Le QV. A Tutorial on Deep Learning - Part 1: Nonlinear Classifiers and The Backpropagation Algorithm. 2015. http://robotics.stanford.edu/~quocle/tutorial1.pdf. Accessed May 3, 2017.
  2. Autoencoders. UFLDL Tutorial. http://ufldl.stanford.edu/tutorial/unsupervised/Autoencoders/. Accessed May 3, 2017.
  3. Min S, Lee B, Yoon S. Deep learning in bioinformatics. Brief Bioinform 2016. doi: 10.1093/bib/bbw068
  4. Berniker M, Kording KP. Deep networks for motor control functions. Frontiers in Computational Neuroscience. 2015;9(32). doi:10.3389/fncom.2015.00032.
  5. Understanding Convolutional Neural Networks for NLP. WildML. November 7, 2015. http://www.wildml.com/2015/11/understanding-convolutional-neural-networks-for-nlp/. Accessed May 3, 2017.
  6. Godbout C. Recurrent Neural Networks for Beginners. Medium. April 12, 2016. https://medium.com/@camrongodbout/recurrent-neural-networks-for-beginners-7aca4e933b82. Accessed May 3, 2017.
  7. Lipton ZC, Berkowitz J. A Critical Review of Recurrent Neural Networks for Sequence Learning. October 17, 2015. arXiv:1506.00019v4.