## The fast and easy guide to the most popular Deep Learning framework in the world.

Make sure to check out the other articles here.

## Overview

In this installment we will go over the abstracted models that are currently available in TensorFlow, describing use cases for each model along with simple sample code. Full sources of the working examples are in the TensorFlow In a Nutshell repo.

Figure: A recurrent neural network

## Recurrent Neural Networks
Since the advent of Long Short Term Memory (LSTM) and Gated Recurrent Units (GRU), Recurrent Neural Networks have surged ahead of other models in natural language processing. They can be fed vectors representing characters and trained to generate new sentences based on the training set. The merit of this model is that it keeps the context of the sentence, so it can derive that “cat sat on the mat” means the cat is on the mat. Since the creation of TensorFlow, writing these networks has become increasingly simple. There are even hidden features, covered by Denny Britz here, that make writing RNNs even simpler. Here’s a quick example.

```python
import tensorflow as tf
```
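To make the idea of "keeping context" concrete, here is a minimal sketch of a single vanilla recurrent step in plain Python. It is not the TensorFlow API and it is not an LSTM or GRU; the names `rnn_step`, `run_rnn`, and the hand-picked weights are assumptions made purely for illustration.

```python
import math


def rnn_step(x, h, w_xh, w_hh, b):
    """One vanilla RNN step: h_new = tanh(W_xh x + W_hh h + b)."""
    hidden_size = len(h)
    h_new = []
    for j in range(hidden_size):
        total = b[j]
        total += sum(w_xh[j][i] * x[i] for i in range(len(x)))
        total += sum(w_hh[j][k] * h[k] for k in range(hidden_size))
        h_new.append(math.tanh(total))
    return h_new


def one_hot(index, size):
    """Encode a character index as a one-hot vector."""
    return [1.0 if i == index else 0.0 for i in range(size)]


def run_rnn(text, vocab, w_xh, w_hh, b):
    """Feed characters one at a time, carrying the hidden state forward."""
    h = [0.0] * len(b)
    for ch in text:
        x = one_hot(vocab.index(ch), len(vocab))
        h = rnn_step(x, h, w_xh, w_hh, b)
    return h


# Hypothetical two-character vocabulary and hand-picked (untrained) weights.
vocab = ["a", "b"]
w_xh = [[0.5, -0.5], [0.3, 0.8]]
w_hh = [[0.1, 0.0], [0.0, 0.1]]
b = [0.0, 0.0]

state_ab = run_rnn("ab", vocab, w_xh, w_hh, b)
state_ba = run_rnn("ba", vocab, w_xh, w_hh, b)
```

The two inputs contain the same characters in a different order, yet `state_ab` and `state_ba` differ. The hidden state depends on what came before, which is the property that lets a trained network use sentence context.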
Figure: Convolutional Neural Network

## Convolutional Neural Networks
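As a preview of the sliding-window operation this section describes, the sketch below slides a 3x3 kernel over a tiny image in plain Python. The function name `convolve2d`, the toy image, and the kernel values are assumptions for illustration; the kernel is a common Laplacian-style edge detector, not necessarily the one pictured in the article.

```python
def convolve2d(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the convolved feature.

    Like most deep learning libraries, this computes a cross-correlation:
    the kernel is not flipped. For a symmetric kernel the result is identical.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            total = 0
            for i in range(kh):
                for j in range(kw):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        output.append(row)
    return output


# Toy 4x5 image: dark on the left, bright on the right (a vertical edge).
image = [
    [0, 0, 0, 9, 9],
    [0, 0, 0, 9, 9],
    [0, 0, 0, 9, 9],
    [0, 0, 0, 9, 9],
]

# Laplacian-style edge-detection kernel (assumed for this example).
edge_kernel = [
    [0, 1, 0],
    [1, -4, 1],
    [0, 1, 0],
]

feature = convolve2d(image, edge_kernel)
# feature == [[0, 9, -9], [0, 9, -9]]
```

The output is zero where the window covers a flat region and non-zero where it straddles the edge, which is why a convolved feature of this kind is useful for edge detection.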
Convolutional Neural Networks are unique because they are designed on the assumption that the input will be an image. CNNs apply a sliding-window function to a matrix. The window is called a kernel, and it slides across the image to create a convolved feature.

Figure: from http://deeplearning.standford.edu/wiki/index.php/Feature_extraction_using_convolution

Creating a convolved feature allows for edge detection, which in turn allows a network to detect objects in pictures.

Figure: edge detection, from the GIMP manual

The kernel that produces this effect looks like the matrix below:

Figure: convolved feature, from the GIMP manual

Here’s a sample of code to identify handwritten digits from the MNIST dataset.

```python
### Convolutional network
def conv_model(X, y):
    # first conv layer will compute 32 features for each 5x5 patch
    # second conv layer will compute 64 features for each 5x5 patch.
    # reshape tensor into a batch of vectors
    # densely connected layer with 1024 neurons.
    ...  # layer definitions elided in this excerpt
```

## Feed Forward Neural Networks
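As a rough illustration of the layered structure this section covers, here is a forward pass through one hidden layer in plain Python. The weights are hand-picked rather than trained, and the sigmoid hidden activation and softmax output are assumptions of this sketch, not something taken from the article's TensorFlow code. The names `w_h` and `w_o` simply mirror the hidden and output weights used in the excerpt further down.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def softmax(values):
    """Turn raw scores into probabilities that sum to 1."""
    shift = max(values)  # subtract the max for numerical stability
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def dense(inputs, weights):
    """weights[j] holds the incoming weights of unit j in the next layer."""
    return [sum(w * x for w, x in zip(unit, inputs)) for unit in weights]


def forward(x, w_h, w_o):
    """Input layer -> hidden layer (sigmoid) -> output layer (softmax)."""
    hidden = [sigmoid(z) for z in dense(x, w_h)]
    return softmax(dense(hidden, w_o))


# Hypothetical weights: 3 inputs -> 2 hidden units -> 3 output classes.
w_h = [[0.2, -0.4, 0.1], [0.7, 0.3, -0.6]]
w_o = [[1.0, -1.0], [-1.0, 1.0], [0.5, 0.5]]

probs = forward([1.0, 0.5, -1.0], w_h, w_o)
predicted_class = probs.index(max(probs))
```

Information only flows forward: each layer reads the previous layer's outputs, and units within a layer never read each other. Training with back propagation would adjust `w_h` and `w_o`; that step is left out here.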
These networks consist of perceptrons arranged in layers, where each layer takes inputs and passes information on to the next layer. The last layer in the network produces the output. There are no connections between nodes within a given layer. A layer that neither receives the original input nor produces the final output is called a hidden layer. The goal of this network is the same as that of other supervised neural networks trained with back propagation: to make the inputs produce the desired trained outputs. These are some of the simplest effective neural networks for classification and regression problems. We will show how easy it is to create a feed forward network to classify handwritten digits:

```python
def init_weights(shape):
    ...  # body elided in this excerpt

def model(X, w_h, w_o):
    ...  # body elided in this excerpt

mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

X = tf.placeholder("float", [None, 784])

w_h = init_weights([784, 625])  # create symbolic variables

py_x = model(X, w_h, w_o)

# compute costs (recent TensorFlow 1.x releases require keyword arguments here)
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=py_x, labels=Y))

# Launch the graph in a session
for i in range(100):
    ...  # training step elided in this excerpt
```

## Linear Models
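As a warm-up for this section, the sketch below fits a line of best fit to house sizes and prices in plain Python. It uses the closed-form least-squares solution rather than the iterative training a TensorFlow model would run, and the data points are made up for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: house size in square meters, price in thousands.
sizes = [50.0, 80.0, 100.0, 120.0]
prices = [150.0, 240.0, 300.0, 360.0]

slope, intercept = fit_line(sizes, prices)

# Predict the price of a 90 square meter house from the fitted line.
predicted_price = slope * 90.0 + intercept
```

With several features (size, rooms, bathrooms) the same idea applies, with one weight per feature instead of a single slope.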
Linear models take X values and produce a line of best fit that is used for classification and regression of Y values. For example, if you have a list of house sizes and their prices in a neighborhood, you can predict the price of a house given its size using a linear model. One thing to note is that linear models can be used with multiple X features. In the housing example, we can train a linear model on house size, number of rooms, number of bathrooms and price, then predict the price of a house given its size, number of rooms and number of bathrooms.

```python
import numpy as np

def weight_variable(shape):
    ...  # body elided in this excerpt

# dataset
# model
# training and cost function
# create a session
# train
```

## Support Vector Machines
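To give the notion of a margin some numbers, the following plain-Python sketch compares two candidate hyperplanes on a toy linearly separable dataset. It only evaluates the margin and hinge loss of given hyperplanes; it does not train an SVM, and the points, weights and function names are assumptions for illustration.

```python
import math


def decision_value(w, b, x):
    """Signed score of point x relative to the hyperplane w.x + b = 0."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def geometric_margin(w, b, points, labels):
    """Smallest signed distance from any point to the hyperplane."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return min(y * decision_value(w, b, x) / norm for x, y in zip(points, labels))


def hinge_loss(w, b, points, labels):
    """Average hinge loss: zero only when every point has y * score >= 1."""
    losses = [max(0.0, 1.0 - y * decision_value(w, b, x)) for x, y in zip(points, labels)]
    return sum(losses) / len(points)


# Toy dataset with labels +1 and -1.
points = [[2.0, 0.0], [3.0, 1.0], [0.0, 2.0], [1.0, 3.0]]
labels = [1, 1, -1, -1]

# Two hyperplanes that both separate the data.
wide_margin = geometric_margin([1.0, -1.0], 0.0, points, labels)
narrow_margin = geometric_margin([1.0, 0.0], -1.5, points, labels)

wide_loss = hinge_loss([1.0, -1.0], 0.0, points, labels)
narrow_loss = hinge_loss([1.0, 0.0], -1.5, points, labels)
```

Both hyperplanes classify every point correctly, but the first leaves more room around the boundary and has the lower hinge loss. An SVM prefers that one.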
The general idea behind an SVM is that there is an optimal hyperplane for linearly separable patterns. For data that is not linearly separable, we can use a kernel function to transform the original data into a new space. SVMs maximize the margin around the separating hyperplane. They work extremely well in high dimensional spaces and are still effective when the number of dimensions is greater than the number of samples.

```python
def input_fn():
    ...  # body elided in this excerpt

price = tf.contrib.layers.real_valued_column('price')

svm_classifier.fit(input_fn=input_fn, steps=30)
```

## Deep and Wide Models
Deep and Wide models were covered in greater detail in part two, so we won’t go too deep here. A Deep and Wide network combines a linear model with a feed forward neural net so that our predictions benefit from both memorization and generalization. This type of model can be used for classification and regression problems. It allows for less feature engineering while still giving relatively accurate predictions, thus getting the best of both worlds. Here’s a code snippet from part two’s GitHub.

```python
def input_fn(df, train=False):
    ...  # body elided in this excerpt

m = build_estimator(model_dir)
```

## Random Forest
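The voting step at the heart of a forest can be sketched in a few lines of plain Python. The three "trees" below are hand-written threshold rules with made-up cut-offs on iris-style features. A real forest learns its trees from data, so treat this only as an illustration of majority voting.

```python
from collections import Counter


# Three hypothetical one-split "trees"; thresholds are invented for the example.
def tree_a(sample):
    return "setosa" if sample["petal_length"] < 2.5 else "versicolor"


def tree_b(sample):
    return "setosa" if sample["petal_width"] < 0.8 else "virginica"


def tree_c(sample):
    return "virginica" if sample["petal_length"] > 4.9 else "versicolor"


def forest_predict(trees, sample):
    """Each tree casts one vote; the class with the most votes wins."""
    votes = Counter(tree(sample) for tree in trees)
    label, _count = votes.most_common(1)[0]
    return label, votes


sample = {"petal_length": 5.5, "petal_width": 2.0}
label, votes = forest_predict([tree_a, tree_b, tree_c], sample)
```

Here one tree votes "versicolor" and two vote "virginica", so the forest answers "virginica". A single tree's mistake is outvoted by the others.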
A Random Forest model combines many different classification trees, and each tree votes for a class. The forest chooses the classification that has the most votes. Adding more trees does not make a Random Forest overfit, so you can run as many trees as you want, and it is relatively fast. Give it a try on the iris data with the snippet below:

```python
hparams = tf.contrib.tensor_forest.python.tensor_forest.ForestHParams(...)  # arguments elided in this excerpt

iris = tf.contrib.learn.datasets.load_iris()

monitors = [tf.contrib.learn.TensorForestLossMonitor(10, 10)]
```

## Bayesian Reinforcement Learning
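The REINFORCE weight update discussed in this section can be sketched for a single Bernoulli-logistic unit in plain Python. This is not the BayesFlow API. The toy task (reward 1 when the unit outputs 1), the fixed baseline of 0.5, and all names are assumptions chosen to keep the example small.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def reinforce_update(weights, x, y, reward, baseline, learning_rate):
    """Apply delta_w = learning_rate * (reward - baseline) * eligibility.

    For a Bernoulli-logistic unit with firing probability p, the
    characteristic eligibility of weight i is (y - p) * x[i].
    """
    p = sigmoid(sum(w * xi for w, xi in zip(weights, x)))
    eligibility = [(y - p) * xi for xi in x]
    return [w + learning_rate * (reward - baseline) * e
            for w, e in zip(weights, eligibility)]


random.seed(0)
weights = [0.0]
x = [1.0]  # constant input, so the single weight acts as a bias
p_before = sigmoid(weights[0] * x[0])

for _ in range(200):
    p = sigmoid(weights[0] * x[0])
    y = 1 if random.random() < p else 0   # sample the unit's stochastic output
    reward = 1.0 if y == 1 else 0.0       # toy task: outputting 1 is rewarded
    weights = reinforce_update(weights, x, y, reward,
                               baseline=0.5, learning_rate=0.5)

p_after = sigmoid(weights[0] * x[0])
```

The weights are adjusted right after each trial's reinforcement value arrives, and over the trials the unit becomes more likely to emit the rewarded output.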
In the contrib folder of TensorFlow there is a library called BayesFlow. BayesFlow has no documentation except for an example of the REINFORCE algorithm, which was proposed in a paper by Ronald Williams. The name is an acronym:

REward Increment = Nonnegative Factor * Offset Reinforcement * Characteristic Eligibility

The network, which tries to solve an immediate reinforcement learning task, adjusts its weights after receiving the reinforcement value at each trial. At the end of each trial, each weight is incremented by the product of three terms: a learning rate factor, the reinforcement value minus a baseline, and the characteristic eligibility. Williams’s paper also discusses the use of back propagation to train the REINFORCE network.

```python
"""Build the Split-Apply-Merge Model.
...
"""
# REINFORCE forward step
```

## Linear Chain Conditional Random Fields
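To show how neighboring samples influence each label in a linear chain, here is a plain-Python sketch of Viterbi decoding, the usual way to find the best label sequence once a CRF's scores are known. It covers decoding only, not training, and it is not the TensorFlow CRF API; the scores and names are invented for illustration.

```python
def viterbi_decode(unary, transition):
    """Return the highest-scoring label sequence and its score.

    unary[t][k]      : score of label k at position t
    transition[j][k] : score of label j being followed by label k
    """
    num_labels = len(unary[0])
    best = list(unary[0])   # best score of any path ending in each label
    backpointers = []
    for t in range(1, len(unary)):
        new_best, pointers = [], []
        for k in range(num_labels):
            candidates = [best[j] + transition[j][k] for j in range(num_labels)]
            j_star = max(range(num_labels), key=lambda j: candidates[j])
            new_best.append(candidates[j_star] + unary[t][k])
            pointers.append(j_star)
        best = new_best
        backpointers.append(pointers)
    # Walk the backpointers from the best final label to recover the path.
    last = max(range(num_labels), key=lambda k: best[k])
    path = [last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, best[last]


# Hypothetical scores for 3 positions and 2 labels.
unary = [[0.0, 2.0], [1.0, 0.8], [0.0, 2.0]]
# Staying on the same label is rewarded, switching is penalized.
transition = [[1.0, -1.0], [-1.0, 1.0]]

independent = [scores.index(max(scores)) for scores in unary]
path, score = viterbi_decode(unary, transition)
```

Labeling each position on its own gives `[1, 0, 1]`, while decoding the whole chain gives `[1, 1, 1]`. The middle sample's label is pulled toward its neighbors, which is the context-keeping behavior this section describes.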
CRFs are conditional probability distributions that factorize according to an undirected model. They predict a label for a single sample while keeping context from the neighboring samples. CRFs are similar to Hidden Markov Models. CRFs are often used for image segmentation and object recognition, as well as shallow parsing, named entity recognition and gene finding.

```python
# Train for a fixed number of iterations.
```

## Conclusion

Ever since TensorFlow was released, the community surrounding the project has been adding more packages, examples and use cases for this amazing library. Even at the time of writing this article, more models and sample code are being written. It is amazing to see how much TensorFlow has grown in these past few months. The ease of use and diversity of the package are increasing over time and don’t seem to be slowing down anytime soon.