
Keras implementation of MLP

Notes:

# This is a Keras implementation of a multilayer perceptron (MLP) neural network model. Keras is a deep learning library that runs on top of Theano and TensorFlow.
# The MLP code shown below solves a binary classification problem.
# This script also prints Area Under Curve (AUC) and plots a Receiver Operating Characteristic (ROC) curve at the end.
# I have tested the code in Python 2.7+ 
# Required Python modules: Keras, sklearn, pandas, matplotlib
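The required modules can typically be installed with pip (package names as of recent releases; pin versions as needed for your environment):

```shell
pip install keras scikit-learn pandas matplotlib
```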



Description + code:

  • First, import all the Python modules. 
  • From Keras, import the Sequential model as well as the Dense, Dropout and Activation layers. The Sequential model is a linear stack of layers; see the Keras documentation for details.
  • Import train_test_split from sklearn.model_selection, and roc_curve and auc from sklearn.metrics.
  • Import the MATLAB-like plotting framework pyplot from matplotlib.
#!/usr/bin/env python
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation
from keras.optimizers import SGD
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_curve, auc
import pandas as pd
import matplotlib.pyplot as plt
  • Below is the Python function that initializes the neural network. 
  • It comprises a Sequential model with 3 Dense layers, where each Dense layer is followed by an Activation layer. 
  • Note that I used a Dropout layer only after the first two Activation layers. 
  • The first two Activation layers use 'tanh' as the activation function. 
  • For the last Activation layer, I used 'softmax' because the network has two output units, one per class. 
  • I used Stochastic Gradient Descent (SGD) with Nesterov momentum for training. 
  • 'binary_crossentropy' is the typical loss for binary classification problems; it is equivalent to log loss. 
# Initialize the MLP
def initialize_nn(frame_size):
    model = Sequential() # The Keras Sequential model is a linear stack of layers.
    model.add(Dense(100, kernel_initializer='uniform', input_dim=frame_size)) # Dense layer
    model.add(Activation('tanh')) # Activation layer
    model.add(Dropout(0.5)) # Dropout layer
    model.add(Dense(100, kernel_initializer='uniform')) # Another dense layer
    model.add(Activation('tanh')) # Another activation layer
    model.add(Dropout(0.5)) # Another dropout layer
    model.add(Dense(2, kernel_initializer='uniform')) # Last dense layer
    model.add(Activation('softmax')) # Softmax activation at the end
    sgd = SGD(lr=0.1, decay=1e-6, momentum=0.9, nesterov=True) # Using Nesterov momentum
    model.compile(loss='binary_crossentropy', optimizer=sgd, metrics=['accuracy']) # Using logloss
    return model
  • Generate train.csv and target.csv and place them in the same folder as this script so that get_data() below can read them.
  • train.csv should contain 'n_frames' rows and 'frame_size' columns.
  • target.csv should contain 'n_frames' rows and 2 columns (e.g. [1 0; 0 1; 1 0; ...]): a one-hot encoding of the two classes (1 - true, 0 - false).
# Get data
def get_data():
    X = pd.read_csv("train.csv")
    y = pd.read_csv("target.csv")
    n_frames, frame_size = X.shape
    return X, y, n_frames, frame_size
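If you just want to try the pipeline end to end, a quick sketch for writing placeholder train.csv and target.csv files with the expected layout (the sizes 500 and 20 are arbitrary choices for illustration; this overwrites any existing files with those names):

```python
import numpy as np
import pandas as pd

rng = np.random.RandomState(0)
n_frames, frame_size = 500, 20  # assumed sizes, purely for illustration

# Random features: n_frames rows, frame_size columns
X = rng.randn(n_frames, frame_size)

# Random binary labels, one-hot encoded into 2 columns ([1 0] or [0 1])
labels = rng.randint(0, 2, size=n_frames)
y = np.zeros((n_frames, 2), dtype=int)
y[np.arange(n_frames), labels] = 1

pd.DataFrame(X).to_csv("train.csv", index=False)
pd.DataFrame(y).to_csv("target.csv", index=False)
```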
  • Next is the plotting function.
  • First we generate false positive and true positive rates using 'roc_curve'.
  • We then estimate the area under the curve (AUC).
  • Finally, we plot the ROC curve using pyplot from matplotlib. 
# Plot data
def generate_results(y_test, y_score):
    fpr, tpr, _ = roc_curve(y_test, y_score)
    roc_auc = auc(fpr, tpr)
    plt.figure()
    plt.plot(fpr, tpr, label='ROC curve (area = %0.2f)' % roc_auc)
    plt.plot([0, 1], [0, 1], 'k--')
    plt.xlim([0.0, 1.05])
    plt.ylim([0.0, 1.05])
    plt.xlabel('False Positive Rate')
    plt.ylabel('True Positive Rate')
    plt.title('Receiver operating characteristic curve')
    print('AUC: %f' % roc_auc)
    plt.show()
  • Below we call all the functions defined above. 
  • The first call is to 'get_data()', which returns X, y, n_frames, and frame_size -- all the data used in the next steps. 
  • I then used 'train_test_split' from sklearn, which nicely splits the data into a training set and a testing set. 
  • After splitting, initialize the MLP using 'initialize_nn' and train it. 
  • I used 10 epochs, but you can change this number depending on your needs. 
  • Once the model is trained, predictions are made on the test data, followed by some plotting. 
  • I used the sklearn functions 'roc_curve' and 'auc' to generate the plots/results.
# Calling all modules
print('Loading and reading data')
X, y, n_frames, frame_size = get_data()

print('Splitting data into training and testing')
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=10)

print('Initializing model')
model = initialize_nn(frame_size)

print('Training model')
model.fit(X_train.values, y_train.values, # .values converts the DataFrames to NumPy arrays
          batch_size=32, epochs=10,
          verbose=1, shuffle=True)

print('Predicting on test data')
y_score = model.predict(X_test.values)

print('Generating results')
generate_results(y_test.values[:, 0], y_score[:, 0]) # y_test is a DataFrame, so take its underlying array
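If you also want discrete class labels and a quick accuracy number (not part of the script above), the softmax scores can be collapsed with argmax. A small sketch using made-up scores and one-hot targets in the same layout as target.csv:

```python
import numpy as np

# Hypothetical softmax outputs for 4 test frames (each row sums to 1)
y_score = np.array([[0.9, 0.1],
                    [0.2, 0.8],
                    [0.4, 0.6],
                    [0.3, 0.7]])

# One-hot ground truth in the same [class 0, class 1] layout as target.csv
y_test = np.array([[1, 0],
                   [0, 1],
                   [1, 0],
                   [0, 1]])

pred_labels = y_score.argmax(axis=1)  # index of the larger probability per row
true_labels = y_test.argmax(axis=1)   # index of the 1 in each one-hot row
accuracy = (pred_labels == true_labels).mean()
print(accuracy)  # 0.75: the third frame is misclassified
```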