
Integrate a TensorFlow Experiment with Neptune - Flower Species Prediction Example

Overview

In this example we will show you how easy it is to integrate a TensorFlow experiment with
Neptune.

We will adapt the Quick Start from the TensorFlow library to show the ease of integration between the two tools. The example consists of a single Python file and uses TensorFlow to train and evaluate a simple 
neural network that predicts flower species (iris).

Integrating the code with the Neptune Client Library will allow us to run the code
as a Neptune job, and to browse the experiment’s results in the Web UI.
Neptune will store and display all metrics and graphs generated by TensorFlow in real time.

Dataset Information

Dataset: Iris.

Dataset size: 150 samples (120 in the training set and 30 in the test set).

Dataset description: The data set consists of 50 samples from each of three species
of iris (Iris setosa, Iris virginica, and Iris versicolor). Each row contains the following data
for each flower sample: sepal length, sepal width, petal length, petal width, and flower
species. The flower species are represented as integers with 0 denoting Iris setosa,
1 denoting Iris versicolor, and 2 denoting Iris virginica.
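
The integer encoding above can be captured as a simple lookup. As a sketch (the SPECIES name is our own, not part of the dataset), a single row then reads:

```python
# Mapping from the dataset's integer labels to species names.
SPECIES = {0: "Iris setosa", 1: "Iris versicolor", 2: "Iris virginica"}

# One row of the dataset: four measurements followed by an integer label.
features = [6.4, 3.2, 4.5, 1.5]  # sepal length, sepal width, petal length, petal width
label = 1

print(SPECIES[label])  # Iris versicolor
```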

Business purpose: Predict species of flowers.

Dataset credits: Ronald Fisher (1936) “The use of multiple measurements in taxonomic
problems”.

From left to right: Iris setosa, Iris versicolor, and Iris virginica

Let’s Start Editing the Code!

To run the code from this example, you need to have TensorFlow and Neptune installed.

If you want to download the code that is ready to run in Neptune, it’s available on GitHub.

To create the Neptune job step by step, follow this example.

Create a directory named flower-species-prediction. Below you can see the full code
for the neural network classifier obtained from the TensorFlow documentation.
Save this code in a file named main.py in the created flower-species-prediction directory.

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import tensorflow as tf
import numpy as np

# Data sets
IRIS_TRAINING = "iris_training.csv"
IRIS_TEST = "iris_test.csv"

# Load datasets.
training_set = tf.contrib.learn.datasets.base.load_csv(filename=IRIS_TRAINING,
                                                       target_dtype=np.int)
test_set = tf.contrib.learn.datasets.base.load_csv(filename=IRIS_TEST,
                                                   target_dtype=np.int)

# Specify that all features have real-value data
feature_columns = [tf.contrib.layers.real_valued_column("", dimension=4)]

# Build 3 layer DNN with 10, 20, 10 units respectively.
classifier = tf.contrib.learn.DNNClassifier(feature_columns=feature_columns,
                                            hidden_units=[10, 20, 10],
                                            n_classes=3,
                                            model_dir="/tmp/iris_model")

# Fit model.
classifier.fit(x=training_set.data,
               y=training_set.target,
               steps=2000)

# Evaluate accuracy.
accuracy_score = classifier.evaluate(x=test_set.data,
                                     y=test_set.target)["accuracy"]
print('Accuracy: {0:f}'.format(accuracy_score))

# Classify two new flower samples.
new_samples = np.array(
    [[6.4, 3.2, 4.5, 1.5], [5.8, 3.1, 5.0, 1.7]], dtype=float)
y = classifier.predict(new_samples)
print('Predictions: {}'.format(str(y)))

To see the charts and graphs generated by TensorFlow in Neptune, all you need to do is
add a few lines of code after imports:

from deepsense import neptune
context = neptune.Context()
context.integrate_with_tensorflow()

The first line is an import of the Neptune client library. The second line will simply
create a Neptune Context. The invocation of context.integrate_with_tensorflow() 
will extend TensorFlow to communicate with Neptune.

Once we have the source file, we need to prepare a short configuration file describing
the job:

config.yaml

name: Iris
description: Flower Species Prediction
project: Iris
parameters:
  - name: data_dir
    description: Directory with input data
    type: string
    required: false
    default: flower-species-prediction/data/
  - name: model_dir
    description: Directory to save model parameters, graph and etc.
    type: string
    required: false
    default: /tmp/iris_model

The configuration file contains the job’s name, its description, and the project that
the job belongs to. We also define additional parameters to make our job more generic.
The parameter data_dir allows users to change the location of the input datasets.
The parameter model_dir represents the directory where data generated by TensorFlow
will be stored. Save the configuration in a config.yaml file in the flower-species-prediction directory.
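
As an optional sanity check (assuming PyYAML is available), you can parse an equivalent configuration and inspect the declared parameter defaults; the snippet below inlines a trimmed copy of the file for illustration:

```python
import yaml  # assumes the PyYAML package is installed

# Trimmed copy of config.yaml, inlined here for illustration.
CONFIG = """
name: Iris
project: Iris
parameters:
  - name: data_dir
    type: string
    default: flower-species-prediction/data/
  - name: model_dir
    type: string
    default: /tmp/iris_model
"""

cfg = yaml.safe_load(CONFIG)
defaults = {p["name"]: p["default"] for p in cfg["parameters"]}
print(defaults["data_dir"])   # flower-species-prediction/data/
print(defaults["model_dir"])  # /tmp/iris_model
```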

Now we can use defined parameters in the code. Let’s replace declarations of the paths to
datasets:

IRIS_TRAINING = "iris_training.csv"
IRIS_TEST = "iris_test.csv"

with

IRIS_TRAINING = context.params.data_dir + "iris_training.csv"
IRIS_TEST = context.params.data_dir + "iris_test.csv"

and classifier declaration:

classifier = tf.contrib.learn.DNNClassifier(feature_columns=feature_columns,
                                            hidden_units=[10, 20, 10],
                                            n_classes=3,
                                            model_dir="/tmp/iris_model")

with

classifier = tf.contrib.learn.DNNClassifier(feature_columns=feature_columns,
                                            hidden_units=[10, 20, 10],
                                            n_classes=3,
                                            model_dir=context.params.model_dir)
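
Note that the string concatenation above works only because the default data_dir ends with a slash. A slightly more defensive sketch uses os.path.join, which tolerates either form (the hard-coded data_dir below is a stand-in for context.params.data_dir):

```python
import os

data_dir = "flower-species-prediction/data"  # stand-in for context.params.data_dir

# os.path.join inserts the separator itself, so a missing trailing
# slash in data_dir cannot silently produce a broken path.
IRIS_TRAINING = os.path.join(data_dir, "iris_training.csv")
IRIS_TEST = os.path.join(data_dir, "iris_test.csv")

print(IRIS_TRAINING)
print(IRIS_TEST)
```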

NOTE!

TensorFlow has deprecated the method tf.contrib.learn.datasets.base.load_csv,
but the documentation wasn’t updated accordingly. To run this example we need to
work around this issue and replace:

training_set = tf.contrib.learn.datasets.base.load_csv(filename=IRIS_TRAINING,
                                                       target_dtype=np.int)
test_set = tf.contrib.learn.datasets.base.load_csv(filename=IRIS_TEST,
                                                   target_dtype=np.int)

with

training_set = tf.contrib.learn.datasets.base.load_csv_with_header(
    filename=IRIS_TRAINING, features_dtype=np.float64, target_dtype=np.int)
test_set = tf.contrib.learn.datasets.base.load_csv_with_header(
    filename=IRIS_TEST, features_dtype=np.float64, target_dtype=np.int)
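
For reference, load_csv_with_header expects a metadata row first and the label in the last column of every following row. A minimal pure-Python stand-in (our own sketch, not the TensorFlow implementation) looks roughly like this:

```python
import csv
import numpy as np

def load_csv_with_header_sketch(filename, features_dtype=np.float64, target_dtype=int):
    """Rough stand-in for tf.contrib.learn's load_csv_with_header.

    Assumes the first row is a metadata header, and that every following
    row holds the feature values with the integer label in the last column.
    """
    with open(filename) as f:
        rows = [row for row in csv.reader(f) if row]
    data = np.array([row[:-1] for row in rows[1:]], dtype=features_dtype)
    target = np.array([row[-1] for row in rows[1:]], dtype=target_dtype)
    return data, target
```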

The last thing needed to run our code is the datasets. Let’s download the training
and test datasets into the data directory where you will run this example.
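
A small helper can fetch both files. The base_url below is our assumption (the TensorFlow Quick Start of that era hosted the CSVs on download.tensorflow.org), so adjust it if the files have moved:

```python
import os
import urllib.request

def download_iris_datasets(dest_dir, base_url="http://download.tensorflow.org/data/"):
    """Fetch the Iris training and test CSVs into dest_dir.

    base_url is an assumption; point it at wherever the files actually live.
    """
    os.makedirs(dest_dir, exist_ok=True)
    paths = []
    for name in ("iris_training.csv", "iris_test.csv"):
        path = os.path.join(dest_dir, name)
        if not os.path.exists(path):  # skip files that are already present
            urllib.request.urlretrieve(base_url + name, path)
        paths.append(path)
    return paths
```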

Your directory structure should look like this:

flower-species-prediction/
├── main.py
├── config.yaml
└── data/
    ├── iris_training.csv
    └── iris_test.csv

Let’s Run the Job!

Now we can run the job using the neptune run command. From the parent directory
of the flower-species-prediction directory, run:

$ neptune run flower-species-prediction/main.py --config flower-species-prediction/config.yaml --storage-url /tmp/neptune-iris --paths-to-dump flower-species-prediction

The job is now running. It will take a few seconds to complete.

The command’s output will contain a link to the job’s dashboard in the Web UI.

> Job enqueued, id:
>
> To browse the job, follow:
> https://[your Neptune IP address]/#dashboard/job/f458797d-1579-4f90-bf4c-b29efcef3377
>

The job should finish after a few seconds and report accuracy of 0.966667.
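
That figure is exactly the fraction you get when 29 of the 30 test samples are classified correctly:

```python
# Accuracy is the fraction of correctly classified test samples.
correct, total = 29, 30
print('Accuracy: {0:f}'.format(correct / total))  # prints: Accuracy: 0.966667
```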

Browse the Job’s Dashboard

Monitoring metrics reported by TensorFlow

Let’s follow the link displayed by neptune run and see the job’s dashboard.

Metrics reported by TensorFlow

We can see all the metrics reported by TensorFlow, displayed as charts. We can go to the
TensorFlow tab and see the graphs:

Graph registered by TensorFlow

Summary

We saw how easy it is to integrate TensorFlow code with Neptune. Thanks to that, 
we could monitor the execution of our experiment in real time. We had access to
all metrics and graphs reported by TensorFlow. We could take advantage of features
provided by both of these tools.
