In November 2015, Google open-sourced TensorFlow, its numerical computation library based on data flow graphs. Its flexible architecture lets you focus on building the computation graph and deploy the model with little effort on heterogeneous platforms such as mobile devices, hundreds of machines, or thousands of computational devices.

TensorFlow is generally straightforward to use, in the sense that most researchers in the field, even without prior experience with the library, can understand what is happening behind the code blocks. TensorFlow provides a good backbone for building different shapes of machine learning applications.

However, a large number of potential users, including researchers, data scientists, and students who are already familiar with many data science concepts and algorithms but have never been involved in deep learning research or applications, may find it really hard to start hacking. That’s where Scikit Flow comes in.

Scikit Flow is a simplified interface for TensorFlow, to get people started on predictive analytics and data mining. It helps smooth the transition from the Scikit-learn world of one-liner machine learning into the more open world of building different shapes of ML models. You can start by using fit/predict and slide into TensorFlow APIs as you get comfortable. It’s Scikit-learn compatible, so you can also benefit from Scikit-learn features like

## Deep Learning Models

Scikit Flow provides a set of high-level model classes that you can easily integrate with your existing Scikit-learn pipeline code.

## Deep Neural Network

Here’s an example of a 3-layer deep neural network with 10, 20 and 10 hidden units in each layer respectively:
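To make those layer sizes concrete, here is a plain-Python sketch of an untrained forward pass through such a network. It is not Scikit Flow's implementation and does not use its API. The 4 input features, 3 output classes, ReLU activations, Gaussian initialisation, and all function names are assumptions chosen for illustration.

```python
import math
import random


def init_layers(n_inputs, hidden_units, n_classes, seed=0):
    """Create random weights and zero biases for each fully connected layer."""
    rng = random.Random(seed)
    sizes = [n_inputs] + list(hidden_units) + [n_classes]
    layers = []
    for fan_in, fan_out in zip(sizes[:-1], sizes[1:]):
        weights = [[rng.gauss(0.0, 0.1) for _ in range(fan_out)]
                   for _ in range(fan_in)]
        biases = [0.0] * fan_out
        layers.append((weights, biases))
    return layers


def dense(x, weights, biases):
    """Compute x @ weights + biases for a single input vector."""
    return [
        sum(x[i] * weights[i][j] for i in range(len(x))) + biases[j]
        for j in range(len(biases))
    ]


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def forward(x, layers):
    """Apply ReLU on the hidden layers and softmax on the output layer."""
    for weights, biases in layers[:-1]:
        x = [max(0.0, v) for v in dense(x, weights, biases)]
    weights, biases = layers[-1]
    return softmax(dense(x, weights, biases))


# Three hidden layers with 10, 20 and 10 units; input/output sizes are assumed.
layers = init_layers(n_inputs=4, hidden_units=[10, 20, 10], n_classes=3)
layer_shapes = [(len(w), len(w[0])) for w, _ in layers]
probs = forward([5.1, 3.5, 1.4, 0.2], layers)
```

Here `layer_shapes` lists the weight matrix dimensions from input to output, and `probs` holds one probability per class. A high-level estimator hides this wiring behind a single constructor argument for the hidden unit sizes.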
## Custom Model

Scikit Flow grows as TensorFlow grows. You can basically insert any TensorFlow code into a custom model function that accepts predictors
## Recurrent Neural Network

Recurrent neural networks are widely used in many areas, such as text classification and sentiment analysis. Using Scikit Flow, all you need to do is provide some processing function

- Various recurrent units, e.g. GRU, RNN, LSTM
- Bidirectional RNN
- Multi-layer RNN
Example:
## Convolutional Neural Network

Convolutional neural networks are widely used in areas like computer vision. Here let’s take a look at the MNIST image classification example from the TensorFlow tutorial "Deep MNIST for Experts", but using the more concise interface provided by Scikit Flow.
## Modelling Techniques

Many data science modelling techniques, including early stopping, which is used very often in Kaggle competitions, and custom learning rate decay, can be used easily.

## Early Stopping

You can provide
## Custom Decay Function for Learning Rate

Here we give an example of using TensorFlow’s exponential decay function.
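As a rough illustration of what an exponential decay schedule computes, the sketch below reimplements the documented formula in plain Python: the learning rate is multiplied by `decay_rate` raised to `global_step / decay_steps`. The parameter names follow TensorFlow's exponential decay function, but the function here is a stand-in, and the starting rate of 0.1, decay rate of 0.5, and 100 decay steps are arbitrary assumptions.

```python
def exponential_decay(learning_rate, global_step, decay_steps, decay_rate,
                      staircase=False):
    """Return the decayed learning rate at a given training step.

    Plain-Python stand-in for illustration, not the TensorFlow op itself.
    """
    if staircase:
        # Decay in discrete jumps: the exponent only changes every decay_steps.
        exponent = global_step // decay_steps
    else:
        # Decay smoothly at every step.
        exponent = global_step / decay_steps
    return learning_rate * decay_rate ** exponent


# Assumed settings: start at 0.1 and halve the rate every 100 steps.
schedule = [
    exponential_decay(0.1, step, decay_steps=100, decay_rate=0.5)
    for step in (0, 100, 200)
]
```

With these settings the rate halves every 100 steps. Passing `staircase=True` keeps the rate constant within each 100-step window instead of shrinking it a little at every step.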
More features related to modelling techniques are also available, such as multi-output regression/classification, custom class weights, dropout probability, and batch normalization. We will continue adding more examples on Github in the future.

## Additional Features

Scikit Flow provides many additional features to help ease and streamline your model building experience. It’s evolving very rapidly. We are actively seeking suggestions and ideas and welcome any pull requests. Join our Gitter to discuss your ideas or drop your feature requests at Github issues.

## Flexible Automatic Input Handling

We try to make your life easier with automatic handling of various data types, such as numpy arrays/matrices, pandas/dask data frames, and iterators. For example, when your dataset is too large to hold in memory, you may want to load it into an out-of-core dataframe with the help of the dask library, draw sample batches from it, and then load those batches into memory for training.
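The idea of feeding training from an iterator can be sketched without any particular library. The example below is an assumption-laden illustration, not Scikit Flow or dask code: `iter_batches` and `generate_rows` are hypothetical helpers, and the generator stands in for a data source that is too large to materialise at once.

```python
from itertools import islice


def iter_batches(rows, batch_size):
    """Yield lists of at most batch_size items, pulling lazily from rows."""
    iterator = iter(rows)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return  # the source is exhausted
        yield batch


def generate_rows(n_rows):
    """Hypothetical data source producing (features, label) pairs one by one."""
    for i in range(n_rows):
        yield ([float(i), float(i) * 2.0], i % 2)


# Only one batch of rows is held in memory at any time.
batch_sizes = [len(batch) for batch in iter_batches(generate_rows(10), batch_size=4)]
```

Each batch can be handed to a training step and then discarded, so memory use depends on the batch size rather than on the size of the full dataset.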
## Model Persistence

We try to make it easy for you to save the model every once in a while and continue training it any time in the future. Each estimator has a
## Summaries/TensorBoard

To get nice visualizations and summaries you can use
Then run the following command on the command line:
and follow the URL reported in your console to open TensorBoard.

More examples and applications can be found on Github:

- Text classification (RNN & Convolution, word and character-level)
- Digits & MNIST (Conv, more Conv and ResNet)
- Language models
- Neural Translation Model

More blog posts about Scikit Flow:

- Building Machine Learning Estimator in TensorFlow
- High-level Learn Module in TensorFlow
- Introduction to Scikit Flow and why you want to start learning TensorFlow
- DNNs, custom model and Digit recognition examples
- Categorical variables: One hot vs Distributed representation
- Scikit Flow: Easy Deep Learning with TensorFlow and Scikit-learn

More exciting things are happening! Spoiler alert: we are moving to TensorFlow soon! Stay tuned!

Update: skflow has been merged into TensorFlow as its TensorFlow Learn module. Please find the most up-to-date examples here.