Display Decision Tree in Python using pydot and GraphViz

Post date: Dec 27, 2014 1:06:06 AM

When using a decision tree from scikit-learn, it's tempting to plot the resulting tree as a figure. This requires the pydot package and GraphViz to be installed on your machine. Getting both to work took me a non-trivial amount of time, so I wrote this post to save you yours.

This post explains the process required for this ipython notebook:

http://nbviewer.ipython.org/github/kittipatkampa/python_dev/blob/master/demo_decision_tree_v1.ipynb

My first attempt with python 3.4 [still does not work]

At first, I tried to run the program on python 3.4, but it did not work. I checked the packages bundled with Anaconda for python 3.4 and found that pydot was not included, so I had to install it myself. Note that the original pydot will not work with pyparsing version > 2.0, which is the case for python 3.4. Fortunately, there is a package called pydot2, which claims to work with python 3.x and pyparsing version > 2. Simply use pip to install it:

pip install pydot2 

Note that the package is named pydot2, but it is still imported as pydot:

import pydot
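Since the trouble here comes down to which versions of pydot and pyparsing actually end up being imported, a quick check from inside the interpreter can save some guessing. The snippet below is only a minimal sketch: the helper name module_version is made up for illustration, and it assumes each package exposes a __version__ attribute, which not every release does.

```python
import importlib


def module_version(name):
    """Return the module's version string, "unknown" if it has none,
    or None if the module cannot be imported at all."""
    try:
        module = importlib.import_module(name)
    except Exception:
        # Broad on purpose: an incompatible package may fail with
        # something other than ImportError (for example a SyntaxError).
        return None
    return getattr(module, "__version__", "unknown")


# Report what the current interpreter can actually see.
for package in ("pydot", "pyparsing"):
    version = module_version(package)
    if version is None:
        print(package + ": cannot be imported")
    else:
        print(package + ": version " + version)
```

Run it inside the same environment (or notebook kernel) in which you plan to plot the tree, because each environment has its own set of packages.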

However, after spending a non-trivial amount of time installing the package, the problem was still not resolved: I could not display the decision tree :-(

So, my next attempt was to fall back to python 2.7.

Using pydot with python 2.7 and a downgraded pyparsing

It was not on a hunch alone that I turned back to python 2.7. There are reportedly many success cases with python 2.7, so why not me?

To do so, I created a virtual environment for python 2.7 (let's call it mypy27) in Anaconda using this command:

conda create -n mypy27 python=2.7 anaconda

It then printed a long log until it said COMPLETE:

Extracting packages ...
[      COMPLETE      ] |#######################################################################################################################################################| 100%
Linking packages ...
[      COMPLETE      ] |#######################################################################################################################################################| 100%
#
# To activate this environment, use:
# $ source activate mypy27
#
# To deactivate this environment, use:
# $ source deactivate
#

However, pydot was not included in the default virtual environment, so I needed to install it myself using conda install. From the output log below, you can see that Anaconda takes care of the dependencies automatically, downgrading pyparsing from 2.0.1 to 1.5.6 to comply with pydot.

(mypy27)685b35c8f358:python_dev kittipat$ conda install pydot
Fetching package metadata: ..
Solving package specifications: .
Package plan for installation in environment /Users/kittipat/anaconda3/anaconda/envs/mypy27:
The following packages will be downloaded:
    package                    |            build
    ---------------------------|-----------------
    python-2.7.9               |                1        11.3 MB
    pyparsing-1.5.6            |           py27_0          61 KB
    pydot-1.0.28               |           py27_0          35 KB
    ------------------------------------------------------------
                                           Total:        11.4 MB
The following NEW packages will be INSTALLED:
    pydot:     1.0.28-py27_0
The following packages will be UPDATED:
    openssl:   1.0.1h-1     --> 1.0.1j-4     
    python:    2.7.8-1      --> 2.7.9-1      
The following packages will be DOWNGRADED:
    pyparsing: 2.0.1-py27_0 --> 1.5.6-py27_0 
Proceed ([y]/n)? y
Fetching packages ...
python-2.7.9-1 100% |#####################################################################################################################################| Time: 0:00:18 624.74 kB/s
pyparsing-1.5. 100% |#####################################################################################################################################| Time: 0:00:00 144.88 kB/s
pydot-1.0.28-p 100% |#####################################################################################################################################| Time: 0:00:00 151.27 kB/s
Extracting packages ...
[      COMPLETE      ] |########################################################################################################################| 100%
Unlinking packages ...
[      COMPLETE      ] |########################################################################################################################| 100%
Linking packages ...
[      COMPLETE      ] |########################################################################################################################| 100%

Next, I opened an ipython notebook session in the python 2.7 virtual environment that I had just built and tried to plot the graph...

Unfortunately, there was still an error message: "GraphViz's executables not found". Don't panic. This happens because pydot cannot find the GraphViz executable files, so we need to install GraphViz manually.
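To confirm that this is really the cause, you can check whether GraphViz's dot program is reachable from the same python session. This is a rough sketch, assuming that the executables are looked up on the PATH (pydot may check other locations as well); the helper name find_graphviz_dot is hypothetical.

```python
import os

try:
    from shutil import which  # available in python 3.3 and later
except ImportError:
    from distutils.spawn import find_executable as which  # python 2.7 fallback


def find_graphviz_dot():
    """Return the full path of GraphViz's dot program, or None if it is not on PATH."""
    return which("dot")


dot_path = find_graphviz_dot()
if dot_path is None:
    print("dot was not found in any of these directories:")
    for directory in os.environ.get("PATH", "").split(os.pathsep):
        print("  " + directory)
else:
    print("dot found at " + dot_path)
```

If dot is reported as missing here, no amount of reinstalling pydot will help; GraphViz itself has to be installed or added to the PATH.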

Install GraphViz

Go to the GraphViz website and read the installation instructions. Then download the installation file (for me, a .pkg for a MacBook Pro) and install it to the default location (the installer will guide you through). Once it is installed successfully, reopen your ipython session so that it picks up the GraphViz installation. Run the notebook again, and voila, there you see the nice-looking decision tree on your screen. Feel free to use the example IPython notebook linked at the top of this post.
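For reference, here is roughly what the plotting step looks like once everything is in place. It is a minimal sketch and not the notebook's exact code: the helper names first_graph and tree_to_png and the output file name iris_tree.png are made up for illustration, and it assumes a scikit-learn release in which export_graphviz returns the DOT text when out_file=None.

```python
def first_graph(parsed):
    """Newer pydot releases return a list of graphs from graph_from_dot_data,
    older ones return a single graph; accept either form."""
    return parsed[0] if isinstance(parsed, (list, tuple)) else parsed


def tree_to_png(classifier, png_path):
    """Render a fitted scikit-learn decision tree to a PNG file via GraphViz."""
    # Imported here so that a missing package surfaces as a clear ImportError.
    import pydot
    from sklearn.tree import export_graphviz

    # Assumption: this scikit-learn release returns the DOT source as a
    # string when out_file=None (older releases wanted a file object).
    dot_source = export_graphviz(classifier, out_file=None)
    graph = first_graph(pydot.graph_from_dot_data(dot_source))
    graph.write_png(png_path)  # this is the call that needs the GraphViz executables
    return png_path


if __name__ == "__main__":
    try:
        from sklearn.datasets import load_iris
        from sklearn.tree import DecisionTreeClassifier

        iris = load_iris()
        clf = DecisionTreeClassifier(max_depth=3).fit(iris.data, iris.target)
        print("Wrote " + tree_to_png(clf, "iris_tree.png"))
    except Exception as error:
        # A missing package or missing GraphViz executables both end up here.
        print("Could not render the tree: %s" % error)
```

If the last step fails with the same "executables not found" message, the python packages are fine and only GraphViz itself is still out of reach.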