Matplotlib

Visualizing Data Using Matplotlib

Matplotlib is the de facto standard Python visualization library. One of its core components is matplotlib.pyplot. Each pyplot function makes some change to a figure: for example, it creates a figure, creates a plotting area in a figure, plots some lines in a plotting area, or decorates the plot with labels.
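As a minimal sketch of those pyplot steps, the snippet below creates a figure, adds a plotting area, draws one line, and decorates it with labels. The numbers are made up for illustration, and the Agg backend is selected only so the script runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; omit in a notebook or desktop session
import matplotlib.pyplot as plt

years = [2010, 2011, 2012, 2013]   # illustrative x values (made up)
values = [3, 5, 4, 7]              # illustrative y values (made up)

fig = plt.figure()                 # creates a figure
ax = plt.subplot(1, 1, 1)          # creates a plotting area in the figure
lines = plt.plot(years, values)    # plots a line in the plotting area
plt.title("Made-up values by year")  # decorates the plot with a title
plt.xlabel("Year")                 # ... and with axis labels
plt.ylabel("Value")
```

In an interactive session, calling plt.show() afterwards displays the figure.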

pandas provides built-in plotting that is implemented on top of Matplotlib. Plotting in pandas is as simple as calling the .plot() method on a Series or DataFrame.
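A small sketch of the .plot() method on both object types follows. The DataFrame and its column names "a" and "b" are hypothetical and exist only for this example.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; omit in a notebook or desktop session
import pandas as pd

# Hypothetical data: two columns indexed by year
df = pd.DataFrame({"a": [1, 3, 2], "b": [2, 2, 4]}, index=[2011, 2012, 2013])

ax_series = df["a"].plot()   # Series: draws a single line
ax_frame = df.plot()         # DataFrame: draws one line per column
```

Both calls return a Matplotlib Axes object, so the result can be customized further with ordinary Matplotlib methods.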

Line Plot

A line chart, or line plot, displays information as a series of data points, called ‘markers’, connected by straight line segments. It is a basic chart type common in many fields. Use a line plot when the data set is continuous; it is best suited to visualizing trends over a period of time.

When we plot a series indexed by year, pandas automatically populates the x-axis with the index values (years) and the y-axis with the column values.
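The behaviour can be sketched with a small Series whose index holds the years. The counts are invented for illustration and do not come from any real data set.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; omit in a notebook or desktop session
import pandas as pd

# Hypothetical yearly counts; the index holds the years
counts = pd.Series([120, 150, 170, 160],
                   index=[2010, 2011, 2012, 2013],
                   name="Count")

ax = counts.plot(kind="line")      # index -> x-axis, values -> y-axis
ax.set_title("Made-up counts per year")
ax.set_xlabel("Year")
ax.set_ylabel("Count")

line = ax.get_lines()[0]           # the Line2D object pandas drew
```

Inspecting line.get_xdata() and line.get_ydata() confirms that the x values come from the index and the y values from the data.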

Basic Notations for Plotting as You Wish

What Can Be Done with Matplotlib

Line Plot

Area Plot

  • Stacked Area

  • Unstacked Area

Histogram

Bar Charts
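The plot types listed above can each be selected through the kind argument of the pandas .plot() method. The following is a rough sketch with a made-up DataFrame (columns "A" and "B" are hypothetical), drawing one example of each type in a 2 x 2 grid.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; omit in a notebook or desktop session
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical data, all values positive so they can be stacked
df = pd.DataFrame({"A": [1, 2, 3, 4], "B": [2, 1, 2, 3]},
                  index=[2010, 2011, 2012, 2013])

fig, axes = plt.subplots(2, 2, figsize=(10, 8))

df.plot(kind="area", stacked=True, ax=axes[0, 0], title="Stacked area")
df.plot(kind="area", stacked=False, ax=axes[0, 1], title="Unstacked area")
df["A"].plot(kind="hist", ax=axes[1, 0], title="Histogram")
df["A"].plot(kind="bar", ax=axes[1, 1], title="Bar chart")

fig.tight_layout()  # keep titles and tick labels from overlapping
```

A plain line plot needs no kind argument, since "line" is the default.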

Annotations

We can mark a feature of interest on a plot using the annotate method of the scripting layer (the pyplot interface). We will pass in the following parameters:

  • text: str, the text of the annotation (named s in Matplotlib versions before 3.3).

  • xy: Tuple specifying the (x,y) point to annotate (in this case, end point of arrow).

  • xytext: Tuple specifying the (x,y) point to place the text (in this case, start point of arrow).

  • xycoords: The coordinate system that xy is given in – ‘data’ uses the coordinate system of the object being annotated (default).

  • arrowprops: Takes a dictionary of properties to draw the arrow:

    • arrowstyle: Specifies the arrow style, '->' is standard arrow.

    • connectionstyle: Specifies the connection type. 'arc3' draws a straight line when its curvature (rad) is 0, which is the default.

    • color: Specifies the color of the arrow.

    • lw: Specifies the line width.
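Putting these parameters together, an annotation call might look like the sketch below. The data points, label text, and color are assumptions chosen for illustration; the annotation text is passed positionally so the call does not depend on whether the parameter is named s or text.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; omit in a notebook or desktop session
import matplotlib.pyplot as plt

years = [2010, 2011, 2012, 2013]   # made-up x values
values = [3, 5, 4, 7]              # made-up y values

plt.figure()
plt.plot(years, values)

ann = plt.annotate(
    "Upward trend",                # the text of the annotation
    xy=(2013, 7),                  # point to annotate (end point of the arrow)
    xytext=(2010, 6),              # where the text goes (start point of the arrow)
    xycoords="data",               # xy is given in data coordinates
    arrowprops=dict(
        arrowstyle="->",           # standard arrow
        connectionstyle="arc3",    # straight connection
        color="blue",              # arrow color
        lw=2,                      # line width
    ),
)
```

Because xytext is interpreted in the same coordinate system as xy unless textcoords is given, both points here are expressed in data units.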

About Subplot Creation

In a call such as subplot(nrows, ncols, plot_number), nrows and ncols notionally split the figure into (nrows * ncols) sub-axes, and plot_number identifies the particular subplot to create within that notional grid. plot_number starts at 1, increments across rows first, and has a maximum of nrows * ncols. For example, in a 2 x 3 grid, plot numbers 1 to 3 fill the top row from left to right and 4 to 6 fill the bottom row.
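The numbering scheme can be checked with a short sketch that fills a 2 x 3 grid and writes each plot_number into its own subplot. The grid size is an arbitrary choice for this example.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; omit in a notebook or desktop session
import matplotlib.pyplot as plt

nrows, ncols = 2, 3
fig = plt.figure(figsize=(9, 5))

# plot_number runs from 1 to nrows * ncols, across each row first
for plot_number in range(1, nrows * ncols + 1):
    ax = fig.add_subplot(nrows, ncols, plot_number)
    ax.text(0.5, 0.5, str(plot_number), ha="center", va="center", fontsize=20)
    ax.set_xticks([])   # hide ticks so only the number is visible
    ax.set_yticks([])
```

In the resulting figure, the numbers 1, 2, 3 appear along the top row and 4, 5, 6 along the bottom row.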

Often we want to place multiple plots within the same figure. For example, we might want to perform a side-by-side comparison of the box plot with the line plot of China and India’s immigration.

To visualize multiple plots together, we can create a figure (overall canvas) and divide it into subplots, each containing a plot. With subplots, we usually work with the artist layer instead of the scripting layer.
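A side-by-side layout of this kind might be sketched as follows, using the artist layer (explicit Figure and Axes objects). The immigration figures below are invented placeholders, not real statistics, and the DataFrame layout is an assumption.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; omit in a notebook or desktop session
import matplotlib.pyplot as plt
import pandas as pd

# Placeholder numbers for illustration only (not real immigration data)
df = pd.DataFrame(
    {"China": [100, 140, 180, 160, 200], "India": [90, 130, 150, 170, 190]},
    index=[2009, 2010, 2011, 2012, 2013],
)

fig = plt.figure(figsize=(12, 5))   # overall canvas
ax0 = fig.add_subplot(1, 2, 1)      # left subplot (1 row, 2 columns, plot 1)
ax1 = fig.add_subplot(1, 2, 2)      # right subplot (1 row, 2 columns, plot 2)

# Pass each Axes to pandas through the ax argument
df.plot(kind="box", ax=ax0)
ax0.set_title("Box plot")

df.plot(kind="line", ax=ax1)
ax1.set_title("Line plot")
ax1.set_xlabel("Year")
```

Working with ax0 and ax1 directly makes it explicit which subplot each title and label belongs to, which is harder to track with pyplot's implicit "current axes".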