Introduction to visualization and data analytics with Python

Python is a powerful high-level programming language that is used extensively for rapid prototyping. Python libraries are also among the most widely used tools for data analytics and visualization in engineering, computer science, and human-related fields of study. The basic workflow is to organize the raw data in arrays and numeric tables and then use dedicated visualization libraries to create plots and charts that convey an important message or reveal a hidden relationship in your dataset.

One of the main advantages of Python is that it makes it easy to create compelling, reproducible visualizations that can be shared among researchers and collaborators through a notebook environment called Jupyter Notebook. However, installing Python and all of the required dependencies can be difficult for new users. To lower this barrier, Google offers a hosted notebook service called Google Colab, which comes with many popular Python libraries preinstalled. In this section, we'll learn how to set up a Python environment using Google Colab.

Creating an account and logging in to the Google Colab environment:

To use the Python environment in Google Colab, you need a Google account. You can use your Clemson University Google account (g.clemson.edu) or any personal Google account.

To log in to your account and create a blank Python notebook, follow these steps:

  1. Go to the Google Colab website: https://colab.research.google.com/

  2. Click Sign in to enter your Google Colab account.

  3. Sign in with your g.clemson.edu Google account (or a personal Google account).

  4. After a successful login, create a new notebook.

If the new notebook is created successfully, you should see a blank notebook.
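Once the blank notebook is open, running a first cell is a quick way to confirm that the Python environment works. The following is a minimal sketch, not an official Colab check: the function name `describe_environment` is hypothetical, and the `in_colab` test assumes that Colab loads the `google.colab` module when the notebook starts, which is a common heuristic rather than a guarantee.

```python
import platform
import sys


def describe_environment():
    """Return a short summary of the Python runtime behind the notebook."""
    return {
        # Version of the Python interpreter, e.g. "3.x.y"
        "python_version": platform.python_version(),
        # Interpreter implementation, e.g. "CPython"
        "implementation": platform.python_implementation(),
        # Assumption: Colab imports google.colab at startup, so its
        # presence in sys.modules suggests we are running inside Colab.
        "in_colab": "google.colab" in sys.modules,
    }


info = describe_environment()
for key, value in info.items():
    print(f"{key}: {value}")
```

Paste the code into the empty cell and press Shift+Enter to run it. If the cell prints a Python version, the environment is ready to use.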

In the next section, we will introduce Python libraries such as numpy, pandas, matplotlib, and seaborn, which are useful for organizing and maintaining datasets and for creating visualizations.
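Before moving on, it can be useful to confirm that these four libraries are available in the notebook. The sketch below uses only the standard library to look each package up without importing it; the helper name `check_libraries` is hypothetical, and which packages are reported as available depends on the environment in which the cell runs.

```python
from importlib.util import find_spec

# Libraries named in this section
LIBRARIES = ["numpy", "pandas", "matplotlib", "seaborn"]


def check_libraries(names):
    """Map each package name to True if Python can locate it, else False."""
    return {name: find_spec(name) is not None for name in names}


availability = check_libraries(LIBRARIES)
for name, installed in availability.items():
    status = "available" if installed else "missing"
    print(f"{name}: {status}")
```

If a package is reported as missing, it can usually be installed from a notebook cell with `!pip install` followed by the package name.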

DOWNLOADING DATA: