Starting to read WITS data using Google Colab notebook
The following steps will serve as a guideline to read and clean data from UK Importation available on an internet page:
Create a new notebook using the Google Colab environment as described in section 1.2: Designing the solution.
Read data and create a data frame using the table with the names of the available countries from the UK overseas trade in goods statistics site:
The information from the previous site could be transformed into a table format (pandas data frame format) with the following command:
import pandas as pd
url = 'https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/1059771/OTS_IMP_2201.xlsx'
# To get all tables in all tabs.
dfs = pd.read_excel(url, sheet_name=None)
dfs
4. After the previous command, variable dfs contains the data in a format of a dictionary of data frames. Now, select the data of importations from EU by using the key 'EU Imports':
# Picking one of three available data frame.
df1 = dfs['EU Imports']
5. Remove the first two rows since it is not numerical data:
df1.drop(df1.index[[0,1]], inplace=True)
df1
6. Reset the values in the index column:
df1.drop(df1.tail(1).index,inplace=True)
7. The Python code with all the steps is summarized in this Google Colab (click on the link):
https://colab.research.google.com/drive/1mUKtrs0kJ8ve4t096mcArt6jxGhm3Q10?usp=sharing