This topic is about how to start building a Data Warehouse using KNIME software.
The source data is about airline flight information. It can be found in the following file FlightData.csv
The Data Warehouse will be stored in a Postgres database on the local machine
We will be creating a Snowflake Schema. In such you have a central fact table, surrounded by dimensions. There will be a separate table for each level in the Dimension , so for instance we will have 3 tables for the dimensional hierarchy Country -> City -> Airport.
The central Fact table will contain the following facts : Passengers, Freight and Mail. These represent the number of passengers and the kilograms of freight and mail transported on each flight.
The dimensions we will create will be : DIM_carrier (airlines) , DIM_cities (cities of origin and destination), DIM_country (countries of origin and destination) and DIM_origin (origin airport codes and names).
To interface with the Postgres database we will use the Web interface, which has the following URL : localhost/phppgadmin/ . Mind you this will only work if you have Postgres installed on your local machine. Later on I will add information on how to do this and on alternative approaches.
So, let's get started! In the following workflow we will be making a simple copy of the entire input CSV file to the database. This is often called a 'staging table'.