Building on our Flight data Data Warehouse example, in this topic we will see how to retrieve and incorporate new flight data from the Internet.
In this example, we will use the API from AviationStack. It has a very good free tier to experiment with. All you have to do is sign up and create an API key. Here is the link to the documentation.
Now we can start creating our Workflow. The objective of the workflow is to retrieve the latest flight data from AviationStack and write it into a new Staging table. The final objective is to incorporate this data as Facts into our Data Warehouse, but this will come later. I will now go over the nodes in our workflow:
In the GET Request node, we request the data from the AviationStack API. A required parameter is the API key. See above on how to get this. The information is returned in JSON format.
In the JSON Path node, we specify which data we want to extract from the JSON object.
As this data is returned as an array, we use the Ungroup to create one table row for each array element.
As in our result we have 2 columns called Airport, in the Column Renamer node we rename these to Origin_airport and Destination_airport.
In the Column Filter node, we remove the columns we are not interested in.
Finally, in the DB Writer node, we write the output to our staging table.
Below you see a picture of the newly created Flight data Staging Table