To see if our Change Data Capture setup is working, we will now re-execute our 'get_data_from_web' Workflow to get the most reent updates from AviationStack.
... and after execution, we now get the following rows in our aviationstack_cdc table :
Note that, in this instance all the operation_types are 'INSERTS' (new rows) , so all the 'before' values are empty. But we do have the 'after' values and also a timestamp. We will be using all this information to revisit our load_facts_from_aviationstack Workflow to incrementally load our Data Warehouse.
Below is the first new version of our updated facts loading Workflow. The important point is that, instead of getting its input from the aviationstack_flightdata_staging table, which might possibly have millions of rows, we just get the input from the aviationstack_cdc table, which only contains the new, changed or deleted rows.
Basically, this workflow is very similar to the one we saw before . The main difference being that here we are loading from the CDC table.
On the next page we will further refine this workflow, adding a load_statistics. table to keep track of the data loaded so far and clean up and/or back up our CDC table.