Now we will start looking at some more advanced topics to improve our Data Warehouse.
Up to this point we have assumed dimensions to be independent of time. While dimension table attributes are relatively static, they are not fixed forever. For example, an Airline may change its name. The approach we took so far is just to overwrite the old value with the new value. In some cases this may be fine. But what if the user one day wants to make a query using the previous name of the airline? In that case we have a problem because the old value is gone.
To solve this kind of issue, Slowly Changing Dimensions (SCD) were invented. With these, instead of overwriting the existing value in our Dimension table, we will add a new row to it. This, however, brings up another problem: so far we have used the Airline code (UNIQUE_CARRIER) as unique key in our dimension. But now we might have more than 1 row with the same Airline code.
To solve this issue, we introduce the concept of the Surrogate Key. This is a sequential and therefore unique number for each row in our Dimension table. So we end up with 2 keys in the Dimension table. The Surrogate key which is truly unique and the Airline code (which from now on we will call the Natural key), which may not be unique.
To make this more clear, on the next page , we will see the new workflow to load the Carrier SCD. We will also have to modify how we load our Facts table, which will include the Surrogate Keys instead of the Natural Keys, but that comes laterÂ