Sample Dimension Tables and Columns

Date dimension

Time dimension

FactX_Other_Dimension
 
Sample columns for other dimension tables
Column  /  AlternativeData Type Comment Description  /  Examples  Notes
DimensionX_Key
DimensionXKey


IntegerSurrogate key arbitrary number that uniquely identifies an instance of DimensionX for a period of time for type 2 slowly changing dimensions and forever for other dimensions, regardless of changes to a "natural" or "native" key in source systems.
1, 2, 3, ....
Margy Ross wrote in Intelligent Enterprise, October 1, 2006, "Another data warehousing best practice is to create surrogate keys--typically a meaningless simple integer--for the primary key in each dimension table."
DimensionX_Durable_Key
DimensionXDurableKey
 IntegerDurable surrogate key arbitrary number that uniquely identifies an instance of DimensionX across type 2 slowly changing dimension rows, regardless of changes to a "natural" or "native" key in source systems.
1, 2, 3, ....
Ralph Kimball wrote in Intelligent Enterprise, June 21, 2010, "we call it durable [key] to emphasize that the EDW must guarantee that it never changes, even if the source system has a special business rule that can cause [the natural key] to change...."



Gender_ID
GenderID
VarChar (6)
'lookup'
Text that identifies a gender, sometimes called sex.
Female, Male, and NA.
Gender_Short_ID
GenderShortID
Char (1)Text that identifies a gender, sometimes called sex, using its first letter.
F, M, and N (for not applicable).
   
Effective_Date_Time
StartDate
DateTimeDate and time a version of an instance of a type 2 slowly changing dimension (i.e., a row) became effective.
... 2011-05-20 23:59:59, 2011-05-21 00:00:00, ....
If preferred, we could omit the time portion of this column, leaving only the date.
Expired_Date_Time
EndDate
DateTimeDate and time a version of an instance of a type 2 slowly changing dimension (i.e., a row) expired, if it expired.
... 2011-05-20 23:59:59, 2011-05-21 00:00:00, …, 9999-12-31 00:00:00.
If a row has not expired, the value of this column (Expired_Date_Time) is 9999-12-31 00:00:00.
When a row has expired, the value of this column is exactly equal to the Effective_Date_Time of the next version of that instance of the type 2 slowly changing dimension.
If preferred, we could omit the time portion of this column, leaving only the date.
Ralph Kimball and Margy Ross wrote in The Data Warehouse Toolkit: The Complete Guide to Dimensional Modeling (Second Edition), on page 190:
"[T]he ... expiration date/time ... represents the date/time of the next [version] for this [instance].... The ... expiration date/time in the most current [version] must be set to an arbitrary time in the future."
Most_Recent_Flag
IsCurrent
Char (3)
'lookup'
Text that indicates whether a version of an instance of a type 2 slowly changing dimension (i.e., a row) is the most recent version (i.e., the current row).
No, Yes, and NA.
Margy Ross and Ralph Kimball wrote in Intelligent Enterprise, March 1, 2005, "With a type 2 change, ... rows contain ... the most-recent-row flag...."
Although this column contains no information that is not in Expired_Date_Time, it is easier for most users to understand that the current version has a Most_Recent_Flag of Yes than an Expired_Date_Time of 9999-12-31 00:00:00.
Source_System_Name
SourceSystemName
VarChar (50)
'lookup'
Name of the system that is the primary source of the data in a row.
Sale System, A/P System, and NA.
Source_System_DimensionX_ID
SourceSystemDimensionXID
VarChar (50)Text that identifies an instance of DimensionX in the primary source system.
IBM, 357910, and NA.
  • All columns should be Not Null since the Infobright loader will automatically replace all imported null values with default values such as 0 (zero) for numeric columns and an empty string (‘’) for string columns.
  • NA as the value of any column in any table means the value is not applicable, which includes cases where the value is not available efficiently. When possible, as suggested by Bob Becker of the Kimball Group in "Design Tip #128 Selecting Default Values for Nulls":

"Consider each condition separately and provide as many default rows as needed to provide the most complete understanding of the data as possible. At a minimum consider the following default rows:

  • Missing Value – The source system did not provide a value that would enable looking up the appropriate foreign key. This could indicate a missing data feed in the ETL process.
  • Not Happened Yet – The missing foreign key is expected to be available at a later point in time.
  • Bad Value – The source provided bad data or not enough data to determine the appropriate dimension row foreign key. This may be due to corrupted data at the source, or incomplete knowledge of the business rules for this source data for this dimension.
  • Not Applicable – This dimension is not applicable to this fact row.

Every dimension needs a set of default rows to handle these cases. Usually the ETL team assigns specific values such as 0, -1, -2, and -3 to the keys that describe these alternatives."

Subpages (3): Date Other Time