CDC
Change Data Capture (CDC)
CDC captures all the changes (insert, delete, update) to a table into a separate change table. SSIS can take advance of this feature to implement incremental load of data. This is very beneficial when the source data table is very large and takes long time to load.
Firstly, enable CDC on the database
USE EDW_000_Source_Database
GO
EXEC sys.sp_cdc_enable_db;
GO
Secondly, enable CDC on the table
USE EDW_000_Source_Database
GO
EXEC sys.sp_cdc_enable_table
@source_schema = N'dbo',
@source_name = N'CLIENTDETAILS',
@role_name = NULL,
@supports_net_changes = 1;
GO
The database will create a few cdc tables and stored procedures subsequently in the System Tables and System Stored Procedures to accommodate the capture of change data. All the changes are stored in cdc.dbo_XXX_CT table (here XXX is the original table name). The following an example:
__$end_lsn __$seqval __$operation __$update_mask __$start_lsn ClientID Given Surname
NULL 0x0000003A000001400002 2 0x01FFFF 0x0000003A000001400003 9738258 HELLO KING
__$start_lsn is the ID of the transaction that makes the changes.
__$operation is the type of change. 1: delete, 2: insert, 4: update
__$update_mask is the bit mask that indicates which columns of the table changed.
__$seqval is for ordering changes if there are more than one changes in a transaction.
__$end_lsn is reserved.
plus all the columns from the original table.
This CT table is updated every 5 seconds and all the changes are kept for 3 days by default.
Assuming ClientID is the primary key for the original table, the following query returns the last change for each transaction.
select *
,sys.fn_cdc_map_lsn_to_time(__$start_lsn) as Change_Time
from
(
select *
,ROW_NUMBER() over (partition by ClientID, __$start_lsn order by __$seqval DESC) as RowNumber
from cdc.dbo_XXX_CT
)A
where RowNumber = 1
To disable CDC
USE MyDB GO EXEC sys.sp_cdc_disable_db GO