One of the pressing concerns for businesses in today’s data-driven environment is data security and safety, preventing data breaches and hacking, and ramping up existing security to preserve and protect historical data. One of the ways opted by the leading database platforms like Microsoft SQL Server and Oracle is the introduction of various solutions like complex queries, timestamps, data audits, and triggers.
Microsoft was first of the mark in 2005 with its “after date”, “after delete”, and “after insert” features that kick started SQL Server CDC. However, there were certain flaws that needed to be ironed out and in 2008 launched a modified version of change data capture that was very effective.
SQL Server CDC tracks and captures all the changes that are made on the SQL Server database tables without using any other applications or programming. Till 2016, the SQL Server CDC feature was available only in the Enterprise Edition of the SQL Server but later was made standard across all versions.
SQL Server CDC records all activities that are applied to a SQL Server table like insert, update, and delete. The details of the changes are available in an easy-to-understand relational format. Column information and metadata required for applying changes to the target database are captured or the modified rows and stored in change tables that replicate the structure of the columns in the tracked source tables.
Apart from the Insert, Update, and Delete operations that are tracked by SQL Server CDC, all changes are recorded in a mirrored table with the same column structure as the source tables.
Some additional columns in the following format also track the changes made at the source database.
•One record is written by the SQL Server showing the inserted values for every Insert statement
•One record is written by the SQL Server showing the deleted values for every Delete statement
•Two records are written by the SQL Server for every Update statement. One shows data before the changes and the second after the changes are made.
The following data are present in the additional columns.
•__$start_lsn and __$end_lsn that show the commit log sequence number (LSN) assigned by the SQL Server Engine to the recorded change
•__$seqval that shows the order of that change related to other changes in the same transaction, __$operation that shows the operation type of the change, where 1 = delete, 2 = insert, 3 = update (before change), and 4 = update (after change)
•__$update_mask that is a bitmask defined for each captured column, identifying the updating columns
Two jobs are created by the SQL Server CDC – one populates database change tables with the changed information while the other cleans the change tables by deleting records older than the configurable retention. Changes that are tracked by the SQL Server CDCcan also be loaded from the OLTP source to the data warehouse OLAP source with T-SQL or ETL processes.