Working with Native CDC Support in AWS DMS


AWS Database Migration Service (AWS DMS) is a web service that migrates data from a source data store to a target data store. The source and the target are known as endpoints. A migration can run between endpoints that use the same database engine, such as Oracle to Oracle, or between different engines, such as Oracle to PostgreSQL. The only precondition is that at least one endpoint must be on an AWS service; migration between two on-premises databases is not possible.

In September 2018, Amazon launched native CDC (Change Data Capture) support for AWS DMS. With it, AWS DMS can control replication from a particular checkpoint, so replication can be stopped and restarted from any point. Customers on AWS DMS can now use the same marker that the database itself uses for commit sequencing, such as the LSN (log sequence number). Together with native start point support, this lets users process only the changes made since the last replication run.


This opens up more integration use cases. For example, you can load data into a target database with Oracle Data Pump or SQL Server BCP and then have AWS DMS start CDC from the corresponding log sequence number. Currently, AWS DMS CDC supports Oracle, SQL Server, and MySQL databases as well as Amazon Aurora with MySQL compatibility.

An AWS DMS task can capture ongoing changes to the source database while the migration is running. Alternatively, a task can be created that captures ongoing changes after the initial full-load migration to a supported target data store has completed. This is known as ongoing replication, or AWS DMS CDC, and it works by collecting changes from the database logs through the database engine’s native API.
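The task variants described above map onto the MigrationType parameter of the DMS CreateReplicationTask API. The following is a minimal sketch, assuming the boto3 SDK is used; the helper names, the default region, and the catch-all table mapping are illustrative assumptions, not part of the original article.

```python
import json

# MigrationType values accepted by the DMS CreateReplicationTask API.
FULL_LOAD = "full-load"                  # one-time copy, no change capture
CDC_ONLY = "cdc"                         # capture ongoing changes only
FULL_LOAD_AND_CDC = "full-load-and-cdc"  # full load, then ongoing replication


def build_create_task_kwargs(task_id, source_arn, target_arn, instance_arn,
                             migration_type):
    """Build keyword arguments for create_replication_task (hypothetical helper)."""
    if migration_type not in (FULL_LOAD, CDC_ONLY, FULL_LOAD_AND_CDC):
        raise ValueError(f"unsupported migration type: {migration_type}")

    # Assumption: include every table in every schema. Real tasks usually
    # narrow this selection rule.
    table_mappings = {
        "rules": [{
            "rule-type": "selection",
            "rule-id": "1",
            "rule-name": "1",
            "object-locator": {"schema-name": "%", "table-name": "%"},
            "rule-action": "include",
        }]
    }
    return {
        "ReplicationTaskIdentifier": task_id,
        "SourceEndpointArn": source_arn,
        "TargetEndpointArn": target_arn,
        "ReplicationInstanceArn": instance_arn,
        "MigrationType": migration_type,
        "TableMappings": json.dumps(table_mappings),
    }


def create_task(kwargs, region_name="us-east-1"):
    """Send the request to AWS DMS. Needs boto3 and valid AWS credentials."""
    import boto3  # imported lazily so the kwargs can be built without the SDK
    client = boto3.client("dms", region_name=region_name)
    return client.create_replication_task(**kwargs)
```

Keeping the request building separate from the network call makes it possible to inspect or log the parameters before anything is sent to AWS.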

A CDC native start point is a point in the database engine’s log that defines the time from which CDC can begin. An ongoing AWS DMS CDC task can be started from either of the following points.

· Custom CDC start time – The AWS CLI or the AWS Management Console can be used to provide a timestamp from which replication should start. AWS DMS starts the ongoing replication process from this custom CDC start time. It converts the timestamp to a native start point, such as an SCN for Oracle or an LSN for SQL Server, and then uses engine-specific methods to determine where in the source engine’s change stream to start the migration task.

· CDC native start point – It is also possible to start from a native point in the source engine’s transaction log. This approach is often preferred because a single timestamp can correspond to several native points in the transaction log. AWS DMS supports this feature for specific endpoints – SQL Server, PostgreSQL, Oracle, and MySQL. It also makes it possible to track progress through the log and recover failed replication tasks.
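The two start options above correspond to the CdcStartTime and CdcStartPosition parameters of the DMS StartReplicationTask API, which should not be supplied together. Here is a minimal sketch, assuming boto3; the helper names, the task ARN, and the sample timestamp are hypothetical placeholders, and the position string only illustrates a MySQL binary log file and offset.

```python
from datetime import datetime, timezone


def build_start_task_kwargs(task_arn, cdc_start_time=None,
                            cdc_start_position=None,
                            task_type="start-replication"):
    """Build keyword arguments for start_replication_task (hypothetical helper)."""
    # Only one kind of start point may be given for a task run.
    if cdc_start_time is not None and cdc_start_position is not None:
        raise ValueError("give either a start time or a native start position")

    kwargs = {
        "ReplicationTaskArn": task_arn,
        "StartReplicationTaskType": task_type,
    }
    if cdc_start_time is not None:
        # Custom CDC start time: DMS converts it to a native point itself.
        kwargs["CdcStartTime"] = cdc_start_time
    if cdc_start_position is not None:
        # Native start point: for example an Oracle SCN, a SQL Server LSN,
        # or a MySQL binary log file and offset.
        kwargs["CdcStartPosition"] = cdc_start_position
    return kwargs


def start_task(kwargs, region_name="us-east-1"):
    """Send the request to AWS DMS. Needs boto3 and valid AWS credentials."""
    import boto3  # imported lazily so the kwargs can be built without the SDK
    client = boto3.client("dms", region_name=region_name)
    return client.start_replication_task(**kwargs)


# Placeholder ARN, for illustration only.
TASK_ARN = "arn:aws:dms:us-east-1:123456789012:task:EXAMPLE"

by_time = build_start_task_kwargs(
    TASK_ARN, cdc_start_time=datetime(2018, 9, 1, 12, 0, tzinfo=timezone.utc))
by_position = build_start_task_kwargs(
    TASK_ARN, cdc_start_position="mysql-bin-changelog.000024:373")
```

Passing either dictionary to start_task would begin replication from the chosen point. A native position avoids the ambiguity of a timestamp that matches more than one log entry.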

AWS DMS CDC keeps the target database continually updated, in near real time, with the changes being made to the source.