The Amazon TimeHub team built a custom data validation framework on top of AWS DMS validation tasks to maintain data integrity during data replication between source and target databases.
AWS DMS provides two data validation options: validation as part of an ongoing replication task, and standalone validation-only tasks that validate data independently of replication tasks. The Amazon TimeHub team chose the latter to keep validation isolated from replication.
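As a rough illustration of the validation-only option, the sketch below builds the task settings JSON that turns a DMS task into a validation-only task. The `ValidationSettings` keys are standard AWS DMS task settings; the helper name `build_validation_only_settings` and the thread count are assumptions made for this example, not the TimeHub team's configuration.

```python
import json


def build_validation_only_settings(thread_count: int = 5) -> str:
    """Return DMS task settings JSON for a validation-only task.

    Hypothetical helper for illustration; only the keys inside
    "ValidationSettings" come from the AWS DMS task settings format.
    """
    settings = {
        "ValidationSettings": {
            # Turn on data validation for the task.
            "EnableValidation": True,
            # Validate only; the task does not migrate or replicate data.
            "ValidationOnly": True,
            # Number of validation threads (assumed value for this sketch).
            "ThreadCount": thread_count,
        }
    }
    return json.dumps(settings, indent=2)


settings_json = build_validation_only_settings()
print(settings_json)
```

The resulting string could be supplied as the task settings when the task is created, for example through the `ReplicationTaskSettings` parameter of the boto3 `create_replication_task` call.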
The team encountered limitations with AWS DMS data validation, such as a high number of false positives and no way to revalidate logged errors. To work around them, the team built a custom revalidation framework that eliminates false positives and devised a manual correction approach.
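The idea behind revalidation can be sketched in a few lines: take the keys that AWS DMS logged as validation failures (DMS records them in the `awsdms_validation_failures_v1` table on the target), re-read each row from both databases, and treat rows that now match as false positives. This is a minimal sketch and not the team's actual framework; `revalidate`, `fetch_source`, and `fetch_target` are hypothetical names, and in-memory dictionaries stand in for real database lookups.

```python
from typing import Callable, Dict, Iterable, List, Optional, Tuple

Row = Optional[Dict[str, object]]


def revalidate(
    failed_keys: Iterable[str],
    fetch_source: Callable[[str], Row],
    fetch_target: Callable[[str], Row],
) -> Tuple[List[str], List[str]]:
    """Split logged failures into false positives and confirmed mismatches.

    A key is a false positive when the source and target rows are equal at
    re-check time (including the case where the row is absent on both sides).
    """
    false_positives: List[str] = []
    confirmed: List[str] = []
    for key in failed_keys:
        if fetch_source(key) == fetch_target(key):
            false_positives.append(key)
        else:
            confirmed.append(key)
    return false_positives, confirmed


# Hypothetical data standing in for source and target lookups.
source = {"1": {"hours": 8}, "2": {"hours": 6}, "3": {"hours": 4}}
target = {"1": {"hours": 8}, "2": {"hours": 7}}

false_positives, confirmed = revalidate(["1", "2", "3"], source.get, target.get)
print(false_positives)  # ['1']
print(confirmed)        # ['2', '3']
```

In this sketch, the keys left in `confirmed` would be the candidates for manual correction.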
The team also explored validating partial data with table-level filters for cases where replication is disrupted and tasks need to be restarted.
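A table-level filter is expressed in the task's table mapping as a selection rule with a `filters` section. The sketch below builds such a mapping so that only rows at or above a given key value are in scope. The rule structure follows the AWS DMS table-mapping format; the schema, table, and column names and the helper `build_filtered_mapping` are hypothetical and chosen only for illustration.

```python
import json


def build_filtered_mapping(schema: str, table: str, column: str, start_value: str) -> str:
    """Return table-mapping JSON that selects rows where column >= start_value."""
    rule = {
        "rule-type": "selection",
        "rule-id": "1",
        "rule-name": "partial-validation",
        "object-locator": {"schema-name": schema, "table-name": table},
        "rule-action": "include",
        "filters": [
            {
                "filter-type": "source",
                "column-name": column,
                "filter-conditions": [
                    # "gte" keeps rows with a value greater than or equal to start_value.
                    {"filter-operator": "gte", "value": start_value}
                ],
            }
        ],
    }
    return json.dumps({"rules": [rule]}, indent=2)


# Hypothetical schema, table, and column names.
mapping_json = build_filtered_mapping("timehub", "timesheet", "timesheet_id", "500000")
print(mapping_json)
```

Restricting the validation scope this way means a restarted task does not have to recheck rows that were already validated before the disruption.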
With the custom framework built on top of AWS DMS validation tasks, operational teams can maintain data integrity during ongoing replication between source and target databases and avoid integrity issues caused by unplanned failures at the source database, AWS DMS, or the target database.