We all assume that the data in our source system is perfect and ready to migrate. Sooner or later, however, we realize that this is not the case.
New systems have new rules, and legacy data may violate them. For example, a contact email can be mandatory in the new system but optional in a 20-year-old legacy system. As a result, many contacts in the source system will have a blank email field. If you have not analyzed and cleansed the data properly, those records reach the target system without the value it expects. You end up with numerous blank mandatory fields, which can lead to a migration failure.
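A check for this kind of gap can be run against an export of the source data before any migration script is written. The following is a minimal sketch, assuming records are available as Python dictionaries; the function name `find_missing_mandatory` and the field names `name` and `email` are hypothetical and only illustrate the idea.

```python
def find_missing_mandatory(records, mandatory_fields):
    """Return (record index, field name) pairs where a mandatory field is blank."""
    problems = []
    for index, record in enumerate(records):
        for field in mandatory_fields:
            value = record.get(field)
            # Treat a missing key, None, and whitespace-only text as blank.
            if value is None or str(value).strip() == "":
                problems.append((index, field))
    return problems


# Hypothetical legacy export: email was optional in the old system.
legacy_contacts = [
    {"name": "Ada", "email": "ada@example.com"},
    {"name": "Bob", "email": ""},
    {"name": "Cy"},
]

for index, field in find_missing_mandatory(legacy_contacts, ["email"]):
    print(f"record {index}: mandatory field '{field}' is blank")
```

The resulting list can be handed to the business users, who can decide whether to fill in the values, supply a default, or leave the records out of scope.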
It’s also important to understand that mines can be hidden in historical data. For example, if your source system holds amounts in European currencies that no longer exist, they need to be converted to euros before migrating. Identify such issues and fix them before migrating to the target system.
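The currency case could be handled with a small conversion step like the one below. This is a sketch under assumptions: the function name `to_eur` is hypothetical, amounts are assumed to arrive as strings, and only three of the fixed euro conversion rates are listed (Deutsche Mark, French franc, Italian lira). Any other currency is rejected rather than guessed, so unknown codes surface during analysis instead of during the migration run.

```python
from decimal import Decimal, ROUND_HALF_UP

# Fixed conversion rates, expressed as units of legacy currency per 1 euro.
# Only three are listed here; extend the table from an official source.
LEGACY_PER_EUR = {
    "DEM": Decimal("1.95583"),
    "FRF": Decimal("6.55957"),
    "ITL": Decimal("1936.27"),
}

CENT = Decimal("0.01")


def to_eur(amount, currency):
    """Convert an amount (given as a string) to euros, rounded to cents."""
    value = Decimal(amount)
    if currency == "EUR":
        return value.quantize(CENT, rounding=ROUND_HALF_UP)
    rate = LEGACY_PER_EUR.get(currency)
    if rate is None:
        # Fail loudly so unknown currencies are found before migration.
        raise ValueError(f"no conversion rate configured for {currency!r}")
    return (value / rate).quantize(CENT, rounding=ROUND_HALF_UP)


print(to_eur("195.583", "DEM"))
```

`Decimal` is used instead of `float` so that monetary amounts are not affected by binary rounding errors.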
These are practical examples of what we’ve faced with many of our clients, and they should be taken into account during data quality checks, data analysis, and data cleansing. Data cleansing is not only about missing records; it is about ensuring that the records in the source system are consistent and that they don’t break your scripts when you migrate them to the target system.
Data quality significantly influences the migration effort. Don’t risk wasting that effort by running the scripts in a hurry; do a proper analysis before transferring the data. The further back you go in the history, the bigger the mess you’ll discover, because the new system may include many changes that were not present in the old legacy system.
Hence, at an early stage, it’s crucial to decide how much history you want to transfer to the new system. For instance, if you have 100,000 records in total, talking to the business users may reveal that not all of them are still relevant and that only 30,000 need to be migrated. Just by talking to the business users, you have decreased the workload by 70%. Since you don’t have to migrate everything, your scope and effort are reduced. It’s therefore highly recommended to involve stakeholders in the process and do a proper data analysis of both systems.
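Once the business users have agreed on a scoping rule, the effect on the workload can be measured directly. The sketch below assumes the rule is a simple cutoff date on a hypothetical `last_activity` field; in practice the rule is whatever the stakeholders decide, and the function names are illustrative only.

```python
from datetime import date


def select_in_scope(records, cutoff):
    """Keep only records with activity on or after the agreed cutoff date."""
    return [r for r in records if r["last_activity"] >= cutoff]


def workload_reduction(total, in_scope):
    """Percentage of records that no longer need to be migrated."""
    return 100 * (total - in_scope) / total


# Hypothetical sample: two of three records fall before the cutoff.
sample = [
    {"id": 1, "last_activity": date(2004, 5, 1)},
    {"id": 2, "last_activity": date(2021, 3, 15)},
    {"id": 3, "last_activity": date(2010, 11, 30)},
]
in_scope = select_in_scope(sample, cutoff=date(2015, 1, 1))
print([r["id"] for r in in_scope])

# The figures from the example above: 30,000 of 100,000 records in scope.
print(workload_reduction(100_000, 30_000))
```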
Pre-Migration Data Health Check
In-depth data analysis, data cleansing, and an understanding of the legacy system are necessary for a smooth rollout. You must do a pre-migration data health check on both the source system and the target system. Make sure the project has time dedicated to removing unused, outdated, duplicate, and incorrect data at multiple stages, and treat this as a mandatory step for a successful migration.
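One part of such a health check, finding duplicates, can be sketched as follows. It assumes that two records are duplicates when a chosen key field matches after trimming whitespace and ignoring case; the function names and the `email` key are hypothetical, and real matching rules are usually agreed with the business users.

```python
from collections import Counter


def normalize(value):
    """Trim whitespace and lowercase so near-identical keys compare equal."""
    return "" if value is None else str(value).strip().lower()


def find_duplicate_keys(records, key_field):
    """Return the normalized key values that occur more than once."""
    counts = Counter(normalize(r.get(key_field)) for r in records)
    # Blank keys are skipped here; they belong to the missing-data check.
    return sorted(key for key, n in counts.items() if n > 1 and key)


# Hypothetical contact export with one duplicate hidden by case and spacing.
contacts = [
    {"id": 1, "email": "A@x.com"},
    {"id": 2, "email": "a@x.com "},
    {"id": 3, "email": "b@x.com"},
    {"id": 4, "email": None},
    {"id": 5, "email": ""},
]
print(find_duplicate_keys(contacts, "email"))
```

Running the same check on the target system after a trial load helps confirm that the cleansing done on the source side actually carried through.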