We make Web apps, Mobile apps and Data analytics happen !
Data Migration
Data Migration ? • Data migration is the process of transferring data. It has to be considered before implementing, upgrading or consolidating a new system. • It is usually performed programmatically to achieve automated migration, freeing up human resources from tedious tasks. • It occurs for various reasons such as server replacements, maintenance or upgrade, application migration, website consolidation and data center relocation.
• To achieve an effective data migration, old system data should be mapped to new system by properly designing data extraction and loading procedures. • This design relates old data formats to new system's formats. Data migration may involve many phases including data extraction where data is read from old system and data loading where data is written into new system.
Data migration Steps
Analyze Design Extraction Cleansing Load Verification
Analyze field level mapping Excel File
Flat Files/E xcel
Data base
Applicati on
2.Extracted data from Source
3.Cleansing & evaluation 6.Repeat process till error free data migration Storing Area 5.Moving data to destination 4.Load data using upload programs db & Verification process
Destinatio n Database
Data Upload Programs
Excel File
Analyze
• It’s important to define the scope by discussing with key users and stakeholders. this is similar as gathering requirements but it also includes comparing with old systems data and to find out where it is stored. • By proper analysis , it can be defined what needs to be migrated and the scope. No need of junk data.
Design • Define the mapping and discuss with business users. Microsoft Excel is a very good tool to define mapping and to link requirements. • Keep the data model as simple as possible in every migration. This makes it easier to trace errors and to understand the script, but also to run an update. • When you use files, you should use one folder with a unique file name. For every migration, you can specify a filter for the file name.
Extraction • The first part of the process involves extracting the data from source system(s). • In many cases this represents the most important aspect of data migration, since extracting data sets the stage for success of subsequent processes. • In data extraction database structure has to be analyzed with all possible scenarios and its mapping relations.
Cleansing • Data cleansing is the process of detecting, correcting or removing incomplete, incorrect, inaccurate, irrelevant, out-of-date, corrupt, redundant, incorrectly formatted, duplicate, inconsistent, etc. records from a record set, table or database. Parsing
• Steps in Data Cleansing Consolidating
Matching
Correcting
Standardizing
Loading • This phase loads the data into destination that may be a simple delimited flat file. Depending on requirements of the organization, this process varies widely. • Import data into destination table or database with its own mapping relations.
Verification • After loading into new system, results has to be verified to determine whether it is accurately translated, completed, and supports processes in new system. • During verification, there may be a need for a parallel run of both systems to identify areas of disparity and forestall erroneous data loss. • For applications of moderate to high complexity are commonly repeated several times before the new system is deployed.
Difficulty Criteria o o o o o o o o o o o o o
Number and size of databases in Application Number of Tables per database Total Number of Attributes % of attributes that have had multiple definitions over time % of attributes in terms of synonyms and antonyms Number of DB dependent processes Number of one time Interfaces Number of ongoing Interfaces Number of Data Quality problems and issues to fix Knowledge/Documentation of Data Quality issues Ease of de-duping similar entities in the same DB Ease of matching same entity records across multiple DBs Completeness of the functional documentation
Our Successful Implementation Environment • Source :PHP, MySQL o Table count:205 • Destination: Asp.net MVC, SQL 2014 R2 o SAAS Model Table count:(Main db with 29 tables, domain db with 42 tables)
Main db
SAAS Model Source DB(MySQL)
Data Migration Process Client dbs
1. Define the mapping and discuss with business users. Microsoft Excel is a very good tool to define mapping and to link requirements.
2.Extraction of data from MySQL is coded in python and stored the resultant data in excel. 2.1. Data fetched from individual tables based on the analysis, that consists of valid and invalid data. Remove all inconsistent, duplicate and test/dummy data. 2.2. Once these data are fetched we restructured those data based on the target database tables and exported into excel sheet with its mapping relations. 2.3 Then these data can be moved to cleansing process.
Python sample code:
3. After completion of extraction, moving for cleansing process such as Detecting, correcting or removing Incomplete Incorrect Inaccurate Irrelevant Corrupt Out-of-date Inconsistent Duplicate Redundant
4.Once cleansing process was successfully completed then initiate the loading process. 4.1 Programmed a script in SQL to migrate the cleansing data into destination or targeted tables. 4.2 Have to maintain primary key for relational purpose for other tables.
SQL Sample Script:
• 5. There may be a need for a parallel run of both systems to identify areas of disparity and forestall erroneous data loss. • 6. Check the data flow of an application. Till we get error free application this has be repeated.
Excel File
Data Base(MySQL)
Analyzed Mapping & Mandatory fields (Analyzed and design sheet)
Application(P HP)
2.Extracted data from Source using Python
3.Cleansing & evaluation 6.Repeat the process till error free data migration Storing Area 5.Moving data to destination db & Verification process sql 4.Load data using upload programs SQL script process Destinatio n Database
Data Upload Programs
Excel File
Challenges faced 1. Data Migration for SAAS model(1 db to many subdb). Soln: script programming created unique subdomain with existing column values every client and create a sub domain database based on respective tables.
2. Understanding database structure from source is biggest challenges as it might have unused tables. 3. Reporting feature has to be redesigned in new system. Soln: With Repeated testing required to find exact column name.
4. Chances of getting wrong data for wrong column as naming convention was improper.
Contact us • Data migration projects are not as easy as new development and it requires skill set, knowledge. • W2S solutions worked on many migration projects and capable of handling projects with massive data. • • • •
Let us know how we can help. Email id: Sales@w2ssolutions.com Phone No: +1 – 512-375-4345 Web: www.w2ssolutions.com