Data Migration: Process, Types, and Golden Rules to Follow

In our daily lives, moving information from one location to another is no more than a simple copy-and-paste operation. Everything gets far more complicated when it comes to transferring millions of data units into a new system.

However, many companies treat even a massive data migration as a low-level, two-click task. This initial underestimation translates into extra time and money spent. Recent studies revealed that 55 percent of data migration projects went over budget and 62 percent proved harder than expected or failed outright.

How to avoid falling into the same trap? The answer lies in understanding the essentials of the data migration process, from its triggers to final phases.

If you are already familiar with theoretical aspects of the problem, you may jump to the section Data Migration Process where we give practical recommendations. Otherwise, let’s start from the most basic question: What is data migration?

What is data migration?

What makes companies migrate their data assets.

Usually, data migration comes as a part of a larger project such as

  • legacy software modernization or replacement,
  • the expansion of system and storage capacities,
  • the introduction of an additional system working alongside the existing application,
  • the shift to a centralized database to eliminate data silos and achieve interoperability,
  • moving IT infrastructure to the cloud, or
  • merger and acquisition (M&A) activities when IT landscapes must be consolidated into a single system.

Data migration is sometimes confused with other processes involving massive data movements. Before we go any further, it’s important to clear up the differences between data migration, data integration, and data replication.

Data migration vs data integration

Data migration is a one-way journey that ends once all the information is transported to a target location. Integration, by contrast, can be a continuous process that involves streaming real-time data and sharing information across systems.

Data migration vs data replication

Data replication can be a part of the data integration process. Also, it may turn into data migration — provided that the source storage is decommissioned.

Now, we’ll discuss only data migration — a one-time and one-way process of moving to a new house, leaving an old one empty.

Main types of data migration

Six major types of data migration.

Storage migration

  • from paper to digital documents,
  • from hard disk drives (HDDs) to faster and more durable solid-state drives (SSDs), or
  • from mainframe computers to cloud storage.
Many big enterprises still rely on mainframes to run their business processes. Source: TechRepublic

The primary reason for this shift is a pressing need for technology upgrades rather than a lack of storage space. When it comes to large-scale systems, the migration process can take years. For example, Sabre, the second-largest global distribution system (GDS), has been moving its software and data from mainframe computers to virtual servers for over a decade. Its Migration Period is expected to be entirely completed in 2023.

Database migration

So, most of the time, database migration means

  • an upgrade to the latest version of DBMS (so-called homogeneous migration), or
  • a switch to a new DBMS from a different provider, for example, from MySQL to PostgreSQL or from Oracle to MSSQL (so-called heterogeneous migration).

The latter case is tougher than the former, especially if target and source databases support different data structures. The task becomes even more challenging when you have to move data from legacy databases such as Adabas, IMS, or IDMS.

Application migration

Data center migration

Business process migration

Cloud migration

Depending on the volume of data and the differences between source and target locations, migration can take anywhere from about 30 minutes to months or even years. The complexity of the project and the cost of downtime will define how exactly to carry out the process.

Approaches to data migration

Big bang data migration

In a big bang scenario, you move all data assets from source to target environment in one operation, within a relatively short time window.

Systems are down and unavailable to users for as long as data moves and undergoes transformations to meet the requirements of the target infrastructure. The migration is typically executed during a public holiday or weekend, when customers presumably don’t use the application.

The big bang approach allows you to complete migration in the shortest possible time and saves the hassle of working across the old and new systems simultaneously. However, in the era of Big Data, even midsize companies accumulate huge volumes of information, while the throughput of networks and API gateways is limited. This constraint must be considered from the start.
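A rough calculation can show early on whether a big bang window is realistic. The sketch below estimates the raw transfer time for a given data volume and link speed. The function names and the figures used (2 TB of data, a 200 Mbit/s link, 70 percent sustained efficiency) are illustrative assumptions, not measurements, and the estimate ignores transformation and validation time.

```python
def transfer_hours(data_gb: float, throughput_mbps: float, efficiency: float = 0.7) -> float:
    """Estimate the hours needed to move data_gb gigabytes over a network link.

    throughput_mbps is the nominal link speed in megabits per second;
    efficiency is the assumed fraction of that speed sustained in practice.
    """
    if data_gb < 0 or throughput_mbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("invalid input")
    data_megabits = data_gb * 1000 * 8  # GB -> MB -> megabits (decimal units)
    seconds = data_megabits / (throughput_mbps * efficiency)
    return seconds / 3600


def fits_window(data_gb: float, throughput_mbps: float, window_hours: float,
                efficiency: float = 0.7) -> bool:
    """Return True if the estimated transfer fits into the downtime window."""
    return transfer_hours(data_gb, throughput_mbps, efficiency) <= window_hours


# Hypothetical scenario: 2 TB over a 200 Mbit/s link.
print(f"{transfer_hours(2000, 200):.1f} h")  # prints "31.7 h"
print(fits_window(2000, 200, window_hours=48))  # True: fits a weekend
print(fits_window(2000, 200, window_hours=24))  # False: does not fit one day
```

If the estimate comes close to the available window, that is usually a signal to reduce the scope or to consider a phased approach instead.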

Verdict. The big bang approach fits small companies or businesses working with small amounts of data. It doesn’t work for mission-critical applications that must be available 24/7.

Trickle data migration

Also known as a phased or iterative migration, this approach brings Agile experience to data transfer. It breaks down the entire process into sub-migrations, each with its own goals, timelines, scope, and quality checks.

Trickle migration involves parallel running of the old and new systems and transferring data in small increments. As a result, you take advantage of zero downtime and your customers are happy because of the 24/7 application availability.

On the downside, the iterative strategy takes much more time and adds complexity to the project. Your migration team must track which data has already been transferred and ensure that users can switch between the two systems to access the required information.
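One common way to track what has already been moved is a checkpoint stored next to the migrated data. The following is a minimal sketch using in-memory SQLite databases; the `customers` table, the `migration_checkpoint` table, and the `migrate_batch` function are hypothetical names chosen for illustration, and the sketch assumes rows have a monotonically increasing `id`.

```python
import sqlite3


def migrate_batch(source, target, batch_size=100):
    """Copy the next batch of rows and advance the checkpoint.

    Returns the number of rows copied (0 when nothing is left).
    """
    last_id = target.execute("SELECT last_id FROM migration_checkpoint").fetchone()[0]
    rows = source.execute(
        "SELECT id, name FROM customers WHERE id > ? ORDER BY id LIMIT ?",
        (last_id, batch_size),
    ).fetchall()
    if not rows:
        return 0
    with target:  # one transaction: rows and checkpoint commit together
        target.executemany("INSERT INTO customers (id, name) VALUES (?, ?)", rows)
        target.execute("UPDATE migration_checkpoint SET last_id = ?", (rows[-1][0],))
    return len(rows)


# Demo data: 250 customers in the old system.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
source.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(i, f"customer-{i}") for i in range(1, 251)],
)
source.commit()

target = sqlite3.connect(":memory:")
target.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
target.execute("CREATE TABLE migration_checkpoint (last_id INTEGER NOT NULL)")
target.execute("INSERT INTO migration_checkpoint VALUES (0)")
target.commit()

batches = []
while (copied := migrate_batch(source, target)) > 0:
    batches.append(copied)
print(batches)  # prints [100, 100, 50]
```

Because the rows and the checkpoint are written in the same transaction, an interrupted run can resume from the last committed batch without duplicating data.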

Another way to perform trickle migration is to keep the old application entirely operational until the end of the migration. As a result, your clients will use the old system as usual and switch to the new application only when all data is successfully loaded to the target environment.

However, this scenario doesn’t make things easier for your engineers. They have to make sure that data is synchronized in real time across two platforms once it is created or changed. In other words, any changes in the source system must trigger updates in the target system.
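As an illustration of how source changes can drive target updates, here is a small sketch based on database triggers and a change log that is replayed against the target. It uses SQLite only to stay self-contained; the `orders` and `change_log` tables and the `sync_changes` function are hypothetical. Note that this is log polling, which is near real time rather than strictly real time, and production projects would more likely rely on a dedicated change data capture tool.

```python
import sqlite3

source = sqlite3.connect(":memory:")
source.executescript("""
CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT);
CREATE TABLE change_log (
    seq INTEGER PRIMARY KEY AUTOINCREMENT,
    order_id INTEGER NOT NULL,
    op TEXT NOT NULL
);
-- Every write to orders leaves a record in change_log.
CREATE TRIGGER orders_ins AFTER INSERT ON orders
BEGIN INSERT INTO change_log (order_id, op) VALUES (NEW.id, 'upsert'); END;
CREATE TRIGGER orders_upd AFTER UPDATE ON orders
BEGIN INSERT INTO change_log (order_id, op) VALUES (NEW.id, 'upsert'); END;
CREATE TRIGGER orders_del AFTER DELETE ON orders
BEGIN INSERT INTO change_log (order_id, op) VALUES (OLD.id, 'delete'); END;
""")

target = sqlite3.connect(":memory:")
target.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT)")


def sync_changes(source, target, after_seq=0):
    """Replay logged changes with seq > after_seq; return the last seq applied."""
    changes = source.execute(
        "SELECT seq, order_id, op FROM change_log WHERE seq > ? ORDER BY seq",
        (after_seq,),
    ).fetchall()
    with target:  # apply the whole batch atomically
        for seq, order_id, op in changes:
            if op == "delete":
                target.execute("DELETE FROM orders WHERE id = ?", (order_id,))
            else:
                row = source.execute(
                    "SELECT id, status FROM orders WHERE id = ?", (order_id,)
                ).fetchone()
                if row is not None:  # the row may have been deleted later on
                    target.execute(
                        "INSERT OR REPLACE INTO orders (id, status) VALUES (?, ?)", row
                    )
            after_seq = seq
    return after_seq


# Simulate activity in the old system, then synchronize.
source.execute("INSERT INTO orders VALUES (1, 'new')")
source.execute("INSERT INTO orders VALUES (2, 'new')")
source.execute("UPDATE orders SET status = 'shipped' WHERE id = 1")
source.execute("DELETE FROM orders WHERE id = 2")
source.commit()

last_seq = sync_changes(source, target)
print(target.execute("SELECT id, status FROM orders").fetchall())  # prints [(1, 'shipped')]
```

Passing the returned sequence number to the next `sync_changes` call ensures that only new changes are replayed.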

Verdict. Trickle migration is the right choice for medium and large enterprises that can’t afford long downtime but have enough expertise to face technological challenges.

Data migration process

  • planning,
  • data auditing and profiling,
  • data backup,
  • migration design,
  • execution,
  • testing, and
  • post-migration audit.
Key phases of the data migration process.

Below, we’ll outline what you should do at each phase to transfer your data to a new location without losses, extensive delays, or ruinous budget overruns.

Planning: create a data migration plan and stick to it

Step 1 — refine the scope. The key goal of this step is to filter out any excess data and to define the smallest amount of information required to run the system effectively. So, you need to perform a high-level analysis of source and target systems, in consultation with data users who will be directly impacted by the upcoming changes.

Step 2 — assess source and target systems. A migration plan should include a thorough assessment of the current system’s operational requirements and how they can be adapted to the new environment.

Step 3 — set data standards. This will allow your team to spot problem areas across each phase of the migration process and avoid unexpected issues at the post-migration stage.

Step 4: estimate budget and set realistic timelines. After the scope is refined and systems are evaluated, it’s easier to select the approach (big bang or trickle), estimate the resources needed for the project, and set schedules and deadlines. According to Oracle estimates, an enterprise-scale data migration project lasts six months to two years on average.

Data auditing and profiling: employ digital tools

Auditing and profiling are tedious, time-consuming, and labor-intensive activities, so in large projects, automation tools should be employed. Among popular solutions are Open Studio for Data Quality, Data Ladder, SAS Data Quality, Informatica Data Quality, and IBM InfoSphere QualityStage, to name a few.
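To give a sense of what profiling looks for, here is a minimal sketch in plain Python that reports missing values, distinct values, duplicates, and mixed types for one column. The `profile_column` function and the sample records are hypothetical; the tools listed above do far more than this.

```python
from collections import Counter


def is_missing(value):
    """Treat None and blank strings as missing values."""
    return value is None or str(value).strip() == ""


def profile_column(rows, column):
    """Return basic quality metrics for one column of a list of dict records."""
    values = [row.get(column) for row in rows]
    present = [v for v in values if not is_missing(v)]
    counts = Counter(present)
    return {
        "rows": len(values),
        "missing": len(values) - len(present),
        "distinct": len(counts),
        "duplicates": sum(c - 1 for c in counts.values() if c > 1),
        "types": sorted({type(v).__name__ for v in present}),
    }


# Illustrative sample with typical problems: blanks, repeats, mixed types.
records = [
    {"id": 1, "email": "ann@example.com", "age": 34},
    {"id": 2, "email": "", "age": "41"},
    {"id": 3, "email": "ann@example.com", "age": None},
    {"id": 4, "email": "bob@example.com"},
]

print(profile_column(records, "email"))
print(profile_column(records, "age"))
```

In this sample, the `email` column has one missing value and one duplicate, while `age` mixes integers with strings, which is exactly the kind of finding that should feed into the mapping rules.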

Data backup: protect your content before moving it

Migration design: hire an ETL specialist

Though several technologies can be used for data migration, extract, transform, and load (ETL) is the preferred one. It makes sense to hire an ETL developer, or a dedicated software engineer with deep expertise in ETL processes, especially if your project deals with large data volumes and complex data flows.

At this phase, ETL developers or data engineers create scripts for data transition or choose and customize third-party ETL tools. An integral part of ETL is data mapping. In the ideal scenario, it involves not only an ETL developer, but also a system analyst who knows both the source and target systems, and a business analyst who understands the value of the data to be moved.
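Data mapping can be expressed as a declarative table that pairs each source field with a target field and a conversion rule. The sketch below shows the idea; the field names (`CUST_NM`, `BIRTH_DT`, `ACTIVE_FLG`), the date format, and the `transform_record` function are assumptions made up for illustration.

```python
from datetime import datetime

# Hypothetical mapping: source field -> (target field, conversion function).
FIELD_MAP = {
    "CUST_NM": ("full_name", lambda v: v.strip().title()),
    "BIRTH_DT": (
        "birth_date",
        # Assumes the legacy system stores dates as DD/MM/YYYY.
        lambda v: datetime.strptime(v, "%d/%m/%Y").date().isoformat(),
    ),
    "ACTIVE_FLG": ("is_active", lambda v: v.upper() == "Y"),
}


def transform_record(source_row):
    """Apply FIELD_MAP to one source record and return the target record."""
    target_row = {}
    for source_field, (target_field, convert) in FIELD_MAP.items():
        if source_field not in source_row:
            raise KeyError(f"missing source field: {source_field}")
        target_row[target_field] = convert(source_row[source_field])
    return target_row


legacy = {"CUST_NM": "  jane DOE ", "BIRTH_DT": "07/03/1985", "ACTIVE_FLG": "y"}
print(transform_record(legacy))
# prints {'full_name': 'Jane Doe', 'birth_date': '1985-03-07', 'is_active': True}
```

Keeping the mapping in one structure makes it easier for system and business analysts to review the rules without reading the rest of the script.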

The duration of this stage depends mainly on the time needed to write scripts for ETL procedures or to acquire appropriate automation tools. If all required software is in place and you only have to customize it, migration design will take a few weeks. Otherwise, it may span a few months.

Execution: focus on business goals and customer satisfaction

If you’ve chosen a phased approach, make sure that migration activities don’t hinder usual system operations. Besides, your migration team must communicate with business units to refine when each sub-migration is to be rolled out and to which group of users.

Data migration testing: check data quality across phases

Frequent testing ensures the safe transit of data elements and their high quality and congruence with requirements when entering the target infrastructure. You may learn more about the details of testing the ETL process from our dedicated article.
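A basic reconciliation check compares records in the source and target by key and by content hash. The sketch below is one possible approach, assuming both sides can be exported as lists of dictionaries with the same field names after transformation; the `reconcile` and `row_fingerprint` functions are hypothetical.

```python
import hashlib


def row_fingerprint(row):
    """Hash a record in a canonical field order so equal rows hash equally."""
    canonical = "|".join(f"{key}={row[key]!r}" for key in sorted(row))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def reconcile(source_rows, target_rows, key="id"):
    """Report rows that are missing, unexpected, or different in the target."""
    source = {row[key]: row_fingerprint(row) for row in source_rows}
    target = {row[key]: row_fingerprint(row) for row in target_rows}
    shared = source.keys() & target.keys()
    return {
        "missing_in_target": sorted(source.keys() - target.keys()),
        "unexpected_in_target": sorted(target.keys() - source.keys()),
        "mismatched": sorted(k for k in shared if source[k] != target[k]),
    }


source_rows = [
    {"id": 1, "name": "Ann"},
    {"id": 2, "name": "Bob"},
    {"id": 3, "name": "Cy"},
]
target_rows = [
    {"id": 1, "name": "Ann"},
    {"id": 2, "name": "Bobby"},
    {"id": 4, "name": "Dee"},
]

print(reconcile(source_rows, target_rows))
# prints {'missing_in_target': [3], 'unexpected_in_target': [4], 'mismatched': [2]}
```

Running such a check after each sub-migration, rather than only at the end, helps localize problems to the batch that introduced them.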

Post-migration audit: validate results with key clients

Golden rules of data migration

  • Use data migration as an opportunity to reveal and fix data quality issues. Set high standards to improve data and metadata as you migrate them.
  • Hire data migration specialists and assign a dedicated migration team to run the project.
  • Minimize the amount of data to be migrated.
  • Profile all source data before writing mapping scripts.
  • Allocate considerable time to the design phase as it has a high impact on project success.
  • Don’t be in a hurry to switch off the old platform. Sometimes, the first attempt at data migration fails, demanding rollback and another try.

Data migration is often viewed as a necessary evil rather than a value-adding process. And this seems to be the root cause of many if not all difficulties. Treating migration as an important innovation project worthy of special focus is half the battle won.

Originally published at AltexSoft tech blog: “Data Migration: Process, Types, and Golden Rules to Follow.”
