Data Migration Part 1 – Know Your Data

If you followed a link to this post, chances are you are planning to replace an obsolete IT system with a robust new system, and you will need to migrate data from the old to the new. No doubt, there is excitement for the new system. Perhaps it is built by an independent system vendor with more resources than were devoted to your current system. Perhaps it is based on modern architecture and designed to meet existing and foreseeable requirements. Perhaps the issues with your current system are so dire that almost anything would be better. However, your success with the new system will depend largely on how you migrate data to it.

Since joining ArgonDigital four years ago, I have worked on three very large projects to help customers replace legacy systems: a digital rights system, a financial system, and an asset management system. Each project included a heavy component of data migration from legacy to new systems. I want to share things I’ve learned about the subject, in hopes that you can apply these lessons. I would love to start conversations about data migration with anyone reading this post. You no doubt have stories and experiences I haven’t seen. I may have perspectives that help you understand what you’re seeing. Let’s grow together.

Knowing your Data

A great solution architect on one of my projects had a key insight that drove every action he undertook. He said, 

If you manage data well, you can tackle any project, even if scope is enormous.

Of course, he said, the reverse is also true: if you do not manage data well, you will struggle with even the most scope-limited project, whether it’s an all-new system or a like-for-like replacement! Let’s discuss how you can get clear about data, and how to assess the challenge in front of you.

This high-level process flow illustrates how the successful teams I’ve been privileged to work with approached “knowing their data”:

Data Migrations: Know your Data Flow

I’ll discuss each step, in turn, with stories about the challenges I’ve observed and the approaches these teams have taken.

1.0 Create a BDD

It is common in replacement projects for people who know a current system to jump right into a deep dive on how the legacy system stores data and start ticking off everything they need to move to the new system. We have found it invaluable to start instead with a high-level walkthrough of the data as business leaders and subject matter experts (SMEs) see it, rather than as your IT systems store it. A visual model that we call the Business Data Diagram (BDD) shows the relationships between the different data objects that your business manages. It may reveal surprising gaps in the current state and identify challenging future-state work early.

Figure: Sample BDD structure

In a recent project, for example, walking business SMEs through a BDD-building exercise revealed that their partner ecosystem was far more complex than would be evident from analyzing their current system’s schema. There were distinct categories of partner that current systems treated as the same, but that the business needed to manage differently. Moreover, the BDD exercise revealed major data governance issues, such as:

  • Much of the company’s partner data was not hosted in any system, and data that was kept in systems was not mastered for accuracy and consistency.
  • Data kept outside of systems was not maintained according to company PII policies.
  • Users were frequently forced to re-type partner data into unmanaged text fields, adding work to an already overloaded staff and creating business risk.
  • Partner names and other attributes entered in text fields were often spelled differently (“James Smith”, “James A. Smith”, “James A Smith”, “Smith, James A.”), forcing users to call on database administrators for routine searches like “show me all assets from Jim.”
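
To make the spelling problem concrete, here is a minimal Python sketch of how a cleansing effort might group name variants like the ones above. It is an illustration only, not the approach taken on this project: the `name_key` and `group_variants` helpers are hypothetical, and the rule of keeping just the first and last name is an assumption that would produce false matches on real data.

```python
import re
from collections import defaultdict


def name_key(raw: str) -> str:
    """Reduce a free-text person name to a rough matching key (assumed rule)."""
    text = raw.strip()
    # Reorder "Smith, James A." into "James A. Smith".
    if "," in text:
        last, _, rest = text.partition(",")
        text = f"{rest.strip()} {last.strip()}"
    # Drop punctuation, lowercase, and split into tokens.
    tokens = re.sub(r"[^\w\s]", "", text).lower().split()
    # Drop single-letter tokens, which are usually middle initials.
    tokens = [t for t in tokens if len(t) > 1]
    if not tokens:
        return ""
    # Keep only the first and last token as the key.
    return f"{tokens[0]} {tokens[-1]}" if len(tokens) > 1 else tokens[0]


def group_variants(names):
    """Group raw names that share the same matching key."""
    groups = defaultdict(list)
    for name in names:
        groups[name_key(name)].append(name)
    return dict(groups)


variants = ["James Smith", "James A. Smith", "James A Smith", "Smith, James A."]
for key, members in group_variants(variants).items():
    print(key, "<-", members)
```

A key like this only catches formatting differences. A search for “Jim” would still need a nickname list or a properly mastered Partner record, which is the governance point the exercise surfaced.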

This had major implications for requirements: what data the company should keep about each type of partner, who should have access to partner data, and what level of migration effort would be required. The BDD exercise helped the team plan for a major investment of time where we had assumed a very modest commitment would suffice, avoiding the schedule slips that would have followed when data cleansing efforts inevitably fell behind optimistic estimates.

See our resources on developing Business Data Diagrams here. The process we followed in this case was:

We started by listing the key Business Data Objects (BDOs) involved.

  • The central object is Partner. Partners are legal entities like corporations or sole proprietorships. There are different types, and a partner may work directly with us or through another partner.
  • Partners have Contacts – people we interact with at the Partner.
  • The company does business with Partners on specific Projects.
  • Through a Project, partners create Assets for the company.
  • Agreements govern activity (plans, deliverables, payments) with a Partner, on Projects.
  • Note: we weren’t discussing current-state tables, but rather the things the business talks about, designs processes around, sets KPIs on, and reports on.

We next discussed the relationships between the different BDOs:

  • A Contact works for (connects to) a Partner.
  • An Agreement is signed by (connects to) a Partner.
  • A Project involves a Partner and is governed by an Agreement.
  • An Asset is a child of a Project and is licensed according to an Agreement.

Next, the business team described how BDOs should relate to one another based on how they operated (or wanted to operate) the business:

  • A Partner must have at least one Contact.
  • An Agreement must have exactly one Partner. (Other partners may be involved with the work, but are managed indirectly through a primary partner, or sign a separate Agreement.)
  • An Asset must have at least one Agreement but may have more.
  • And so on…
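
Rules like these can be turned into automated checks that run against staged data before it is loaded. The Python sketch below is one assumed way to do that, not the tooling from the project: the `Partner`, `Agreement`, and `Asset` classes, their field names, and `check_rules` are all hypothetical, and only the three cardinality rules listed above are covered.

```python
from dataclasses import dataclass, field


@dataclass
class Partner:
    name: str
    contacts: list = field(default_factory=list)  # contact names


@dataclass
class Agreement:
    agreement_id: str
    partners: list = field(default_factory=list)  # partner names


@dataclass
class Asset:
    asset_id: str
    agreement_ids: list = field(default_factory=list)


def check_rules(partners, agreements, assets):
    """Return a list of human-readable business rule violations."""
    violations = []
    for partner in partners:
        # Rule: a Partner must have at least one Contact.
        if len(partner.contacts) < 1:
            violations.append(f"Partner {partner.name!r} has no Contact")
    for agreement in agreements:
        # Rule: an Agreement must have exactly one Partner.
        if len(agreement.partners) != 1:
            violations.append(
                f"Agreement {agreement.agreement_id!r} has "
                f"{len(agreement.partners)} Partners, expected exactly 1"
            )
    for asset in assets:
        # Rule: an Asset must have at least one Agreement.
        if len(asset.agreement_ids) < 1:
            violations.append(f"Asset {asset.asset_id!r} has no Agreement")
    return violations


# Illustrative records only; none of these come from a real system.
sample_partners = [Partner("Acme", ["Ann"]), Partner("Solo Studio")]
sample_agreements = [Agreement("AG-1", ["Acme"]), Agreement("AG-2", ["Acme", "Solo Studio"])]
sample_assets = [Asset("AS-1", ["AG-1"]), Asset("AS-2")]
for problem in check_rules(sample_partners, sample_agreements, sample_assets):
    print(problem)
```

Returning a list of violations, rather than stopping at the first failure, lets the team size the cleansing work for each rule before committing to a migration schedule.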

This process allowed us to capture requirements and business rules that we could implement in migration work and in software. It provided a different lens on how to configure off-the-shelf software or support in-system interlocks. It directed the team’s focus to bigger-picture questions, such as how to establish a single-source-of-truth system for each BDO and how to provide better governance of data in the new systems.

2.0 Prioritize by Data Object

With our BDDs drawn and reviewed, we had a common understanding of data the way the business sees it. We could then turn to assessing data as stored in system tables and fields. Depending on the complexity of legacy and new systems, it is often critical to prioritize this assessment work.

  • On one project, the incoming system was built by an external vendor for a broad range of uses. It boasted some 30-40 major data objects, and 1700 tables with over 10,000 fields!
  • In that system, the Customer object alone had a main table, several support tables, and 300 fields in its default configuration. It would be deployed into an ecosystem with several other systems that tracked customer data. It took months of review and discussion to define which fields we would use, and where the data would come from – for just one data object in the BDD.
  • On another project, there was a Product data object that appeared relatively straightforward with just 15-20 fields. However, the assessment revealed that the source data for those fields would need to come from 6-7 different current state systems. Each different system had its own unique identifier for a product record and tracked the identifier(s) of only 1-2 of the other systems.
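
The Product example shows why partial cross-references are painful: no single system knows every identifier for a record. One way to reason about it, sketched below in Python, is to treat each known cross-reference as a link and group identifiers that are connected through any chain of links (a union-find structure). This is an assumed technique for illustration, not what the project team built; the system names and IDs are made up.

```python
from collections import defaultdict


def link_identifiers(pairs):
    """Group (system, id) identifiers connected by cross-references.

    `pairs` is an iterable of two-item tuples, each item being a
    (system_name, record_id) tuple known to refer to the same product.
    """
    parent = {}

    def find(node):
        # Register unseen nodes as their own root, then walk to the root.
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    for left, right in pairs:
        root_left, root_right = find(left), find(right)
        if root_left != root_right:
            parent[root_right] = root_left

    clusters = defaultdict(set)
    for node in list(parent):
        clusters[find(node)].add(node)
    return list(clusters.values())


# Hypothetical cross-references: each system knows only one neighbor's ID.
known_links = [
    (("ERP", "P-100"), ("Catalog", "778")),
    (("Catalog", "778"), ("Billing", "B-9")),
    (("Warehouse", "W-5"), ("Rights", "R-42")),
]
for cluster in link_identifiers(known_links):
    print(sorted(cluster))
```

Identifiers that end up in the same cluster are candidates for one Product record in the target system. Clusters that stay unexpectedly small are a hint that a cross-reference is missing and an SME needs to weigh in.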

It is therefore vital that you assess the size and complexity of migration work for each major business data object up front. Agree as a project team on the order in which you will assess the objects in your BDD. (This is especially important if the same SMEs will help define multiple BDOs.) Align on this priority with business stakeholders and set expectations for SME involvement in the assessment.
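
As a starting point for that conversation, a team could rank objects with a crude effort proxy. The Python sketch below multiplies field count by the number of source systems. Both the formula and the numbers are assumptions for illustration (the source-system count for Customer, in particular, is invented), and a real assessment would also weigh data quality and SME availability.

```python
from dataclasses import dataclass


@dataclass
class ObjectAssessment:
    name: str
    field_count: int      # fields to map in the target system
    source_systems: int   # legacy systems that feed this object

    @property
    def effort_score(self) -> int:
        # Crude proxy (assumption): more fields and more sources mean more work.
        return self.field_count * self.source_systems


def prioritize(assessments):
    """Order data objects from highest to lowest estimated effort."""
    return sorted(assessments, key=lambda a: a.effort_score, reverse=True)


# Illustrative figures only.
backlog = [
    ObjectAssessment("Product", field_count=20, source_systems=7),
    ObjectAssessment("Customer", field_count=300, source_systems=4),
    ObjectAssessment("Contact", field_count=12, source_systems=1),
]
for item in prioritize(backlog):
    print(item.name, item.effort_score)
```

The score is only a conversation starter; the final order should still be agreed with business stakeholders.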

We’ll pick up the conversation in my next post with Step 3.0: By Data Object, Map in Source & Target Systems.
