Thursday, 21 January 2010

The essentials of data mapping

Understanding where data comes from - which systems it goes through, which business rules have been applied and at which point in its journey - is critical to improving and maintaining data quality and governance. In large organisations, data often passes through multiple systems before it reaches a data mart used by the business.

I've seen a rise in bottom-up data mapping exercises, particularly within data-intense departmental processes. The thoughts below discuss this type of exercise.

"You don't know where you're going until you know where you've been" (Unknown)

In order to truly understand data, and whether it is fit for our purpose, we need to understand where that data has come from: not only the system it has come from, but also what has happened to it along the way.

"The cause is hidden, but the result is known" (Ovid)

When the result is a data quality issue, we seek to identify the root cause of this issue. This can often be a time-consuming process. However, by understanding where our data has been, and what has happened at each point of its journey, we speed up the analysis required to identify the root cause.
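To make this concrete, here is a minimal sketch of how a documented data journey can shorten root cause analysis. Everything in it is an assumption for illustration: the system names, rules and owners are hypothetical, and `first_bad_hop` is simply a name I have made up for walking the journey from source to data mart until a check fails.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Hop:
    system: str  # system the data passes through (hypothetical names below)
    rule: str    # business rule applied at this point in the journey
    owner: str   # who is responsible for the data at this point


# Hypothetical journey, ordered from source system to data mart.
JOURNEY: List[Hop] = [
    Hop("mainframe", "raw ledger extract", "Finance Ops"),
    Hop("staging_db", "currency converted to GBP", "Warehouse Team"),
    Hop("finance_mart", "aggregated by month", "Financial MI"),
]


def first_bad_hop(journey: List[Hop], is_ok: Callable[[Hop], bool]) -> Optional[Hop]:
    """Return the earliest hop whose output fails the check, or None if all pass."""
    for hop in journey:
        if not is_ok(hop):
            return hop
    return None


# Pretend results of a reconciliation check run at each system (made-up values).
check_results = {"mainframe": True, "staging_db": False, "finance_mart": False}

culprit = first_bad_hop(JOURNEY, lambda hop: check_results[hop.system])
if culprit is not None:
    print(f"Start with {culprit.system} ({culprit.rule}), owned by {culprit.owner}")
```

The point is not the code itself but the shape of the record: once each hop lists its system, rule and owner, the search for a root cause becomes a walk along a known path rather than a guess.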

"The landscape should belong to the people who see it all the time" (Le Roi Jones)

By understanding where data has been, we gain knowledge of the systems it has passed through and the points where it has been 'touched'. The owners of these systems (or processes) should take ownership of, and responsibility for, the data until they pass it along to the next system.

"I'm tired of chasing people" (Robert F Kennedy)

By defining owners, we can reduce the amount of time spent chasing for answers. In many organisations there is a common situation where we have a data issue but don't know who to contact about it. We call a help desk and our issue gets passed around to five different people. A week later it's still not resolved.

A data mapping exercise can help improve (proactive) communication in two ways. Firstly, end users will understand where their data has come from, so they will know who to contact directly if an issue occurs - the owner we discussed previously. Secondly, if the owner knows who they are passing data to, they can alert that person to any known issues before passing the data on.
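As a rough illustration of those two directions of communication, the sketch below keeps a small map of systems, owners and the system each one feeds. The names in `DATA_MAP` and the helper functions are hypothetical, not taken from any real tool.

```python
from typing import List, Optional

# Hypothetical data map: each system, its owner, and the system it feeds.
DATA_MAP = {
    "mainframe": {"owner": "Finance Ops", "feeds": "staging_db"},
    "staging_db": {"owner": "Warehouse Team", "feeds": "finance_mart"},
    "finance_mart": {"owner": "Financial MI", "feeds": None},
}


def owner_of(system: str) -> str:
    """Who is responsible for the data while it sits in this system."""
    return DATA_MAP[system]["owner"]


def downstream_owner(system: str) -> Optional[str]:
    """Who the owner of `system` should alert about known issues, if anyone."""
    next_system = DATA_MAP[system]["feeds"]
    return owner_of(next_system) if next_system is not None else None


def upstream_owners(system: str) -> List[str]:
    """Owners of every system the data passed through before `system`, source first."""
    # Invert the 'feeds' links so we can walk back towards the source.
    fed_by = {v["feeds"]: k for k, v in DATA_MAP.items() if v["feeds"] is not None}
    chain = []
    current = fed_by.get(system)
    while current is not None:
        chain.append(owner_of(current))
        current = fed_by.get(current)
    return list(reversed(chain))


# An end user of the mart can see who to contact; an owner can see who to warn.
contacts = upstream_owners("finance_mart")
to_alert = downstream_owner("mainframe")
```

Even a map this small answers both questions: who do I call when the number looks wrong, and who do I warn before I pass the data on.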


Anonymous said...


Excellent post.

We data professionals know that in many cases, the mapping from source to target is not fully understood or documented (mapped).

To date, to my knowledge, regulators have not sought details of how information provided to them is gathered (i.e. they have not audited "the essentials of data mapping" that you refer to).

I believe this will change dramatically in the near future. Regulators will audit data governance processes, including the process of data mapping. Regulators may be shocked, and the public at large will definitely be shocked, to discover what we data professionals know is a fact of life, namely:

Organisations providing information to a regulator often do not know or understand the journey the underlying data has taken.

I will be presenting on “Achieving Regulatory Compliance – the devil is in the data” at an IDQ Seminar Series event in Dublin next month (Feb 22 2009).

I have requested ideas to help me prepare on my blog.

With your permission, I may quote some of your material in my presentation - naturally giving credit to you.

Rgds Ken

Phil Wright said...


The compliance angle is interesting, glad you brought it up. I agree with your predictions.

One example I can give of executive-level support for a data mapping exercise was in fact driven by compliance. Financial MI folk were giving X as a number, yet the mainframe said Y. No one could give an instant answer as to which systems the data had passed through, and with the auditors soon visiting, this was information that the execs were keen to have at hand.

Please feel free to quote any material you wish. Thank you.

Phil Simon said...


Good post. I agree that understanding the path that the data took to get to a certain point is valuable. I think that the long-term solution is to make systems less complicated. Perhaps SOA and other Enterprise 2.0 technologies will allow organizations to streamline their operations so data need not go through ten different steps to arrive at its destination point--be it a datamart, stand-alone database, or some other final destination.

I realize that this is much easier said than done. Perhaps data mapping exercises will help organizations understand that they have made things way too complicated? To this end, auditors can ask simple questions and broach very legitimate concerns.

Phil Wright said...

Good points, Phil. Many organisations have made things too complicated. That said, the pressures and timescales of events such as mergers & acquisitions probably don't help matters here, and in some cases were probably the cause.
