Wednesday, 4 January 2012

Understanding, Improving & Controlling the Data Landscape - Part 2

Last time around I outlined the core components that make up the Understanding stage. Now that we have a good understanding of the landscape - who is using data and how it is being consumed, how the data is derived (including source-to-target mappings), and how it is structured - we can look at what should be considered during the Improving stage.

The Improving stage consists of a number of core components that when addressed will help towards creating an environment with greater control, confidence and clarity of data.


Data Ownership

Often a contentious subject in many organisations, but required under a number of Regulatory Compliance acts, data should be owned to aid accountability and responsibility. People are often hesitant or defensive when asked to take ownership, and one reason is that data can be touched at many points along the journey from source to target. Would you like to take ownership of a data item, or a report, if you were not directly responsible for, or had little knowledge of, how the data was derived? The Understanding phase provided an insight into how data is derived and who is consuming it, which is valuable information when assigning ownership to data, reports & systems.

Data Quality

Both profiling the data and understanding its source-to-target mapping will have exposed potential areas to address. These could range from data being incorrectly included in or excluded from key reports, to data not conforming to pre-defined business rules, to missing or incomplete data items. By working with both technical and business users, we can improve these areas so that data is delivered to agreed and strictly defined standards.
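To make the idea of checking data against pre-defined business rules a little more concrete, here is a minimal sketch in Python. The records, field names and rules are all hypothetical and invented purely for illustration; real rules would be agreed with the business users.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Rule:
    name: str
    check: Callable[[dict], bool]  # returns True when the record passes


# Hypothetical customer records, invented for illustration.
records = [
    {"customer_id": 1, "postcode": "AB1 2CD", "balance": 120.0},
    {"customer_id": 2, "postcode": "", "balance": 55.5},
    {"customer_id": 3, "postcode": "EF3 4GH", "balance": -10.0},
]

# Hypothetical business rules: one completeness rule, one validity rule.
rules = [
    Rule("postcode is populated", lambda r: bool(r.get("postcode"))),
    Rule(
        "balance is not negative",
        lambda r: r.get("balance") is not None and r["balance"] >= 0,
    ),
]


def find_failures(records, rules):
    """Return (customer_id, rule name) for every rule a record fails."""
    return [
        (record["customer_id"], rule.name)
        for record in records
        for rule in rules
        if not rule.check(record)
    ]


failures = find_failures(records, rules)
for customer_id, rule_name in failures:
    print(f"customer {customer_id} failed: {rule_name}")
```

Keeping each rule as a named, separate check means the list of failures can be read by business users as well as technical ones.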

Data Definitions

It may become apparent that different users are using different definitions when talking about the same data item, or the same report. This is common within organisations where the consumers are from different departments and potentially in different locations. It may be because the data or report was never assigned a definition, or because a published definition has become outdated following a business rule change that wasn't communicated to one of the consuming parties. Consistent and accurate definitions will improve understanding of the context of data, and ensure that the correct data items, reports and systems are used in critical information production.

There are many ways in which to facilitate improvement within the key areas highlighted above, examples of which will be discussed in future posts. Once Improvement has been implemented, the focus is then on ensuring that the improvements remain in place. Methods of ensuring that this happens will be discussed in the next post, looking at the Controlling Stage.

Thursday, 29 December 2011

Understanding, Improving & Controlling the Data Landscape - Part 1

I have been involved in a number of different data related projects where the business problem can be summarised in a similar way.

"We have poor quality and understanding of data within our organisation/a key business process, and we require confidence in the data we are producing"

Within these types of projects the approach has been fairly standardised. The first step has been to gather an Understanding of the current data landscape. Once understanding has been achieved, the next step is to go about Improving the landscape. Once improvements have been implemented, the emphasis is on Controlling the improvement to ensure that the measures implemented remain in place, and a successful outcome of the project is met.

Within this post I want to outline the core components of Understanding the data landscape. Each has equal importance and should be seen as a complementing partner to the next. I would consider an exercise where only one component is achieved to be lacking in the aim of providing a complete understanding of the data landscape.

What are the key components within the understanding phase?

Data Usage

How is data currently being consumed within the organisation? Which data warehouse, report or operational systems are currently being used to provide insight into sales, risk or performance? The idea is to identify both which sources are being used as well as who is using them. The aim of this exercise is to allow a high level picture to be developed of systems that are critical to the provision of data within an organisation, as well as the scope of this provision.

Data Mapping

After completing the above exercise we know how data is being used within our organisation, but we may not yet know how data on a report is derived. The aim of the Data Mapping exercise is to understand the lifecycle of data, from its creation at source all the way to its appearance on a report. We need to understand how data flows through systems and what happens to the data at each stage of the journey. For instance, how is it received, what transformation is undertaken, and how is it loaded? Is it subject to any standardisation or additional business rules, and is it aggregated at any stage? The results from this process will provide both a graphical and detailed understanding of data as it passes through the organisation, and how it is touched along the way.
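As a rough illustration of what a source-to-target mapping might look like once captured, the sketch below stores each hop as a simple lookup and walks a report field back to its origin. All system names, field names and transformation notes are hypothetical; a real mapping would usually live in a metadata tool or a spreadsheet rather than in code.

```python
# Each entry: target field -> (upstream field, what happens at this hop).
# Every name here is hypothetical and used only for illustration.
lineage = {
    "sales_report.total_revenue": (
        "warehouse.fact_sales.revenue",
        "aggregated by month",
    ),
    "warehouse.fact_sales.revenue": (
        "staging.orders.amount_gbp",
        "loaded nightly",
    ),
    "staging.orders.amount_gbp": (
        "crm.orders.amount",
        "converted to GBP",
    ),
}


def trace_to_source(field, lineage):
    """Walk upstream from a report field, returning each hop in order."""
    path = []
    seen = set()  # guards against accidental loops in the mapping
    while field in lineage and field not in seen:
        seen.add(field)
        upstream, step = lineage[field]
        path.append((field, upstream, step))
        field = upstream
    return path


path = trace_to_source("sales_report.total_revenue", lineage)
for target, upstream, step in path:
    print(f"{target} <- {upstream} ({step})")
```

The last hop in the returned path points at the originating system, which is often the most useful fact when deciding who should own a data item.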

Data Profiling

Profiling the data will help us to further understand how it is structured and how it adheres to standards, and to identify any potential data quality issues which may impact the accuracy of reporting further down the line. Data Profiling is the technical accompaniment to the Data Usage and Data Mapping exercises highlighted above, and is necessary in order to provide a concise picture of the data landscape.
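A dedicated profiling tool will do far more than this, but a minimal sketch of the kind of summary a profile produces for a single column might look like the following. The column values are made up, and treating both None and empty strings as "missing" is an assumption that would need agreeing for real data.

```python
from collections import Counter


def profile_column(values):
    """Summarise one column: rows, missing values, distinct values, most common value."""
    # Assumption: None and empty strings both count as missing.
    populated = [v for v in values if v not in (None, "")]
    counts = Counter(populated)
    return {
        "rows": len(values),
        "missing": len(values) - len(populated),
        "distinct": len(counts),
        "most_common": counts.most_common(1)[0][0] if counts else None,
    }


# Hypothetical "country" column showing inconsistent coding and gaps.
country = ["UK", "UK", "United Kingdom", None, "FR", ""]
profile = profile_column(country)
print(profile)
```

Even a summary this small surfaces two typical findings: missing values, and the same thing recorded in more than one way ("UK" and "United Kingdom").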

In future blog posts I will cover methods that can be utilised to present this information to business users, as well as methods to prioritise focus areas which will influence and form the next stage of the project: The Improving Stage.

Wednesday, 5 October 2011

Getting the most out of your audience

Forums, whether called a 'Data Governance Forum', 'Data Stakeholder Forum' or 'Data Quality Forum', are generally a good way to aid communication between your Data Quality Function and the wider business community. Key data stakeholders from throughout an organisation meet periodically to discuss challenges, issues, roadmaps and strategic data alignment.

How to conduct a forum and what the scope of these meetings should be are in themselves items of great debate. However, here are three simple suggestions to improve the chances of success:

  • Don't use the forum to dictate your vision. The focus of the forum should be on discussion, with the aim of co-operation to enable you to realise your vision as a Data Quality leader. If you walk into your first meeting stating that the business are accountable for the poor quality data in the organisation, and they need to start taking ownership and responsibility to improve it, chances are your next forum will be sparsely attended. Instead of dictating strategic direction, learn to understand the business challenges and the history of data use within the organisation. You are not going to realise your vision without the support of the wider community.

  • Don't blind your audience. If your audience (primarily the business) don't understand you, they will feel confused and distance themselves from your forum. Know your audience, identify and understand their needs, paraphrasing in a language they understand. If you start talking about Informatica Mappings, Data Types and Levenshtein Distance, chances are the business attendees will see this as a technology forum and neglect to attend your next meeting.

  • Use the forum as a means of gaining feedback. What's going well? What isn't going well? What could we do better? The idea is to receive feedback and constructive criticism from your stakeholders in order to improve the service that you provide. Often it is only when the business have confidence in your service and abilities to support them in business as usual activities that they will buy in to your strategic data management initiatives.

Wednesday, 28 September 2011

Assessing the Plot

If you were ever to undertake the task of building a house on an empty plot of land there are a number of factors that you would want to consider.

Before you even start to lay the important foundations you would want to undertake a survey of the Ground Quality. This would typically involve a bore hole analysis to assess the type and quality of the ground. Alongside the bore hole analysis you'd want to analyse the landscape of the plot. For example, are there tree roots close to the planned structure that could impact foundations? Does the plot have sufficient access to required services such as water and sewerage?

In the world of construction, if you want to ensure that you are building something which will be successful, have longevity and contain no hidden surprises you would ensure that the ground work is undertaken in advance.

In Data Migrations, CRM Implementations and Data Warehouse projects you often read assumptions in project documents such as 'the data is assumed to be fit for purpose and adhering to the relevant business standards'. How often is this assumption found to be incorrect, leading to delayed project completion or poor user adoption?

An equivalent of the bore hole analysis would be a data profiling exercise that ascertained what the key data items were, what good quality data looks like and how data performs against these expectations.

An equivalent to a landscape analysis would be to ensure that the system architecture can support both current and future demands, that the right people are in place, and that any required change could be easily undertaken.

These items are key components to a successful implementation but all too often they do not get the time that they deserve on the project plan.

Why are we not taking the opportunity during these transformational projects to question the data and architecture that is relied upon to make the project successful? Why do we all too often assume?

Wednesday, 5 May 2010

How are you Executing your Data Quality Strategy?

There has been lots of talk about Data Quality Strategy - the framework and roadmap we will use to implement data quality and governance measures within an organisation - and the criteria that define success. Within our Data Quality Strategy, we laid down our goals, our objectives and our success criteria. We know where we want to be, and we know how to judge whether or not we have succeeded in getting there.

But how do we get there?

Defining the strategy is only the first step. The hard work is in its execution. Many well devised and well meaning strategies come undone due to poor execution. I recently watched a great short podcast from London Business School about Strategy Execution, or as they put it 'getting things done'. The professor, Donald Sull, suggested that Strategy Execution can be broadly placed into three buckets:
  • Executing by Power
  • Executing by Process
  • Executing by Promise
Let's take a quick look at an example of each.

Executing by Power

Our Data Quality Strategy states the need to profile enterprise data, and through the utilisation of our data quality profiling tool we will identify rogue elements and improve the quality of the data.

Meanwhile, the issue of poor quality customer address data has been raised to a C-Level audience, with the impacts of poor data quality known to be costing the company around £250,000 per annum in returned mail processing.

Now that the C-Level audience is aware of the issue, and the cost to the business, they are extremely keen to see the data issues resolved. They see it as the responsibility of your team, and deem it to be your number 1 priority to resolve.

Executing by Process

Our Data Quality Strategy states the need to ensure that all data sources have owners, that we have understanding of Data Lineage, and that Data Retention Policy is agreed and adhered to.

Meanwhile, there are a number of Legal Regulations that impact our organisation, and we must ensure continual compliance. As part of this process there is a need to ensure that the flow of data is understood throughout the journey from system to system, and that ownership and single points of contact are in place.

By aligning our strategic objectives such as those mentioned above to the process of achieving Regulatory Compliance we can ensure that they are executed effectively.

Executing by Promise

Our Data Quality Strategy states that we will create a scorecard to measure the quality of data within our organisation. We will then embark upon a continuous improvement exercise, focussing primarily on the key 'weak' areas as identified by our measurements.
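As a simple sketch of how such a scorecard could highlight the 'weak' areas, the example below turns pass counts into percentages and lists the weakest area first. The area names and figures are entirely hypothetical, and a plain pass rate is only one of many possible measures.

```python
def build_scorecard(results):
    """Turn {area: (records passed, records checked)} into (area, pass %) pairs, weakest first."""
    scores = [
        (area, round(100 * passed / checked, 1))
        for area, (passed, checked) in results.items()
        if checked > 0  # skip areas with nothing measured yet
    ]
    return sorted(scores, key=lambda item: item[1])


# Hypothetical measurement results, invented for illustration.
results = {
    "customer address": (7200, 10000),
    "product codes": (9850, 10000),
    "account status": (9100, 10000),
}

scorecard = build_scorecard(results)
for area, pass_rate in scorecard:
    print(f"{area}: {pass_rate}%")
```

Sorting weakest first puts the areas most in need of continuous improvement at the top of the list.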

Meanwhile, the business community have a number of data quality issues that they need assistance with. They don't know the true cost of the issues so haven't been able to escalate them to a C-Level audience. They are, however, keen to have all issues documented and resolved.

At this stage, we know what we'd like - good quality, and fit for purpose data - but we're not sure exactly how we're going to get it. By making a commitment to the business to improve quality of data, we are handing ourselves accountability to get the task completed. This promise allows us to be flexible in our approach but still maintain high standards of service & delivery to aid our reputation.

Which is the best method?

I'd suggest that any Data Quality Strategy will utilise a number of different execution approaches in order to achieve strategic objectives.

As well as having its strengths, each execution approach has its own weaknesses, be it the potential of becoming siloed, the lack of flexibility, or even the potential to dampen people's initiative. Your task is to select the best method of execution for each objective within your strategic framework.

In essence, ask yourself: How am I going to successfully implement initiatives stated within our strategy?

Thursday, 22 April 2010

How do you identify your strategic data?

As I mentioned towards the end of my last blog post, many organisations face "information overload". Over time so much data has been collated, some of which is no doubt duplicated and used by different people in different ways. It may be stored on disparate systems around the organisation, and it may be touched by numerous systems or processes on its journey from source system to the report on your desktop.

Firstly, how do we separate the data that is "nice to have" from the data that is "critical" to our business? Secondly, how do we govern this critical data accordingly, as a valuable business asset?

Think Strategically

To separate the "nice to have" from the "critical" we need to understand the aims & objectives of the business that we are serving. We need to align our strategy for data provision with the wider corporate strategy and attempt to understand exactly what data will be required to support and achieve the business goals.

If a strategic goal of the business is to enter a new geographical market, do we currently have the data within our organisation that will support the marketing, promotion and launching of a product in this market? If not, what data do we need to start collecting, and from where?

How do we know what we need?

Work with data stewards and subject matter experts within your business community. This network of enthusiastic and knowledgeable individuals was chosen specifically because of their knowledge and influence within the business. Utilise them. If you don't have a network of Data Stewards and Subject Matter Experts within your organisation, it would be a worthwhile exercise to identify and involve them.

Create and sustain a 'Strategic Data Forum' that meets on a regular basis to discuss how strategic business goals can be supported by data. Again, we need to have a clear understanding of what data we currently have within the organisation, identify any gaps that need to be filled, and ensure a roadmap exists so that we can track activities and objectives.

There are also a number of other challenges that should be addressed within the forum, for instance:
  • How are data sources communicated to the business community?
  • How will our current technical infrastructure be able to cope with any additional data requirements?
  • Is strategic data currently produced by a 'cottage industry' on an unreliable server - do we see that this needs to move to a more strategic platform for business continuity, scalability & increased governance?

This is it(erative)

Like any strategic exercise, this isn't a one-off. This is an extremely iterative process. It will grow and adapt based upon the strategic goals of the business. You may find that what is deemed 'strategic data' will be re-defined further down the line, but the important thing is that you have the people and the process in place to manage change.

Monday, 19 April 2010

The importance of research before bringing data to market

When we talk about 'bringing data to market' we are talking about the process of taking an idea, or a request, and turning it into a stable, trusted source of data that can be utilised by the business community in order to aid decisions, and support the strategic goals of the business.

If 'bringing data to market' was treated in a similar way to how you would bring a product to market, we would see data created that has been:
  • clearly defined and aligned to strategic goals
  • fully tested and quality assured
  • documented ready for use
Let's think for a minute about one of the most important steps undertaken prior to launching a new product. Research. In product terms, once an idea has been generated the product team would undertake detailed research. This research will provide good knowledge and enable a strong vision of how to take the product forward towards a successful launch.

It starts with an idea, or a request for data.

A member of the business community may request that "We need data on X in a format such as an OLAP cube, so that we can slice/dice the data".

The idea, or request for data is then benchmarked against a number of research tasks.

1. Why do the business need X?
  • What is driving the business requirement for the data?
  • How does this align to the wider business strategy?
  • Do they already have X but are unaware that it exists?
2. Who will use X?
  • Will X only be used by the requester?
  • Does X have the potential to be utilised across other business areas?
  • Do the potential users of X have the correct systems access/skillset to utilise the data?
3. Does similar data already exist?
  • Will Y meet the needs of the requester of X?
  • Why was Y not previously considered by the requester?
  • Do they actually want Y, but in a slightly different format?
This research will ensure that good knowledge exists of the requirement, and how it fits into the strategic goals and direction of the business. It will also help us to understand the potential user base of the data, and whether they have the required skills to utilise the data, or whether further user training will be required.

Finally, it will also help to reduce a problem that occurs in many organisations. Business users are continually requesting data from BI, and in many cases the data already exists and is utilised by other parts of the business, but for whatever reason (poor communication? poor 'data launch'?) the requester was not aware of the existing data. In less governed and communicative organisations this can result in multiple data sources that essentially provide the same data. There is a risk that these data sources may not be used in the same way, or may not even reconcile with each other, because of different criteria imposed during their creation.
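To illustrate what "not reconciling" can look like in practice, here is a small sketch that compares two extracts which are supposed to hold the same figures. The two sources, their keys and their values are hypothetical, and a real reconciliation would normally also need agreed tolerances and matching rules.

```python
def reconcile(source_a, source_b):
    """Compare two {key: value} extracts and report where they disagree."""
    keys_a, keys_b = set(source_a), set(source_b)
    return {
        "only_in_a": sorted(keys_a - keys_b),
        "only_in_b": sorted(keys_b - keys_a),
        "mismatched": sorted(
            key for key in keys_a & keys_b if source_a[key] != source_b[key]
        ),
    }


# Hypothetical sales totals by region from two separately built sources.
finance_extract = {"north": 1200, "south": 950, "east": 400}
marketing_extract = {"north": 1200, "south": 900, "west": 310}

differences = reconcile(finance_extract, marketing_extract)
print(differences)
```

Each kind of difference prompts a different question: missing keys suggest different inclusion criteria, while mismatched values suggest different business rules or timing.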

Do your research

Detailed research and knowledge, combined with a strong strategic vision will ensure that all data that you 'bring to market' will be a valuable enterprise asset. It will help prioritise business requests, and place context onto data in terms of which strategic goals the data is supporting. In an age where we often hear the words "information overload" it is crucially important that we don't lose sight of the bigger picture, or the business goals, that we are supporting through data.