Wednesday, 5 May 2010

How are you Executing your Data Quality Strategy?

There has been lots of talk about Data Quality Strategy - the framework and roadmap we will use to implement data quality and governance measures within an organisation - and the criteria that define success. Within our Data Quality Strategy, we laid down our goals, our objectives and our success criteria. We know where we want to be, and we know how to judge whether or not we have succeeded in getting there.

But how do we get there?

Defining the strategy is only the first step. The hard work is in its execution. Many well-devised and well-meaning strategies come undone due to poor execution. I recently watched a great short podcast from London Business School about Strategy Execution, or as they put it 'getting things done'. The professor, Donald Sull, suggested that Strategy Execution can be broadly placed into three buckets:
  • Executing by Power
  • Executing by Process
  • Executing by Promise
Let's take a quick look at an example of each.

Executing by Power

Our Data Quality Strategy states the need to profile enterprise data, and through the utilisation of our data quality profiling tool we will identify rogue elements and improve the quality of the data.

Meanwhile, the issue of poor quality customer address data has been raised to a C-Level audience, with the impacts of poor data quality known to be costing the company around £250,000 per annum in returned mail processing.

Now that the C-Level audience is aware of the issue, and the cost to the business, they are extremely keen to see the data issues resolved. They see it as the responsibility of your team, and deem it to be your number 1 priority to resolve.

Executing by Process

Our Data Quality Strategy states the need to ensure that all data sources have owners, that we have an understanding of Data Lineage, and that a Data Retention Policy is agreed and adhered to.

Meanwhile, there are a number of Legal Regulations that impact our organisation, and we must ensure continual compliance. As part of this process there is a need to ensure that the flow of data is understood throughout the journey from system to system, and that ownership and single points of contact are in place.

By aligning our strategic objectives such as those mentioned above to the process of achieving Regulatory Compliance we can ensure that they are executed effectively.

Executing by Promise

Our Data Quality Strategy states that we will create a scorecard to measure the quality of data within our organisation. We will then embark upon a continuous improvement exercise, focussing primarily on the key 'weak' areas as identified by our measurements.

Meanwhile, the business community have a number of data quality issues that they need assistance with. They don't know the true cost of the issues, so they haven't been able to escalate them to a C-Level audience. They are, however, keen to have all issues documented and resolved.

At this stage, we know what we'd like - good quality, fit for purpose data - but we're not sure exactly how we're going to get it. By making a commitment to the business to improve the quality of data, we are handing ourselves accountability for getting the task completed. This promise allows us to be flexible in our approach but still maintain high standards of service & delivery to aid our reputation.

Which is the best method?

I'd suggest that any Data Quality Strategy will utilise a number of different execution approaches in order to achieve strategic objectives.

As well as having its strengths, each execution approach has its own weaknesses, be it the potential of becoming siloed, the lack of flexibility, or even the potential to dampen people's initiative. Your task is to select the best method of execution for each objective within your strategic framework.

In essence, ask yourself: How am I going to successfully implement initiatives stated within our strategy?

Thursday, 22 April 2010

How do you identify your strategic data?

As I mentioned towards the end of my last blog post, many organisations face "information overload". Over time there has been so much data collated, some of which is no doubt duplicated and used by different people in different ways. It may be stored on disparate systems around the organisation, and it may be touched by numerous systems or processes on its journey from source system to the report on your desktop.

Firstly, how do we separate the data that is "nice to have" from the data that is "critical" to our business? Secondly, how do we govern this critical data accordingly, as a valuable business asset?

Think Strategically

To separate the "nice to have" from the "critical" we need to understand the aims & objectives of the business that we are serving. We need to align our strategy for data provision with the wider corporate strategy and attempt to understand exactly what data will be required to support and achieve the business goals.

If a strategic goal of the business is to enter a new geographical market, do we currently have the data within our organisation that will support the marketing, promotion and launching of a product in this market? If not, what data do we need to start collecting, and from where?

How do we know what we need?

Work with data stewards and subject matter experts within your business community. This network of enthusiastic and knowledgeable individuals was chosen especially because of their knowledge and influence within the business. Utilise them. If you don't have a network of Data Stewards and Subject Matter Experts within your organisation, it would be a worthwhile exercise to identify and involve them.

Create and sustain a 'Strategic Data Forum' that meets on a regular basis to discuss how strategic business goals can be supported by data. Again, we need to have a clear understanding of what data we currently have within the organisation, identify any gaps that need to be filled, and ensure a roadmap exists so that we can track activities and objectives.

There are also a number of other challenges that should be addressed within the forum, for instance:
  • How are data sources communicated to the business community?
  • How will our current technical infrastructure be able to cope with any additional data requirements?
  • Is strategic data currently produced by a 'cottage industry' on an unreliable server - do we see that this needs to move to a more strategic platform for business continuity, scalability & increased governance?
This is it(erative)

Like any strategic exercise, this isn't a one off. This is an extremely iterative process. It will grow and adapt based upon the strategic goals of the business. You may find that what is deemed as 'strategic data' will be re-defined further down the line, but the important thing is that you have the people, and the process in place to manage change.

Monday, 19 April 2010

The importance of research before bringing data to market

When we talk about 'bringing data to market' we are talking about the process of taking an idea, or a request, and turning it into a stable, trusted source of data that can be utilised by the business community in order to aid decisions, and support the strategic goals of the business.

If 'bringing data to market' was treated in a similar way to how you would bring a product to market, we would see data created that has been:
  • clearly defined and aligned to strategic goals
  • fully tested and quality assured
  • documented ready for use
Let's think for a minute about one of the most important steps undertaken prior to launching a new product. Research. In product terms, once an idea has been generated the product team would undertake detailed research. This research will provide good knowledge and enable a strong vision of how to take the product forward towards a successful launch.

It starts with an idea, or a request for data.

A member of the business community may request that "We need data on X in a format such as an OLAP cube, so that we can slice/dice the data".

The idea, or request for data is then benchmarked against a number of research tasks.

1. Why do the business need X?
  • What is driving the business requirement for the data?
  • How does this align to the wider business strategy?
  • Do they already have X but are unaware that it exists?
2. Who will use X?
  • Will X only be used by the requester?
  • Does X have the potential to be utilised across other business areas?
  • Do the potential users of X have the correct systems access/skillset to utilise the data?
3. Does similar data already exist?
  • Will Y meet the needs of the requester of X?
  • Why was Y not previously considered by the requester?
  • Do they actually want Y, but in a slightly different format?
This research will ensure that good knowledge exists of the requirement, and how it fits into the strategic goals and direction of the business. It will also help us to understand the potential user base of the data, and whether they have the required skills to utilise the data, or whether further user training will be required.

Finally, it will also help to reduce a problem that occurs in many organisations. Business users are continually requesting data from BI, and in many cases the data actually already exists and is utilised by other parts of the business, but for whatever reason (poor communication? poor 'data launch'?) the requester was not aware of the existing data. In less governed and communicative organisations this can result in multiple data sources that essentially provide the same data. There is a risk that these data sources may not be used in the same way, or even reconcile with each other due to potential criteria imposed during their creation.

Do your research

Detailed research and knowledge, combined with a strong strategic vision will ensure that all data that you 'bring to market' will be a valuable enterprise asset. It will help prioritise business requests, and place context onto data in terms of which strategic goals the data is supporting. In an age where we often hear the words "information overload" it is crucially important that we don't lose sight of the bigger picture, or the business goals, that we are supporting through data.

Thursday, 15 April 2010

Utilising the DICE Framework in Data Governance Initiatives

Your Data Quality or Data Governance roadmap will undoubtedly contain a number of enterprise initiatives that you wish to implement. These initiatives will result in change within your organisation, be it a change of organisational culture, a change of process, or even a change of systems.

The Data Quality space is an area where investment may have traditionally been low due to the difficulty of demonstrating ROI, and although smarter executives and business users understand the need for Data Quality Management & Data Governance, unless bitten many executives are still unwilling to invest time & capital in preventative and governing measures.

Due to the above, we may be presented with the challenge of gaining business & executive-level support. The sensible way to build momentum in gaining support is to demonstrate quick wins. Look for business problems in the DQ space that could be solved within a reasonable amount of time, with a manageable level of effort. Add value to the organisation, and you will see the momentum, and support for your efforts grow.

With this in mind it would make sense to attempt to measure the likeliness of implementation success prior to diving into any initiative.

(Naturally, you will never truly be aware of the chances/degree of success until an initiative is underway. However, in-depth planning, analysis & risk mitigation will lead to greater chances/degrees of success.)

We can use tools for this

This is where tools such as DICE come in. The DICE Framework was created by the Boston Consulting Group and can be used to evaluate the likeliness of change management success.

Rather than trying to evaluate likeliness of success by looking at "soft factors" involved in successful change management, such as motivations, leadership & current organisational culture, the DICE framework looks only at what it calls "hard factors" that influence change success.

Duration - either the duration until completion or the time between key milestones
Integrity - ability of the team to successfully perform & complete the project
Commitment - backing from the executive team, and support from the business community
Effort - the amount of effort required above the regular workload of the business

By scoring each of these factors, the framework will allow you to chart the likelihood of success for your change management project, much like the online tool provided by BCG.

Let's look at an example

Let's take the following initiative that features on your Data Quality Roadmap as an example - "Establish a Data Quality Resolution Centre".

The aim of the DQ Resolution Centre is to:
  • provide a single point of contact to raise data quality issues
  • promptly resolve all data issues
  • allow the business to track issue resolution progress
  • provide expertise & education to the business
  • create regular KPI reporting relating to data issues
It has been estimated that it would take around 1 month to create the resolution centre, establish the key processes, identify the key people, communicate the change, and agree KPI reporting. The newly employed 'Data Quality Manager', a person with solid experience in data quality resolution, and a strong grasp of the business will be taking ownership of the change.

In addition, there is support from senior management, although they refused to sign off the supporting 'Data Quality Analyst' role requested by the Data Quality Manager. The business community are extremely supportive, as previously their data issues have become lost in the world of IT. They've longed for someone "on the ground" to support them in their requirement for high quality and meaningful data.

Using the previously mentioned online tool and the information above, we can see that:

Duration: less than 2 months
Team Performance Integrity: Very Good
Commitment (Senior mgmt): Seem to want success
Commitment (Local): Eager
Effort: Less than 10% additional

This suggests that the initiative is likely to be "highly successful".
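As a rough illustration of how such a result could be reached, the sketch below uses the scoring formula published for the DICE framework, D + (2 x I) + (2 x C1) + C2 + E, with each factor scored from 1 (best) to 4 (worst). The mapping of the answers above onto numeric scores, the class and function names, and the handling of a score of exactly 17 are assumptions made for this example; they are not taken from the online tool itself.

```python
from dataclasses import dataclass


@dataclass
class DiceInputs:
    """Each factor is scored from 1 (most favourable) to 4 (least favourable)."""
    duration: int            # D
    integrity: int           # I
    senior_commitment: int   # C1
    local_commitment: int    # C2
    effort: int              # E


def dice_score(x: DiceInputs) -> int:
    """Published DICE formula: D + 2*I + 2*C1 + C2 + E (range 7 to 28)."""
    factors = (x.duration, x.integrity, x.senior_commitment,
               x.local_commitment, x.effort)
    if any(f < 1 or f > 4 for f in factors):
        raise ValueError("each DICE factor must be scored between 1 and 4")
    return (x.duration + 2 * x.integrity + 2 * x.senior_commitment
            + x.local_commitment + x.effort)


def dice_zone(score: int) -> str:
    """Zones as commonly described: 7-14 win, above 14 up to 17 worry, above 17 woe.
    Treating exactly 17 as 'worry' is an assumption of this sketch."""
    if score <= 14:
        return "win"
    if score <= 17:
        return "worry"
    return "woe"


# Assumed mapping of the Resolution Centre answers onto scores:
# duration under 2 months = 1, very good team = 1, senior management
# "seem to want success" = 2, local commitment eager = 1, effort under 10% = 1.
resolution_centre = DiceInputs(duration=1, integrity=1, senior_commitment=2,
                               local_commitment=1, effort=1)
score = dice_score(resolution_centre)
print(score, dice_zone(score))
```

Under these assumed scores the initiative lands comfortably in the most favourable zone, which is consistent with the result described above. Note how the doubled weights mean that a weak team or lukewarm senior sponsorship moves the score faster than a long duration does.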

Whether the Data Quality Resolution Centre is seen as "highly successful" by the business community six months down the line is however a further challenge, and a story for another day.

In Conclusion

By utilising the DICE framework, alongside evaluation of the softer factors that influence change success, you will be more prepared to aid in the selection, planning & prioritisation of change projects. This is particularly important if you are under pressure to succeed or risk having funding reassigned to another project. If 4 items of your roadmap are deemed to be of equal importance to the business perhaps it would be worth tackling an item which seems to have a higher chance of success first?

Tuesday, 6 April 2010

What makes a successful Business Intelligence Leader?

While reading "On the Good Life", a collection of some of the works by the Roman philosopher & statesman, Cicero, I came across a couple of statements that made me think about Business Intelligence leaders. I use the term "Business Intelligence leaders" but this can apply to any leader responsible for data and information management/usage within an organisation, such as BI Heads, Customer Insight Managers, Data Quality Managers and so on.

"A successful statesman, the person who guides the nation and controls its policy, may be defined as an individual who knows and employs the means of securing and promoting the interests of his country."

Let's break this down a bit:

"A successful statesman"

You - the Business Intelligence leader.

"the person who guides the nation"

You, and your team, should be the authority on data, and its associated usage, meaning and quality, within your organisation. You should be able to guide the business in maximising the benefit from data, including ensuring that they are using the correct data sources, and the correct tools for the information exploitation they are performing. The business users should be aware of who to speak to if they have issues/problems, and communication channels should be open, with regular two-way communication undertaken. Listening to the business is essential.

"and controls its policy"

You, and your team, should ensure that all policy is documented, and adhered to. This will include policy relating, but not limited, to such things as:
  • Security & Access
  • Reporting Tool Usage
  • Regulatory Compliance
In addition to creating & maintaining policy, you should also ensure that the business community is aware of policy, through training sessions or communication bulletins.

For example, to ensure that security & access control policy is adhered to, and that the business has access to the data they require, you should communicate the policy and process surrounding systems access effectively. I have seen examples of people sharing system access logins as they either did not understand security & access policy, or the process to request system access was complicated, or slow. Efficient process surrounding policy will aid adherence.

Or similarly, how can you ensure that your business community adhere to Data Protection legislation if you do not make your community aware of what the legislation involves?

"knows and employs the means of securing and promoting the interests of his country"

Two methods of securing and promoting interest that instantly came to mind are:

1. Aligning actions to Business Strategy

There is a common consensus that a single point of contact for business reporting is a good thing, and many people are aware of the associated benefits - however, a successful BI leader is able to translate and align these benefits to Business Strategy.
  • How does X fit into and support the overall business strategy?
  • How can Y help us achieve our strategic business goals?
  • Is Z most likely to assist in achieving a strategic business objective?
2. Having a solid Communications Strategy

Securing the support of the business is essential in your success as a BI leader. If the business have no confidence in your ability to deliver information to the right people, in the right place, in the right format, and at the right time, they will not support you. Having a solid communications strategy will aid in gaining, and sustaining business support.

Through effective communication you can ensure that the business are aware of any issues impacting your service. You can also provide measurement scorecards that allow the business to benchmark where we are against where we've been, alongside a view of where we're going. This strategic visibility, alongside a forum for business users to express their opinions, praise or concerns will aid in keeping the business supportive of your goals & objectives.

In closing, there is another Cicero quote that I feel should be shared here:

"No leader, either in war or in peace, could ever have performed important or beneficial actions unless he had gained the cooperation of his fellow men."

Build a great team around you, with proactive, knowledgeable and approachable people who share in your vision.

Finally, know, and remember, your audience - the business community - involve them in your actions, ensure you have their support, and continually communicate with them. Only then will you be able to maximise the benefit and performance of your objectives.

Monday, 5 April 2010

El Festival del IDQ Bloggers: April 2010

El Festival del IDQ Bloggers AKA The blog carnival for Information/Data Quality Bloggers is a monthly event devised by IAIDQ which compiles great data quality related blog posts. Each month the festival is hosted by a different data quality blogger, and I am privileged to host the festival during April. Let's get on with the show, and introduce the names behind the blog posts.

Henrik Liliendahl Sørensen

As well as making extremely frequent posts on his blog, Henrik is a Data Quality and Master Data Management Professional who is also engaged within Data Architecture solutions. He currently works for Omikron Data Quality. Click over to his blog for more information.

What better place to start in this edition of the Blog Carnival than this thought-provoking post posing the question of "What is Data Quality Anyway?". Like so many of Henrik's recent posts, this one leads to an interesting discussion within the comments.

Jim Harris

An independent consultant, speaker, writer and blogger with over 15 years of professional services and application development experience in multiple data quality related disciplines, Jim is a prolific contributor to the Data Quality community. Ever innovative in his blog posts, from his use of song lyrics, to his use of Shakespeare - not to be missed.

The Circle of Quality sees Jim guiding us through his thoughts on the interconnected business context continuum and the associated challenges with measuring quality throughout the cycle.

Oughtn't you Audit, hosted on the DataFlux blog, asks why more organisations are not fully auditing their data on a daily basis, and questions the all or nothing approach that is often seen.

Daragh O'Brien

The first person I met during my visit to the DMIQ conference in 2007, Daragh maintains multiple blogs, including DoBlog, which has been serving up posts centered on the fun side of Information Quality Management from an insider's perspective since 2006. He is also the founder of Castlebridge Associates, a leading niche Information Asset Management consulting company based in Castlebridge, Co. Wexford.

His St Patricks Day Special hosts an interesting photograph and uses it as a metaphor for information quality.

Sometimes it is the simplest things tells a humorous tale of how a home improvement DIY task led to darkness, all for the want of a piece of metadata.

In his company blog, Obscured by Clouds discusses the need to avoid having our vision of what it actually means to manage the quality and privacy of information obscured by a goal rush mentality around Cloud Computing.

Dylan Jones

Aside from being the 1st person within the DQ Internet community that I spoke to, Dylan Jones is the editor of the fantastic online community resource - Data Quality Pro, which is dedicated to helping data quality professionals take their career or business to the next level. He is a prolific author on the subject of data quality and related disciplines.

Does Your Project Suffer From Data Quality Product Myopia? looks at the pitfalls of centering your Data Quality efforts around a technology toolset, and offers some advice to aid the cure of this common condition.

Ken O'Connor

Having been described as a "Grizzled Veteran", and with almost 30 years' experience across the full development lifecycle, it's not hard to understand why. Ken started his blog to share his experience and to learn from the experience of others. More information about Ken is available on LinkedIn.

Applying Lateral Thinking to Data Quality takes inspiration from Edward De Bono and asks how applying an example of De Bono's work can help improve Information Quality.

Rich Murnane

A self-described "data geek" on his Twitter account, Rich has been running his blog, and sharing great posts for 5 years. He specialises in Oracle, Data Architecture & Data Quality.

De-Duplication of names using DataFlux is a St Patricks Day special, where Rich decided to take a stab at de-duplicating a list of "Patrick" names, and sharing his experiments with de-duplication using the DataFlux toolset.

Wednesday, 10 March 2010

Data Quality Principles within the PMO

According to Wikipedia, the Project Management Office (PMO) is:

"the department or group that defines and maintains the standards of process, generally related to project management, within the organization. The PMO strives to standardize and introduce economies of repetition in the execution of projects. The PMO is the source of documentation, guidance and metrics on the practice of project management and execution."

It then goes on to state that:

"A good PMO will base project management principles on accepted, industry standard methodologies such as PMBOK or PRINCE2. Increasingly influential industry certification programs such as ISO9000 and the Malcolm Baldrige National Quality Award (MBNQA) as well as government regulatory requirements such as Sarbanes-Oxley have propelled organizations to standardize processes. Organizations around the globe are defining, borrowing and collecting best practices in process and project management and are increasingly assigning the PMO to exert overall influence and evolution of thought to continual organizational improvement."

With this in mind it would seem sensible to suggest that a number of Data Quality Principles should be documented, and executed within the PMO space.

However, within some organisations, I have witnessed the PMO disregard fundamental Data Quality principles that, if applied at the start of a project, would have saved time and money spent on 'scrap & rework' or downstream effort to implement these principles once a project has gone live.

I'm a DQ Professional - What can I do?

You can define principles of data quality and position yourself as a trusted advisor to the PMO, consulting within each new project that either consumes, or produces, data/information. Your role will become more about prevention, improving confidence in data and ensuring that the correct governance is in place for each new project.

It sounds time consuming

But it doesn't have to be. By defining a number of principles that must be met to ensure a project can successfully be signed off, we can create guidelines that, if followed, can aid project success. Data quality is all too often an afterthought, or a one-off profiling exercise that will, time-permitting, be undertaken during the early stages of a project. We need to change this mindset, and embed a culture of data assurance & governance within each new project.

With this in mind, here are some principles to consider:

Principle #1 - Data Quality Assurance

Any project utilising data should ensure that its source data has been profiled and analysed, with any potential data quality issues identified. Actions to resolve issues should be documented, alongside any perceived risks with the current/future state of data.
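To give a feel for what a very basic profiling pass might look like, here is a minimal sketch in Python. It is not a substitute for a dedicated profiling tool; the function name, the sample records and the column names are all hypothetical and chosen purely for illustration.

```python
def profile(records, columns):
    """Return simple per-column statistics: missing count,
    completeness ratio and number of distinct non-blank values."""
    total = len(records)
    report = {}
    for col in columns:
        values = [r.get(col) for r in records]
        # Treat None and blank/whitespace-only strings as missing.
        present = [v for v in values
                   if v is not None and str(v).strip() != ""]
        missing = total - len(present)
        report[col] = {
            "missing": missing,
            "completeness": len(present) / total if total else 0.0,
            "distinct": len(set(present)),
        }
    return report


# Hypothetical customer address extract.
customers = [
    {"customer_id": 1, "postcode": "SW1A 1AA"},
    {"customer_id": 2, "postcode": None},
    {"customer_id": 3, "postcode": "  "},
    {"customer_id": 4, "postcode": "M1 1AE"},
]

report = profile(customers, ["customer_id", "postcode"])
for column, stats in report.items():
    print(column, stats)
```

Even a simple report like this gives the project something concrete to document: which columns are incomplete, by how much, and therefore where the perceived risks lie.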

Principle #2 - Data Lineage

Documenting data lineage will increase user confidence in data, and aid with the satisfaction of regulatory compliance. The items that should be documented within our Data Lineage exercise include:

Business Rules
Data Transfer Methods
Aggregation Rules
Load Schedules

Principle #3 - Data Dictionary

Both technical, and business metadata should be captured within the project scope. This again will increase confidence, and knowledge of data, data structures and information produced within the project.

Principle #4 - Reference Data Management

Any reference data utilised within a project should be identified and centrally managed. The process for maintaining reference data should be documented, and if necessary, should be associated with 'data stewards' responsible for its maintenance.

Principle #5 - Data Classification

Data Classification was something I touched on previously as a method to ensure that reports were efficiently stored. It is also a critical exercise to undertake in terms of Security & Access Control. Any reporting outputs that will be created within a project should be classified accordingly.

Principle #6 - DQ Team Engagement

To ensure adherence and satisfaction, the Data Quality team should be involved in the project sign-off process. Only when data quality/governance concerns have been satisfied should the project be signed off.

In Summary

In many organisations projects are often siloed in approach, with the scope of data quality/governance effort applied to each new project varying significantly, depending upon user requirements, time, and other factors. By involving the Data Quality team within each project we can start to build standards, add value and increase data confidence in new project implementation, while at the same time increasing business awareness of our function, confidence in our service and centralisation of our improvement efforts.

Thursday, 25 February 2010

The First Step on your Data Quality Roadmap

You may be about to embark on an exciting Data Quality initiative, full of enthusiasm and armed with the belief that you can change organisational culture and save the business from poor data quality.

In order to effectively roll out policies and procedures you may be brainstorming, and then formalising, a Data Quality Roadmap.

What is a Data Quality Roadmap?

It's a strategic enabler.

It allows you to combine your short-term and long-term goals.

It's a framework for progress.

It helps you to co-ordinate people, process and technology, and enables you to communicate where you are, and where you're heading, in a digestible and measurable way.

Sounds great, where should I start?

As George Santayana once wrote:

"Those who cannot remember the past are condemned to repeat it"

With this quote in mind, I would propose that the first step on any Data Quality Roadmap would be to understand what DQ improvement/management initiatives have previously been undertaken within your organisation.
  • Who was responsible for previous initiatives?
  • What processes & procedures have previously been implemented?
  • Where did they succeed or fail?
  • When did previous initiatives take place?
  • Why did they succeed or fail?
  • How were they received by the business?
Learn lessons from what has happened before, and use this historical analysis as a basis to implement strategic changes within how Data Quality is tackled in the future.

How will this exercise aid the future?

On your roadmap there may be an item such as "work with the business to create a common dictionary". Such an item may cause someone in the business to state: "We tried this before, and it didn't work".

Using the information gathered from our historical analysis of previous DQ initiatives, we can attempt to get to the root cause of why it didn't work. We can work with the business, gather their opinions, and move forward to creating a solution that does work. A solution which increases business confidence and allows us to achieve our strategic goals.

Thursday, 18 February 2010

Can motivations impact the state of data quality?

I recently finished reading a fascinating book about motivation, called Drive, written by Daniel H. Pink. The book guides us through traditional motivators that drive performance and goes on to discuss a paradigm shift occurring within factors that motivate us in life, and in the workplace.

The book got me thinking about data quality professionals and the things that motivate us within our work.

Let's look at an example

Within our organisation we have a team of enthusiastic, passionate data quality professionals.

Let's look at possible motivations within their roles:

Intrinsic Motivators
  • They are passionate about Data Quality
  • They want to rid the organisation of poor quality data
  • They strive to be the data experts within their organisation
Extrinsic Motivators
  • They need to hit objectives set by managers (linked to annual bonus payment)
  • They need to ensure the organisation adheres to regulatory compliance
  • They have a pressure to deliver ROI to the organisation
Based on the motivators above, let's look at two differing scenarios:

Scenario #1

The DQ team is given free rein within the organisation. They listen to the business, focus on their key issues, solve a number of data quality issues and become trusted advisors. The intrinsic motivators of the team members are completely satisfied, and the team is heralded as a success by the business community, therefore demonstrating ROI.

Scenario #2

The DQ team, who report directly to the CFO, are engaged in a CFO-sponsored program to become financial data experts and to ensure that the organisation achieves regulatory compliance. This long-term program was initiated due to a number of data quality issues that risked compliance failure. The DQ team have been supplied with a checklist that, once adhered to, will ensure that the organisation is fully compliant and the team will be deemed a success.

So what about the motivations?

In both scenarios the team are satisfying both intrinsic and extrinsic motivators. Scenario #1 saw the DQ team particularly satisfy their intrinsic motivators, whereas in scenario #2 their extrinsic motivators were challenged to ensure that compliance was met.

How do you feel motivating factors impact upon the performance of DQ activity within an organisation?

DQ as a checklist

Within scenario #2 it was noted that the DQ team work to ensure that a checklist is satisfied, which will in turn ensure that the organisation achieves compliance.

By placing emphasis on the pressures of successfully completing a checklist, do we risk a negative impact on the performance of DQ efforts within an organisation?

Yes, completion of the checklist in the example above would ensure regulatory compliance, but will it come at the cost of poor DQ, and the potential of cutting corners within areas that do not have to adhere to compliance? Let me know your thoughts.

Sunday, 14 February 2010

A balanced approach to scoring data quality: Part 6

I wanted to spend a little bit of time concluding this series by discussing how we could visualise the example metrics that were discussed in the past few blog posts.


Wikipedia defines a Dashboard as "an executive information system user interface that (similar to an automobile's dashboard) is designed to be easy to read". This is exactly what we want to achieve when creating a dashboard to present our DQ metrics.

The below diagram shows an example layout for the dashboard:

The 'summary, actions & contact information' section is important, and one which we haven't previously discussed. This section should consist of commentary that adds further context to the four metric sections of the scorecard: a summary of what has happened in the past month/quarter (since the last scorecard publication), alongside a summary of DQ Management actions to be undertaken in the coming weeks/months. Contact information should always be included so that readers can easily ask questions or request further assistance.

Remember the Metrics?

Within the previous posts in the series we looked at a number of example metrics which could be reported upon a DQ scorecard.

Now let's look at an example of how we could bring these metrics together onto our dashboard.

We can even drill down into aspects of the scorecard. For example, each metric within the 'Customer' Section could have an option for the viewer of the dashboard to drill down to gain further insight into that particular metric. The same functionality could exist in the 'Data Dimensions' section. At a high level, we can show a RAG status for each individual business department, or process, and from there we could allow the viewer of the dashboard to drill down in order to ascertain how that RAG status was derived, and which dimensions require particular improvement, or monitoring.
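As a rough sketch of how such a drill-down could be backed by data, the Python below derives a department's RAG status from its per-dimension scores and lists the dimensions that need attention. The thresholds, the "worst dimension wins" rule and the names `rag` and `department_status` are assumptions made for illustration; the series does not prescribe how a RAG status should be calculated.

```python
# Hypothetical thresholds (percentage of records passing); adjust to your own targets.
GREEN_AT = 95.0
AMBER_AT = 80.0
SEVERITY = {"GREEN": 0, "AMBER": 1, "RED": 2}


def rag(score):
    """Map a percentage score to a RAG status using the assumed thresholds."""
    if score >= GREEN_AT:
        return "GREEN"
    if score >= AMBER_AT:
        return "AMBER"
    return "RED"


def department_status(dimension_scores):
    """Return the overall status, the per-dimension detail and the dimensions to review.

    Assumption: the overall status is the worst status among the dimensions.
    """
    detail = {name: rag(score) for name, score in dimension_scores.items()}
    overall = max(detail.values(), key=SEVERITY.get, default="GREEN")
    needs_attention = sorted(name for name, status in detail.items() if status != "GREEN")
    return overall, detail, needs_attention


# Made-up scores for one department.
overall, detail, needs_attention = department_status(
    {"accuracy": 97.0, "consistency": 85.5, "duplication": 72.0}
)
print(overall, detail, needs_attention)
```

The high-level view would show only `overall`, while `detail` and `needs_attention` are what a viewer would see after drilling down.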

In Conclusion

The balanced approach to scoring data quality that has been discussed within this series can be used as both a vehicle to promote continuous improvement and as an effective performance management tool. I'd be keen to hear your stories of implementing DQ scorecards - successes, failures, lessons learned - so please leave a comment within the series, or contact me directly.

Related Posts

Part 1: Introduction

Part 2: Customer

Part 3: Internal Processes

Part 4: Financial

Part 5: Data Dimensions

Tuesday, 9 February 2010

A balanced approach to scoring data quality: Part 5

We've previously introduced the scorecard and discussed the 'customer', 'internal processes' and 'financial' perspectives. Now we're going to take a look at the final section of the scorecard, which deals with 'data dimensions'.

The sole reason I held this section back - preferring first to focus on other sections of the scorecard - is that this is the section you will be most familiar with. Profiling exercises have been the de facto standard for measuring data quality for many years. There are a number of tools on the market that will profile data in great detail. If you don't have the budget to buy a tool you could even build your own - Dylan Jones and Rich Murnane give us some great examples - which will allow you to profile your data against a multitude of dimensions of data quality.

What are dimensions of data quality?

When we talk about dimensions of data quality we are talking about aspects of quality such as accuracy, consistency, duplication, referential integrity and conformity. Some basic practical examples could include:
  • Table 'SALES' should contain only 1 record per 'TRANSACTION_ID'.
  • Field 'AMOUNT' in 'SALES' table should always contain a numeric value
  • Field 'CUSTOMER_NAME' should never be NULL
  • Table 'CUSTOMER' should contain a corresponding record in our 'CUSTOMER_ADDRESSES' table
  • Field 'Post Code' in 'CUSTOMER_ADDRESSES' table adheres to Royal Mail PAF Standards
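As a minimal sketch of how the first three rules above could be checked, the Python below counts the rows that break each rule in a small in-memory sample. The sample rows and the helper names `is_numeric` and `profile_sales` are made up for illustration; a profiling tool or a set of SQL queries would do the same job against real tables.

```python
from collections import Counter

# Hypothetical sample rows; the field names follow the examples above.
SALES = [
    {"TRANSACTION_ID": 1, "AMOUNT": "19.99", "CUSTOMER_NAME": "A Smith"},
    {"TRANSACTION_ID": 2, "AMOUNT": "abc", "CUSTOMER_NAME": "B Jones"},
    {"TRANSACTION_ID": 2, "AMOUNT": "5.00", "CUSTOMER_NAME": None},
    {"TRANSACTION_ID": 3, "AMOUNT": "7.50", "CUSTOMER_NAME": "C Brown"},
]


def is_numeric(value):
    """Return True if the value can be read as a number."""
    try:
        float(value)
        return True
    except (TypeError, ValueError):
        return False


def profile_sales(rows):
    """Count the rows that break each of three simple rules."""
    id_counts = Counter(row["TRANSACTION_ID"] for row in rows)
    return {
        # Rows sharing a TRANSACTION_ID with at least one other row.
        "duplicate_transaction_id": sum(
            1 for row in rows if id_counts[row["TRANSACTION_ID"]] > 1
        ),
        # Rows whose AMOUNT cannot be parsed as a number.
        "non_numeric_amount": sum(1 for row in rows if not is_numeric(row["AMOUNT"])),
        # Rows with a missing CUSTOMER_NAME.
        "null_customer_name": sum(1 for row in rows if row["CUSTOMER_NAME"] is None),
    }


print(profile_sales(SALES))
```

The referential integrity and post code rules are left out here because they need a second table and an address reference source respectively.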

By profiling data against our data quality dimensions we can ascertain the number of records that are deemed OK, against those that are deemed to contain potential issues to review. However, in order to maximise the benefit of a data profiling exercise we should look towards answering the following question:

How well are our data quality dimensions performing when aligned to business rules and expectations?

Why Align to business rules and expectations?

Primarily because it allows us to add context to data. Context matters because it determines how the business interprets the data and, ultimately, how the data is used. What is deemed to be 'Good Quality Data' may differ between two business users, or two different business departments.

This is because each user/department may use data in a different way, and therefore have different requirements of what is essential to ensure that data is fit for the purpose it is intended.

How to ascertain?

In order to understand the rules and expectations that should be placed upon data to ensure that it is of good quality and fit for purpose, we will need to speak to the business. If you have a data steward network, or have previously identified who the subject matter experts are within the business, ask them. If you haven't yet identified the key stakeholders and subject matter experts for business data, this would be a great time to kick start that exercise, and reap instant benefits.
  • Ask the business for their critical data items.
  • Which data items do they need to fulfill their responsibilities?
  • What should these data items look like?
  • Which rules should be applied?
  • When should data be available?

A practical example

The Sales Reporting team provide sales information to the executive team on a weekly basis. Their reporting pack consists of reports such as:
  • Total Volume/Value of Sales
  • Sales Per Store
  • Sales Per Product
  • Top/Bottom Performing Sales Staff

A number of critical data items were identified in order for this reporting pack to be built upon good quality data. Based upon these critical data items we identified a number of rules and expectations that must be met during a data profiling exercise in order for Sales data to be deemed fit for purpose.
  • The sales transactions table should only contain 1 record per 'transaction_id'
  • The sales transactions table should contain data for the previous six months, up to and including the previous working day
  • Each sale must be related to a 'store_id' and 'seller_id' that can be referenced back to our 'stores' and 'sellers' reference data.
  • The 'sales_amount' should never be a negative figure
  • A seller should only ever be attached to one store at a time. If a seller moves to a new store the previous seller/store relationship should be end dated.

If a record adheres to the above rules we can flag it as ‘GREEN’, meaning that it adheres to business rules and expectations on data performance. Once the recordset has been aggregated we can measure the percentage of records that are of ‘Good Quality’ for Sales Reporting and visualise this upon our scorecard.
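The flagging and aggregation step could be sketched as follows. This is only an illustration under stated assumptions: the sample data, the function names and the 'REVIEW' label for failing records are hypothetical, and only three of the five rules are checked (one record per transaction, valid store and seller references, no negative amounts). The date coverage and seller/store end-dating rules would need extra fields.

```python
# Hypothetical reference data and transactions, for illustration only.
STORES = {"S1", "S2"}
SELLERS = {"E1", "E2"}
TRANSACTIONS = [
    {"transaction_id": "T1", "store_id": "S1", "seller_id": "E1", "sales_amount": 120.0},
    {"transaction_id": "T2", "store_id": "S9", "seller_id": "E2", "sales_amount": 45.0},
    {"transaction_id": "T3", "store_id": "S2", "seller_id": "E2", "sales_amount": -10.0},
    {"transaction_id": "T4", "store_id": "S2", "seller_id": "E1", "sales_amount": 80.0},
    {"transaction_id": "T4", "store_id": "S2", "seller_id": "E1", "sales_amount": 80.0},
]


def flag_records(rows, stores, sellers):
    """Return (row, status) pairs; GREEN means every checked rule passed."""
    # Count how often each transaction_id appears so duplicates can be spotted.
    seen = {}
    for row in rows:
        seen[row["transaction_id"]] = seen.get(row["transaction_id"], 0) + 1

    flagged = []
    for row in rows:
        passes = (
            seen[row["transaction_id"]] == 1   # one record per transaction_id
            and row["store_id"] in stores      # store exists in reference data
            and row["seller_id"] in sellers    # seller exists in reference data
            and row["sales_amount"] >= 0       # amount is never negative
        )
        flagged.append((row, "GREEN" if passes else "REVIEW"))
    return flagged


def percent_green(flagged):
    """Percentage of records flagged GREEN (0.0 for an empty recordset)."""
    if not flagged:
        return 0.0
    green = sum(1 for _, status in flagged if status == "GREEN")
    return 100.0 * green / len(flagged)


flagged = flag_records(TRANSACTIONS, STORES, SELLERS)
print(f"{percent_green(flagged):.1f}% of records are GREEN")
```

The resulting percentage is the figure that would be plotted in the 'Data Dimensions' section of the scorecard.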

In Conclusion

The key takeaway point from this section of the scorecard is that we are attempting to benchmark data quality against business expectations. Profiling data quality without taking into account the context in which the data will be used may result in further data quality issues, and misuse of data.

In my next (and final) post of this series we'll go through an example of putting all the sections together into a usable dashboard for the business.

Related Posts

Part 1: Introduction

Part 2: Customer

Part 3: Internal Processes

Part 4: Financial

Part 6: The Dashboard

Thursday, 4 February 2010

A balanced approach to scoring data quality: Part 4

We've determined that the scorecard is all about continuous improvement. We've discussed measuring the customer perception, as well as measuring our performance against our internal processes. Today I wanted to cover the 'Financial' perspective of the scorecard, which will allow us to begin to demonstrate the Return on Investment from Data Quality Management.

Data Quality has often come under the spotlight as being a difficult discipline in which to demonstrate quantifiable financial value. Many people have had issues gaining funding for DQ initiatives because of the purely speculative nature of the ROI. The aim here is to discuss ways in which we can go beyond this speculation, and look to financially quantify data quality within an organisation.

Quantify This

Our Internal Processes are often hard to quantify in financial terms. For instance, a large amount of time and effort has been applied to ensure that the business community has a definitive business glossary, containing all the terminology and business rules that they use within their reporting and business processes. This has been published, and highly praised, throughout the organisation.

However, how can we begin to ascertain the true financial benefit of this activity? To do this we would need to interview the whole business community, asking them to cast their collective minds back to the world before the business glossary. Asking them:
  • How long did they spend chasing the correct definition?
  • How many reports did they generate with an incorrect definition?
  • How much scrap and rework did they undertake because of this incorrect definition?
It is generally accepted that an agreed and published business glossary is beneficial to the community, but putting financial benefit against it can be extremely difficult.

Quantify That

Two key metrics I want to delve into are:
  • Known cost of poor data quality
  • Known saving due to DQ Management
Both of these metrics are useful in financially quantifying Data Quality.

Known cost of poor data quality

I like to think of the 'known cost of poor data quality' as a reactive metric. What I mean by this is that the cost of poor data quality can only truly be ascertained after an issue has occurred. If an issue has not yet occurred, the cost can only be pure speculation. As part of our reaction to a data quality issue we should undertake an impact assessment.

This impact assessment will ask, among other things:
  • How long has the issue been a problem?
  • Who has it impacted during that time?
  • What workarounds were undertaken?
The answers to these questions will aid us in ascertaining a known cost of the issue.

An example of the cost

We discussed Business definitions in the previous section, so let's now take an example from another one of our Internal Processes: Data Quality Issue Resolution. The below issue was raised to a newly established Data Quality team by an MI analyst within a financial services organisation:

"We have an Issue with Product Names within our datamart. A large number of records are coming through with incorrect or empty product names, which is causing havoc with my reporting. The product code is correct, but the people I send my reporting pack to won't understand the code. I'm currently bringing the data from the datamart into an Access database, and joining the table to my lookup table that contains all the correct product codes/names. This is taking me about an hour a day because of the amount of data I have to import/export. It's been like this for 4 months but I didn't know who to contact about the issue. Thanks for your help!"

This issue was promptly resolved by ensuring that reference data was updated to reflect true products. But how can we begin to add a financial perspective to this issue?

Using ITJobsWatch, the British IT Jobs market tracker, I noticed that the average salary of an MI Analyst was £32,500. Based upon this salary I estimated the cost of an hour a day each working day for 4 months.

Monthly: £2,708
Weekly: £625 (based on 4.33 weeks in a month)
Daily: £125 (based on 5 working days per week)
Hourly: £16.66 (based on 7.5 hour working day)

4 months at 4.33 weeks = 17.32 weeks
5 days per week for 17.32 weeks = 86.6 days
1 hour a day for 86.6 days at £16.66 per hour = £1,442.75

In this example, the cost came from 86.6 hours spent firefighting instead of on value-adding activity. If the MI Analyst hadn't reported the issue, and had continued to firefight for a full year, they would have spent 259.8 hours, or more than six working weeks, firefighting. A scary thought.
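The calculation above can be wrapped in a small helper so that it can be reused in other impact assessments. This is a sketch that uses the same assumptions as the worked example (4.33 weeks per month, 5 working days per week, a 7.5 hour working day); the function name `known_cost` is made up for illustration. Because it does not round the intermediate monthly, weekly, daily and hourly figures, its result lands within a couple of pounds of the figure above rather than matching it to the penny.

```python
def known_cost(annual_salary, hours_lost_per_day, months,
               weeks_per_month=4.33, days_per_week=5, hours_per_day=7.5):
    """Estimate the hours lost and the salary cost of a daily workaround."""
    monthly = annual_salary / 12
    weekly = monthly / weeks_per_month
    daily = weekly / days_per_week
    hourly = daily / hours_per_day

    working_days = months * weeks_per_month * days_per_week
    hours_lost = working_days * hours_lost_per_day
    return hours_lost, hours_lost * hourly


# The MI Analyst example: 1 hour a day for 4 months on a 32,500 salary.
hours, cost = known_cost(32_500, hours_lost_per_day=1, months=4)
print(f"{hours:.1f} hours lost, costing about GBP {cost:,.2f}")

# The same workaround left unreported for a full year.
year_hours, year_cost = known_cost(32_500, hours_lost_per_day=1, months=12)
print(f"{year_hours:.1f} hours lost over a year, about GBP {year_cost:,.2f}")
```

Only salary is counted here, so the result is a floor rather than a full cost: overheads and the knock-on impact on report consumers are not included.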

Known saving due to DQ Management

'Known saving due to DQ Management' is, on the other hand, a proactive metric for quantifying financial benefit. It measures the savings gained when DQ management efforts capture problems before they become issues.

Caution should be taken to ensure that speculative savings are not mistaken for known savings.

For instance, you could speculate that because a data quality issue relating to customer address details - that would have impacted marketing, billing, customer care and customer complaints - was fixed prior to impacting customer mailouts, you saved the organisation £5,000,000. Not to mention the potential bad publicity and customer churn. What qualifies you to suggest this monetary value?

The best way to ensure that this metric relates to true savings is to ensure that DQ management efforts are closely aligned to business processes.

An example of the saving

For instance, the Data Quality Firewall initiative that I wrote about previously discovered that a UK retail bank were about to overpay their staff by £200,000 in sales incentive payments due to duplicate sales transactions in their processing tables. The initiative resulted in these duplicate records being captured prior to incentive payments being calculated. Our DQ initiative saved the organisation £200,000 by performing one simple data profiling technique (FACT). Not to mention the savings in scrap & rework, and the potential trade union/media involvement that a mistake and subsequent clawing back of employees' take-home pay could provoke (SPECULATION).

In Conclusion

When looking at measuring Data Quality from a financial perspective it is important to look at it from the perspective of both the 'known cost of poor data quality' and the 'known saving due to DQ Management'.

We know, and accept, that it may not be possible to truly quantify all aspects of data quality management. However, starting to quantify data quality in terms of costs & savings wherever you can will help to raise the profile of both your data quality management activities and the need for fit-for-purpose data within your organisation.

Related Posts

Part 1: Introduction
Part 2: Customer
Part 3: Internal Processes
Part 5: Data Dimensions
Part 6: The Dashboard

Monday, 1 February 2010

A balanced approach to scoring data quality: Part 3

Today I wanted to discuss the ‘Internal Processes’ section of the scorecard. In case you missed the previous parts of the series, you can read the introduction here, and about the ‘customer’ section here.

Perform, Perform, Perform

Throughout the business world people are measured upon their performance. How well do they carry out their responsibilities? Do they hit their objectives? Do they adhere to any applicable SLAs? The Data Quality team should be no different and we should look towards measuring our performance against our internal processes.

Our Internal processes are the procedures and tasks we follow to ensure data quality is managed, and communicated, throughout the business community.

Consider the following Internal Processes:

  • Publishing and Review of a Business Terminology / Data Dictionary
  • Resolution and Communication of DQ issues in a timely manner
  • Identification of appropriate system, data & report ownership

All of the above are critical processes within the day to day responsibilities of a Data Quality team. If we under-perform in delivering any of these processes, it will have a knock-on impact on how data quality management is delivered within an organisation. In some cases, poor performance within our internal processes could even be a contributing factor to poor data quality.

For example, a Product Manager has noticed that sales data for their product is not accurate in the data warehouse, and has raised a data quality issue with your team. The data warehouse is also used by the Finance team, and is currently being used to provide financial figures for a last minute board meeting. The data quality issue was raised yesterday, and is currently being investigated, but there has been no communication to the business community to advise them of the issue. The board of directors are now looking at inaccurate data, questioning the figures and wondering whether they can trust the data.

How can we measure our Internal Processes?

We can measure the performance of our internal processes by benchmarking them against our objectives, or against targets based upon our objectives. As an example, let’s take the process of ‘Resolution and Communication of DQ issues in a timely manner’.


All known Data Quality issues should be immediately communicated to the business community, and be resolved within 3 days of being raised


DQ Issues raised – 125
DQ Issues resolved within 3 days – 70 (56%)
Issues communicated to community – 100 (80%)
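The percentages in these measures are simple ratios against the number of issues raised. As a small sketch (the function name `sla_summary` and the dictionary keys are assumptions for illustration), they could be produced like this:

```python
def sla_summary(raised, resolved_in_sla, communicated):
    """Summarise issue resolution and communication against issues raised."""
    def pct(part):
        # Whole-number percentage of issues raised; 0 if nothing was raised.
        return round(100 * part / raised) if raised else 0

    return {
        "resolved_in_sla_pct": pct(resolved_in_sla),
        "communicated_pct": pct(communicated),
        "not_resolved_in_sla": raised - resolved_in_sla,
        "not_communicated": raised - communicated,
    }


summary = sla_summary(raised=125, resolved_in_sla=70, communicated=100)
print(summary)
```

Reporting the counts of misses alongside the percentages makes it easier to follow up on the individual issues that fell outside the target.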

Upon seeing the measures above, we could ask:

“Why were only 56% of DQ issues resolved within our target time period? Do we need to involve more resources to fix issues? Do we need to adjust the target SLA?”


“All issues were due to be communicated to the business community immediately. Why were 25 issues not communicated? Do we need to set up reminders? Was no one able to pick up the issues?”

In Conclusion

As Satesh suggested in a comment to my previous post: “What gets measured improves”. This is exactly what we are trying to achieve from a scorecard. Poor performance within our Internal Processes could have a knock-on effect on how our customers perceive DQ management. Therefore, a process of continuous measurement, analysis and improvement is required, in order to ensure that we do not get complacent and adopt poor DQ Management habits.

The next post in this series will deal with the ‘Financial’ section of the scorecard, and we’ll look into how we can begin to measure the financial impact that DQ management can have on an organisation.

Related Posts

Part 1: Introduction
Part 2: Customer
Part 4: Financial
Part 5: Data Dimensions
Part 6: The Dashboard

Friday, 29 January 2010

A balanced approach to scoring data quality: Part 2

In a previous post I introduced the need for a balanced approach to scoring Data Quality Management. Today I wanted to spend some time talking about the 'Customer' section of the scorecard.

The aim of this section is to discuss methods for allowing us to measure how we are perceived by our customers.

Perception Matters

Whether you think of yourself as one or not, we are service providers. We provide a service to the business community. We serve the business by ensuring data remains at a level of quality that is fit for its purpose. We ensure that data is correctly defined, owned, and managed. If the business community have issues, we strive to resolve these issues in an efficient & timely manner.

Perception is a key driver in the way that the business community reacts to your service. If you are perceived as knowledgeable & helpful, the business community will be keen to utilise your services. If you are perceived as slow & unapproachable, will the business community use your service, or will they look to source the answers/advice they need from elsewhere?

During times when companies are cutting costs across the board, do you want to be perceived as a cost to the business, or an asset to the business community?

We need to define objectives and targets

Before we can begin to measure the perception of our services within the business community, we need to ensure that we have defined objectives.

The key questions to ask yourself are:

What do we want to achieve?
What does success look like?

Some example objectives relating to customer perception could be:
  • The DQ team are seen as data specialists
  • The Business Community are satisfied with the resolution of Data Quality Issues
  • Data Quality issues/queries are resolved in a timely manner
  • The Business Community would recommend services of the DQ team to their colleagues

Setting ourselves time-bound performance targets against these objectives allows us to measure where we are against where we want to be.

For instance, in Q1 2010 we may be a small team, so our target would be to ensure 60% of the business community felt that issues were resolved in a timely manner. However, in Q2 2010 we plan to expand, and due to extra resource our target would be to ensure 70% felt that their issues were resolved in a timely manner.

The setting of Objectives and Targets should not be a one-time exercise. Continual review will allow us to consistently strive to provide a better service.

How can we measure perception?

In order to measure the perception of our services within the business community, we need to capture their feedback, and measure this feedback against our pre-defined objectives.

There are a number of ways in which we could capture feedback, including:

1. Satisfaction Surveys

When a member of the business community uses the DQ service, you should send them a ‘Satisfaction Survey’ and gather their feedback.

If we add up the scores 1-5 (V Dissatisfied to V Satisfied) for all participants and divide by the number of participants, we get a ‘business community’ score to measure against each objective. If we capture information such as the ‘department’ of each surveyed employee we could also measure satisfaction at a departmental level. Are we serving one part of the business better than another part? Why is this? Better knowledge of their data? Better relationships?
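The scoring described above amounts to a simple average, overall and per department. A minimal sketch, using made-up responses and hypothetical function names:

```python
from collections import defaultdict

# Hypothetical survey responses: (department, score),
# where 1 = very dissatisfied and 5 = very satisfied.
RESPONSES = [
    ("Finance", 4),
    ("Finance", 5),
    ("Marketing", 2),
    ("Marketing", 3),
    ("Sales", 4),
]


def community_score(responses):
    """Average score across all participants, or None if nobody responded."""
    scores = [score for _, score in responses]
    return sum(scores) / len(scores) if scores else None


def score_by_department(responses):
    """Average score for each department that responded."""
    grouped = defaultdict(list)
    for department, score in responses:
        grouped[department].append(score)
    return {dept: sum(scores) / len(scores) for dept, scores in grouped.items()}


print(community_score(RESPONSES))
print(score_by_department(RESPONSES))
```

A gap between departments, such as the one in this made-up sample, is the prompt to go and ask the questions above.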

2. Interviews

During interviews you can take a similar approach to the survey method discussed above. The interview method is a more personal approach than surveys and may allow more detailed information to be extracted. For instance, the interviewee may be very dissatisfied, so we can utilise the one-to-one time to get to the root cause of this dissatisfaction.

3. Was this useful?

If you store documentation, or business definitions on an intranet environment – like a Wiki – you may wish to include a control on the webpage to gain customer feedback.

Was this useful? yes/no

But I have problems gaining feedback

If you have problems gaining feedback from the business, you need to find ways to encourage communication. Rewarding feedback with something like "A free coffee to anyone who fills in this survey" really is a great way to generate more responses to surveys.

In Conclusion

Measuring the perception of the customer is a critical part of scoring DQ Management. We all strive for continuous improvement and a key measure to aid continuous improvement is customer perception. If the business community start off happy, but after 6 months they are dissatisfied with DQ efforts within the organisation – we need to be aware, and react to this. Without the buy-in from the customer, where would we be?

In the next few posts we will go on to discuss the other sections of the scorecard before we look at how it all fits together.

Related Posts

Part 1: Introduction

Part 3: Internal Processes

Part 4: Financial

Part 5: Data Dimensions

Part 6: The Dashboard