Friday, 29 January 2010

A balanced approach to scoring data quality: Part 2

In a previous post I introduced the need for a balanced approach to scoring Data Quality Management. Today I wanted to spend some time talking about the 'Customer' section of the scorecard.

The aim of this section is to discuss methods for measuring how we are perceived by our customers.

Perception Matters

Whether you think of yourself as one or not, we are service providers. We provide a service to the business community. We serve the business by ensuring data remains at a level of quality that is fit for its purpose. We ensure that data is correctly defined, owned, and managed. If the business community have issues, we strive to resolve these issues in an efficient & timely manner.

Perception is a key driver in the way that the business community reacts to your service. If you are perceived as knowledgeable & helpful, the business community will be keen to utilise your services. If you are perceived as slow & unapproachable, will the business community use your service, or will they look to source the answers/advice they need from elsewhere?

During times when companies are cutting costs across the board, do you want to be perceived as a cost to the business, or an asset to the business community?

We need to define objectives and targets

Before we can begin to measure the perception of our services within the business community, we need to ensure that we have defined objectives.

The key questions to ask yourself are:

What do we want to achieve?
What does success look like?

Some example objectives relating to customer perception could be:
  • The DQ team are seen as data specialists
  • The Business Community are satisfied with the resolution of Data Quality Issues
  • Data Quality issues/queries are resolved in a timely manner
  • The Business Community would recommend the services of the DQ team to their colleagues

Setting ourselves time-bound performance targets against these objectives allows us to measure where we are against where we want to be.

For instance, in Q1 2010 we may be a small team, so our target would be to ensure 60% of the business community felt that issues were resolved in a timely manner. However, in Q2 2010 we plan to expand, and due to extra resource our target would be to ensure 70% felt that their issues were resolved in a timely manner.

The setting of Objectives and Targets should not be a one-time exercise. Continual review will allow us to consistently strive to provide a better service.

How can we measure perception?

In order to measure the perception of our services within the business community, we need to capture their feedback, and measure this feedback against our pre-defined objectives.

There are a number of ways in which we could capture feedback, including:

1. Satisfaction Surveys

When a member of the business community uses the DQ service, you should send them a ‘Satisfaction Survey’ and gather their feedback.

If we sum the scores 1-5 (Very Dissatisfied to Very Satisfied) across all participants and divide by the number of participants, we get a 'business community' score to measure against each objective. If we also capture information such as the department of each surveyed employee, we can measure satisfaction at a departmental level. Are we serving one part of the business better than another? Why is this? Better knowledge of their data? Better relationships?
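As a rough sketch of that calculation, assuming each response is captured as a (department, score) pair — a hypothetical shape; adapt it to however your survey tool exports results:

```python
from collections import defaultdict

def satisfaction_scores(responses):
    """Average 1-5 survey scores overall and per department.

    `responses` is a list of (department, score) tuples; the shape is
    an illustrative assumption, not a prescribed format.
    """
    overall = sum(score for _, score in responses) / len(responses)
    by_dept = defaultdict(list)
    for dept, score in responses:
        by_dept[dept].append(score)
    per_dept = {d: sum(v) / len(v) for d, v in by_dept.items()}
    return overall, per_dept

overall, per_dept = satisfaction_scores(
    [("Marketing", 5), ("Marketing", 4), ("Finance", 2), ("Finance", 3)]
)
# overall = 3.5; per_dept = {"Marketing": 4.5, "Finance": 2.5}
```

The per-department breakdown is what surfaces the "are we serving one part of the business better than another?" question.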

2. Interviews

During interviews you can take a similar approach to the survey method discussed above. The Interview method is a more personable approach than surveys and may allow for further detailed information to be extracted. For instance, the interviewee may be very dissatisfied, so we can utilise the one-to-one time to get to the root cause of this dissatisfaction.

3. Was this useful?

If you store documentation, or business definitions on an intranet environment – like a Wiki – you may wish to include a control on the webpage to gain customer feedback.

Was this useful? yes/no

But I have problems gaining feedback

If you have problems gaining feedback from the business, you need to find ways to encourage communication. Rewarding feedback with something like "A free coffee to anyone who fills in this survey" really is a great way to generate more responses to surveys.

In Conclusion

Measuring the perception of the customer is a critical part of scoring DQ Management. We all strive for continuous improvement and a key measure to aid continuous improvement is customer perception. If the business community start off happy, but after 6 months they are dissatisfied with DQ efforts within the organisation – we need to be aware, and react to this. Without the buy-in from the customer, where would we be?

In the next few posts we will go on to discuss the other sections of the scorecard before we look at how it all fits together.

Related Posts

Part 1: Introduction

Part 3: Internal Processes

Part 4: Financial

Part 5: Data Dimensions

Part 6: The Dashboard

Wednesday, 27 January 2010

A balanced approach to scoring data quality

Traditionally, a Data Quality dashboard would attempt to measure the quality of data against a number of data quality dimensions, such as:

Completeness, Accuracy, Consistency, Uniqueness, Conformity, Timeliness

Depending on how data performed against each dimension it would be given a % score, and typically be displayed on an intuitively designed dashboard with a Red/Amber/Green status.

As a data quality practitioner I generally found the information displayed on these dashboards useful. It allowed me to target areas where dimensions were below a desired level, and undertake further analysis into the root causes.

Problems with this approach

Looking at completeness of our datasets is great, but is it really important to us that ‘Address_Line_2’ in ‘CustomerAddresses’ is only 47% complete?

Zip code is 76% accurate but how do we measure accuracy? 90210 is an accurate zip code, but I live in London, not Beverly Hills.

It’s all about the context

In his 1890 book, Principles of Economics, Alfred Marshall commented that “In common use almost every word has many shades of meaning, and therefore needs to be interpreted by the context.”

Applying context to a data quality measurement allows us to assess the impact of data quality in association with a business purpose. This allows us to incorporate business-relevant metrics into our DQ scoring dashboard.

For instance, Marketing require Tel_No, Address_1, Town, Country, Postcode & E-Mail to be populated. The postcode should adhere to Royal Mail standards if a UK postcode, and should be blank if Country = Ireland. The E-Mail should be in a valid e-mail address format.

If a record adheres to the above rules we can flag it as ‘GREEN’, and once the recordset has been aggregated we can measure % records that are of ‘Good Quality’ for Marketing.
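A sketch of how such rules might be checked, using the field names from the Marketing example; the postcode and e-mail patterns are deliberately simplified stand-ins for full Royal Mail and e-mail validation:

```python
import re

# Simplified, illustrative patterns - real validation against Royal
# Mail standards is more involved than a single regex.
UK_POSTCODE = re.compile(r"^[A-Z]{1,2}\d[A-Z\d]? ?\d[A-Z]{2}$")
EMAIL = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

def marketing_status(record):
    """Return 'GREEN' if a record passes Marketing's rules, else 'RED'."""
    required = ["Tel_No", "Address_1", "Town", "Country", "Postcode", "E_Mail"]
    if record.get("Country") == "Ireland":
        # Postcode must be blank for Ireland, so drop it from the
        # completeness check.
        if record.get("Postcode"):
            return "RED"
        required.remove("Postcode")
    if not all(record.get(f) for f in required):
        return "RED"
    if record.get("Country") == "UK" and not UK_POSTCODE.match(record["Postcode"]):
        return "RED"
    if not EMAIL.match(record["E_Mail"]):
        return "RED"
    return "GREEN"

records = [
    {"Tel_No": "020 7946 0000", "Address_1": "1 High St", "Town": "London",
     "Country": "UK", "Postcode": "SW1A 1AA", "E_Mail": "jo@example.com"},
    {"Tel_No": "", "Address_1": "2 Main St", "Town": "Cork",
     "Country": "Ireland", "Postcode": "", "E_Mail": "pat@example.ie"},
]
pct_good = 100 * sum(marketing_status(r) == "GREEN" for r in records) / len(records)
# The second record fails completeness (no Tel_No), so pct_good = 50.0
```

Aggregating the GREEN flags over the recordset gives the '% Good Quality for Marketing' measure described above.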

But there are limitations

The above methods have consistently been implemented very well by many of the large players in the Data Quality market. However, I still don’t feel completely satisfied by what the dashboard is communicating.

Yes, technically, the data can be measured, and with the context of business rules applied we can measure whether the data is fit for the intended business purpose. But I still feel that we’re only touching the surface of measuring data quality within an organisation.

A Balanced approach

I propose that we have matured beyond this traditional method of measuring data quality. In order to fully measure the impact of Data Quality Management initiatives within an organisation we need to look above and beyond data dimensions, and start to also look towards:
  • the organisation's (our customers') perception of data quality efforts
  • the understanding of data quality management within the organisation
  • the cost benefit we are delivering to the organisation.
Only when we understand all of these things can we measure the true impact of Data Quality Management.

You may, or may not, have heard the term ‘Balanced Scorecard’ before. One widely used definition states that a Balanced Scorecard “was originated by Drs. Robert Kaplan (Harvard Business School) and David Norton as a performance measurement framework that added strategic non-financial performance measures to traditional financial metrics to give managers and executives a more 'balanced' view of organizational performance.”

An adapted balanced scorecard approach, such as the diagram below, would suit the needs of achieving a balanced view of scoring Data Quality.

The four sections that make up the scorecard are equally important to us in allowing the performance of Data Quality Management to be measured.

Over the coming weeks I want to delve deeper into this subject, taking each section of the scorecard and discussing key metrics that we can assess, and how we can then slot all the sections together to build a balanced scorecard within your organisation.

Monday, 25 January 2010

A DQ Firewall to protect you and your sales force

I see a Data Quality firewall as a tactical approach to Data Quality, typically driven by a departmental need to ensure that data is fit for the purpose that it is intended. A good practical example would be within a Commissions department. The Commissions department is responsible for payment of commissions to the sales force, perhaps on a monthly basis, dictated by rules relating to sales and employee eligibility.

For example, an employee should be employed full-time, have made more than 50 sales in the current month, and have had a high level of customer satisfaction feedback.

If an employee is eligible, a payment will be calculated based upon which products they have sold. Some products will pay higher commission than others, and the total amount owed will be paid during the monthly payment run, as per their salary.

Why the need for a firewall?

As you can see from the above example, we have the need for data that is:

1. Timely (we have received all data when we expected to receive it)
2. Complete (data contains everything needed to calculate a commissions payment)
3. Accurate (data represents true values)

These Data Quality dimensions must be met to expectation in order to ensure that all commission payments for employees are correct.

Legal Action

An article found on the Information Week website in December 2008 stated that Sprint, the wireless telecoms carrier, is facing a class-action lawsuit over allegations that it shortchanged employees out of commissions totaling more than $5 million.

It turns out that employees from their retail stores were not receiving full commission on products sold. This was due to failed integration between Sprint & Nextel systems following the merger of the two companies in 2005, three years previously.

Practical example of a DQ Firewall

The Sprint issue (and impending court case) could have been avoided if an automated data quality firewall had been implemented around the system used to calculate commission payments. The below diagram helps to explain how the process could operate:


Source data may contain data such as:

  • Sales (items actually sold by employees)
  • Mystery Shopper scores (a 3rd party company rating service/sales process)
  • Referrals (any referrals made by sales staff)
  • Customer Satisfaction (survey scores from customer satisfaction)

Firstly, data would typically be loaded into some form of data warehouse. The system used to calculate commissions would then take this information into a staging area, where any required manipulation/aggregation would occur. Once staging tables have been populated, the data would be loaded into the database that is used to calculate commission payments. Commission Payments would then be derived from this data and the business rules applied during the process.

You will notice that the process contains 3 checkpoints where we ask the question:

“Is data fit?”

Checkpoint 1: Have we received all the data that we expected?

And, is it in the format we expect, containing accurate values?

For example, a Commissions analyst would typically know which products we pay commission upon. Therefore we should monitor data received per product. Implementing Lower & Upper control limits based upon expectations would be a good way to add a RAG status to each product.
  • Why have we received 0 sales data for this product?
  • Historically we have received between 10,000-15,000 sales of this product per month, however, this month we only have data for 500 sales, why is this?
  • We seem to have duplicate rows for some sales – is this correct?
  • Typically we sell £25,000 of this product monthly, but this month we’ve only sold £3,000, why is this?
  • We’ve received data for products that don’t seem to match our products reference data. Why is this?
Pro-active analysis such as the examples above, asking questions of the data before kicking off the commission calculation process, will ensure that the process runs correctly first time. It allows detection of data anomalies, missing data feeds, and duplication that might otherwise go undetected and cause inaccuracies in the payment calculations.
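A minimal sketch of such a control-limit check; the limits and the 10% 'amber' tolerance are illustrative assumptions, and in practice you would derive them from historical volumes per product:

```python
def rag_status(count, lower, upper):
    """RAG status for a monthly sales count against control limits.

    `lower`/`upper` would normally come from historical volumes for
    the product; the 10% tolerance band is an illustrative choice.
    """
    if lower <= count <= upper:
        return "GREEN"
    # Just outside the band: flag for review rather than block the run.
    if 0.9 * lower <= count <= 1.1 * upper:
        return "AMBER"
    return "RED"

# Product A normally sees 10,000-15,000 sales a month.
print(rag_status(12_000, 10_000, 15_000))  # GREEN
print(rag_status(9_500, 10_000, 15_000))   # AMBER - slightly light, review
print(rag_status(500, 10_000, 15_000))     # RED - investigate before paying
```

Running this per product before the calculation kicks off is what turns "we only have 500 sales, why is this?" into a question asked before payment, not after.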

Checkpoint 2: Has all data successfully navigated staging?

When staging areas are used, we need to ensure that data has been correctly transferred to the expected tables within the destination database. Simple volume & value checks may typically suffice, ensuring that the correct volume of sales has entered the system, as well as the correct value of these sales.

Further checks relating to pattern matching to ensure conformity, or referential integrity checks could also be implemented at this stage.
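Those volume & value checks, plus a simple referential check, could be sketched like this, assuming each row is a (sale_id, value) pair — a hypothetical shape for illustration:

```python
def reconcile(source_rows, staged_rows):
    """Volume, value and integrity checks between a source feed and
    its staged copy. Row shape (sale_id, value) is an assumption."""
    checks = {
        # Same number of rows made it through staging.
        "volume": len(source_rows) == len(staged_rows),
        # Same total sale value, allowing for float rounding.
        "value": abs(sum(v for _, v in source_rows)
                     - sum(v for _, v in staged_rows)) < 0.01,
        # Every staged sale id exists in the source feed.
        "integrity": {i for i, _ in staged_rows} <= {i for i, _ in source_rows},
    }
    return checks

source = [(1, 100.0), (2, 250.0), (3, 75.0)]
staged = [(1, 100.0), (2, 250.0)]
print(reconcile(source, staged))
# volume and value both fail: a row was dropped in staging
```

A failure here stops the run before a dropped row silently becomes a missing commission payment.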

Checkpoint 3: Have commission payments been calculated as expected?

This final checkpoint allows us to verify that the commission calculations have been completed as expected, prior to sending payments to employee bank accounts.

For instance, if you know that your sales force consists of 7,659 employees, you should see that 7,659 commission calculations occurred.

Through experience you may know that employees average a payment of £500 per month. If you see employees with a £10,000 payment you may wish to validate this prior to the payment hitting their bank account. Clawing back an incorrect payment could cause a lack of trust in the process.

At this stage you may also want to check eligibility criteria. For instance, has a commission payment been calculated for a part-time employee, or an employee on maternity leave?
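The checks in this checkpoint could be sketched as follows; the employee ids, the headcount, and the 10,000 review cap are illustrative values, not prescriptions:

```python
def validate_payments(payments, headcount, cap=10_000):
    """Pre-payment-run sanity checks.

    `payments` maps employee id -> calculated commission. The headcount
    comparison and large-payment cap mirror the checks described above,
    with illustrative thresholds.
    """
    issues = []
    # One calculation per sales-force employee.
    if len(payments) != headcount:
        issues.append(f"expected {headcount} calculations, got {len(payments)}")
    # Unusually large payments are held for review, not blocked outright.
    for emp, amount in payments.items():
        if amount >= cap:
            issues.append(f"{emp}: {amount:,.2f} exceeds review threshold")
    return issues

issues = validate_payments({"E1": 480.0, "E2": 12_500.0}, headcount=3)
# Two issues: a missing calculation, and one payment flagged for review
```

An eligibility check (part-time staff, maternity leave) would slot in as a further rule over an employee reference feed.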

Beyond the firewall

A data quality firewall can be a good case study to demonstrate the benefit of data quality initiatives within your organisation. This case study could be used as a springboard to gain support & funding to implement similar practice in other parts of the organisation, or to build a case for more strategic, organisation-wide data quality initiatives.

Friday, 22 January 2010

The Great Expectations of BI

I recently read a fantastic article by Jim Harris - "A Tale of Two Q's" on DQ Theory Vs Practice. The article quotes Charles Dickens, which led to my mind wandering off to one of my favourite Charles Dickens books, "Great Expectations". This in turn got me thinking about how the business often have great expectations of BI.

"Now, I return to this young fellow. And the communication I have got to make is, that he has great expectations."

My question is, Why doesn't BI, in many organisations, live up to these expectations?

1. It was oversold

"I had come into great expectations from a mysterious patron."

Often, BI is simply oversold within an organisation. It markets itself as the cure to all business problems. It will allow the business to quickly self-serve and extract good quality information. It will have efficient management of reports and will act as a 'one stop shop' for all information requirements.

However, it often manifests as quite the opposite.

The aim here is to manage expectations. Build steadily, and show how BI can add value within a particular function or process. Once we have success there, extend the progress made, and lessons learnt, into other areas of the business. Too many organisations over-sell what their BI implementation will actually deliver - perhaps to justify the budget spent on technology - and the business naturally gets excited by the prospect it promises.

2. Tools don't do what the business want

"He says, no varnish can hide the grain of the wood; and that the more varnish you put on, the more the grain will express itself."

Although the marketing literature may demonstrate the flexibility and functionality of the tool, this may not always align with the practicalities. The business may not be able to use the tools in the way they wish, or may not even understand how to utilise the tools correctly.

Only by understanding what the business want from a BI tool, prior to its selection, will we ensure that expectations have been both considered and managed.

3. Data Quality issues

"Take nothing on its looks; take everything on evidence. There's no better rule."

No matter how much effort has gone into the presentational side of data, the simple rule is that if you put bad data into your data warehouse, you are going to get bad data out. BI is not going to magically solve data quality issues without serious effort put into quality control and governance.

4. Lack of Governance

"Mrs. Joe was a very clean housekeeper, but had an exquisite art of making her cleanliness more uncomfortable and unacceptable than dirt itself."

BI implementations should result in a centralised store of information - be it OLAP type environments, or reporting portals.

However, no matter how centralised this store of information is, if it's not being used, governed, and managed efficiently it is going to cause confusion. Although it may look clean to the untrained eye, the truth may actually be the opposite.

5. Slow turnaround on Information Demand / Supply

""So!" she said, without being startled or surprised; "the days have worn away, have they?""

BI should result in the ability for the business to self-serve information. However, OLAP cubes, complex reports & dashboards will need to be designed/built by skilled technical developers.

The expectation that BI will allow quick access to information can be hampered by poor turnaround in Information Demand & Supply. Do you have communicated SLAs in place within your Demand/Supply team, so that the business users have an idea of the timescales involved between requirements gathering, translation, build, testing, and go-live?

6. Poor Performance

"For an hour or more, I remained too stunned to think; and it was not until I began to think, that I began fully to know how wrecked I was, and how the ship in which I had sailed was gone to pieces."

If a business user suggests that it takes too long for a report to run, or query times to an OLAP cube are slow, they will often look for other ways to source the data they require. What performance testing and tuning has been conducted to ensure that the performance expectations of the business have been managed?


All quotes are from the Charles Dickens book "Great Expectations".

Thursday, 21 January 2010

The essentials of data mapping

Understanding where data comes from - which systems it goes through, which business rules have been applied, and at which point in its journey - is critical to improving and maintaining data quality and governance. In large organisations data can often pass through multiple systems before it reaches a data mart used by the business.

I've seen a rise in bottom-up data mapping exercises being conducted, particularly within data-intense departmental processes. The thoughts below discuss this type of exercise.

"You don't know where you're going until you know where you've been" (Unknown)

In order to truly understand data, and whether it is fit for our purpose we need to understand where that data has come from. Not only the system it has come from, but also what has happened to this data along the way.

"The cause is hidden, but the result is known" (Ovid)

When the result is a data quality issue, we seek to identify the root cause of this issue. This can often be a time consuming process. However, by understanding where our data has been, and what has happened at each point of our journey, we speed up the analysis process required to identify the root cause.

"The landscape should belong to the people who see it all the time" (Le Roi Jones)

By understanding where data has been we can gain knowledge of the systems it has passed through, and the points where it has been 'touched'. The owners of these systems (or processes) should take ownership and responsibility for the data until they pass it along to the next system.

"I'm tired of chasing people" (Robert F Kennedy)

By defining owners we can reduce the amount of time spent chasing for answers. In many organisations there is a common situation where we have a data issue but don't know who to contact about it. We call a help desk and our issue gets passed around to 5 different people. A week later it's still not resolved.

A data mapping exercise can help improve (proactive) communication in two ways. Firstly end users will understand where their data has come from. Therefore they will know who to contact directly if an issue occurs - the owner we discussed previously. Secondly, if the owner knows who they are passing data to, they can alert that person of any known issues prior to passing the issue on.

Wednesday, 20 January 2010

Questions to measure BI DQ/DM success

Encouraging and reacting to feedback from the business is a great way to measure the success rate of Data Quality & Data Management efforts within your organisation.

The audience to target for feedback should be business users/analysts - those people that consume data to support business decisions on a daily basis.

There are a number of ways you could conduct your feedback gathering:

  1. individual interviews could be held with each selected party
  2. a group session could be held, perhaps a free 'brown bag lunch' to encourage attendance
  3. a web-based survey published via the company intranet
  4. similar to the last point, but an e-mail survey

From experience, the most effective methods I have utilised have been the group session - as not many people can resist the idea of a free lunch - and secondly the web-based survey. Individual interviews often didn't materialise due to calendars moving and BAU activities taking priority, and e-mail surveys were often flagged and forgotten.

High-Level questions to consider

1. Are you happy with the quality of the data that you use for reporting?

The opening questions should seek to understand the general opinion on the quality of data that is used for reporting purposes. The aim here is to benchmark business opinion against quality dimensions such as accuracy, consistency, timeliness etc. Is data easy to use? Is it presented in such a way that allows simple understanding and interpretation?

2. Do you understand the meaning of all business terms and definitions relevant to you?

If there is confusion around business definitions, we may have data quality issues. If a business user suggests that their definition is different from a 'standardised' definition, it may prompt a review of the definition and any reports that contain the term. Does the same definition differ between users? If so, who is using the correct definition for reporting purposes?

Likewise, if the answer to this question is a resounding 'yes', it may suggest that all the hard work that went into driving a single business glossary, and publishing it centrally for the business to reference and understand, has been successful (well done!).

3. Do you think the business terms and definitions you use are consistent in meaning across the organisation?

However, if the answer to the last question was a resounding 'yes', and the answer to this question is a resounding 'no', it's clear that we have a problem. Ask for examples of inconsistencies. This suggests that a revisit to business definition workshops, where the right people are in the same room to come up with agreed definition for terms, and which data sources/reports these apply to, is required.

4. Who do you rely on for data provision?

Do the business solely use known data sources to provide them with the data they need? i.e. the data warehouse. Or, do they also receive data from other sources (internal or external)?

This is a great question to aid in the task of mapping out the complete information landscape of an organisation. Understanding where data comes from, which systems it goes through, and which business rules have been applied at which point in its journey is critical to improving and maintaining data quality and governance.

It will also help us to understand whether, over time, there has been an increasing dependency on 'information silos' to provide data to the business. Information silos are often 'under the desk' affairs created by analysts utilising data from the data warehouse while undertaking further manipulation to the data before passing it on to interested parties.

Why do these silos occur? Perhaps because the data warehouse doesn't provide all the required information, or doesn't provide it in a format fit for the consumption of a business user. Only by monitoring information usage across the organisation will we be able to ensure standardised data governance principles are applied, and look to reduce this silo culture.

5. Who do you supply data or reports to?

Likewise, if business users supply data or reports to other parties (internal or external), we want to know about it.

Why are they supplying data? Is there a reason this data cannot be extracted from source by the recipient? If it's a report, why can't the recipient self-serve? Is it an education issue, or do they not have the correct tools? Or, is source data not fit for purpose, so data is being manipulated prior to being supplied?

If data is being supplied externally, does it comply with legislation such as the Data Protection Act? Is the data sensitive? If so, have data security and encryption methods been taken into consideration?

Only by having knowledge of all supply/demand avenues can we ensure the quality and control of that data.

6. Are you aware of all the data related legislation relevant to your role?

Compliance is playing an increasing role in all business sectors, and it is important that the education exists for business users to understand how data-related legislation impacts upon their role. In the case of Data Protection - are they complying with DP legislation when reporting on data? What do they need to do to ensure SOX compliance?

In Conclusion

Measuring success of BI DQ/DM activities by speaking to business users is critical in order to gain an in depth understanding of the practical issues faced on a day to day basis.

The key activity is to translate and publish the results, laying out a roadmap for how negative feedback will be actioned and resolved. Generally, people like to see the results of what they have contributed to. As well as increasing awareness of DQ/DM matters, it also helps to increase the likelihood of future participation & cooperation.

Perhaps you already publish a Data Quality Dashboard within your organisation, looking at the technical side of how data is performing against set quality standards & expectations.

However, the Chartered Management Institute suggests:

"You need multiple perspectives of performance to manage your business successfully"

I consider the same thing to apply to DQ/DM activities, so perhaps consider something similar to the balanced scorecard approach in order to get a rounded picture of performance. I will touch on my experiences with this approach in a future blog post.

Until then, go speak to the business.

Monday, 18 January 2010

Telecoms, Compliance & DQ Management

This blog post was inspired by a post that I saw today from Ken O'Connor: Achieving Regulatory Compliance - the devil is in the data.

I had actually planned to blog about some of my experience with Compliance in the Telecommunications Industry, and the role that Data Quality Management played in ensuring compliance, so thanks to Ken I started thinking about it a lot on the train home and have come up with my first post on the subject.

The European Union's Data Retention Directive

Or more formally known as "Directive 2006/24/EC of the European Parliament and of the Council of 15 March 2006 on the retention of data generated or processed in connection with the provision of publicly available electronic communications services or of public communications networks and amending Directive 2002/58/EC"

What does it mean to you?

Well, if you live in the European Union it means that telecommunication companies are, by law, required to retain certain data relating to customer usage for a set period of time. From my understanding, this varies depending on which country you live in - Denmark has adopted more extensive data retention policies than required, and Polish law, which came into effect in January 2010, states that Polish telecommunications companies must store data relating to customer activity for 2 years.

This Directive has come under some criticism from privacy groups and security circles; however, I am not a lawyer, so I will talk about my experiences with the directive and how DQ Management techniques were implemented and adhered to in order to ensure full compliance.

Home Office code of practice

Since coming into force on 1st October 2007, the British Home Office code of practice has recommended the following (simplified) data retention guidelines:

  • Subscriber Information should be retained for 12 months (Name, DOB, Address, Telephone Number, IMEI etc.)
  • Telephony Data should be retained for 12 months (Call from, Call To, Date, Duration, Location data etc.)
  • SMS/MMS Data should be retained for 6 months (From, To, Date, Location Data etc.)
  • Web Activity Data should be retained for 4 days (IP Address, URLs visited etc.)
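A sketch of how those retention periods might be encoded for an automated retention check; the day counts are rough approximations of the periods above, and this is illustrative code, not legal guidance:

```python
from datetime import date, timedelta

# Approximate minimum retention periods from the guidelines above.
RETENTION = {
    "subscriber": timedelta(days=365),
    "telephony": timedelta(days=365),
    "sms_mms": timedelta(days=182),   # roughly 6 months
    "web": timedelta(days=4),
}

def past_retention(record_type, captured_on, today=None):
    """True once a record has passed its minimum retention period and
    may become a candidate for deletion review."""
    today = today or date.today()
    return today - captured_on > RETENTION[record_type]

print(past_retention("web", date(2010, 1, 10), today=date(2010, 1, 18)))
# True: web activity data only needs keeping for 4 days
```

Keeping the periods in one table, rather than scattered through purge scripts, makes it easier to adjust when national rules (such as the Polish 2-year requirement) differ from the baseline.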

How Data Quality Management was utilised

In order to be compliant, data must be stored accurately and in a timely manner, from the initial point of capture in source systems, through subsequent transfer and manipulation between systems, to any reporting that is undertaken upon the data.

A risk - a potential event or uncontrolled intervention in the data flow - that may affect the completeness, accuracy, or timeliness of the data must be mitigated by ensuring the correct controls are in place.

  1. Data Quality Profiling is undertaken to ensure completeness and consistency of data values and formats, as well as integrity of relationships between linked fields/records. Null values are checked, dates are subject to input masks and any incomplete/irregular data is sent to 'Suspense' tables where manual review procedures should be in place.

  2. Reconciliations between systems are undertaken (for instance, between Mediation & Interconnect). Within a telecoms organisation this may be the job of specialist Revenue Assurance teams, but support should be given by data quality specialists to ensure consistency, and communication.

    (In telecoms, 'Mediation' is the platform where Call Detail Records (CDRs) are collected from the telephone switch. Mediation then distributes this information to the relevant downstream systems, such as Billing or Interconnect, which holds data relating to the interconnection and exchange of traffic between different telecommunications networks.)

  3. Information Chain Mapping should be documented to ensure that information flow between systems, from source to target, is fully understood, including any modifications, changes in datatype formats, and, if relevant, any further rules applied to the data.

  4. Security & access control policies are defined and adhered to (Data Management teams in liaison with Risk/Security teams). Due to the nature of the systems involved, access should be limited to the service team responsible for operating the systems, and granted to development teams only in an emergency. This will ensure that the risk of manual data loss/manipulation is minimised.

  5. If data files are moved from system to system by FTP or disk, file checksum comparisons should be undertaken to ensure file integrity and completeness.
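A file checksum comparison of this kind might look like the following sketch, using Python's standard `hashlib` library. Reading in chunks keeps memory use low for large CDR files:

```python
import hashlib

def file_checksum(path, algorithm="sha256", chunk_size=65536):
    """Compute a checksum of a file, reading in chunks to handle large files."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def files_match(sent_path, received_path):
    """Compare checksums before and after an FTP/disk transfer."""
    return file_checksum(sent_path) == file_checksum(received_path)
```

A mismatch would indicate a truncated or corrupted transfer, to be logged and re-run.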

  6. Finally, in order to allow us to measure effectiveness, we should monitor the performance of our quality checks. If records were sent to a 'suspense' table, what was the reason? How were they resolved? These types of issues should be logged, resolved, and reported in a similar way to how you would manage other data quality issues within your organisation.

In Conclusion

Helping an organisation to be compliant is an area in which Data Quality professionals and teams can add huge value. By applying Data Quality Management Principles to Data Retention Compliance, you can raise the profile of both your team and of the need to think about Data Quality, and perhaps use this as a springboard to gain sponsorship for further initiatives within your organisation.

Sunday, 17 January 2010

What makes a BI reporting tool great?

I've seen numerous forum posts on places like LinkedIn where people have been asking questions like "what is the best BI reporting tool on the market?"

Quick answer: it doesn't really matter!

The technically-minded (and vendors) may now be spitting out their coffee and looking at the screen in disgust at my lack of regard for 64-bit processing, deployment flexibility and single architecture. Don't get me wrong, these are good features; however, in my mind, what makes a BI reporting tool a great business tool is:

1. the process that surrounds the tool, and
2. the people involved in making this process successful

I've come across organisations that have fantastic BI technology, with all the add-on components money can buy, but it sits there largely unused by the business community because:

1. they don't know how to use it
2. it doesn't do what they want it to do, or
3. they don't trust the data

In order to ensure that your BI implementation does not go down the same well-travelled path, here are some critical success factors to think about:

1. Listen to the business

I cannot stress enough how important it is to listen to the business, and how often it is overlooked in favour of simply migrating current data environments to map straight to the new BI reporting tool. You know what? Yep, you guessed it. New technology, same old problems.

The business hold the key to a successful business intelligence implementation. The business are the customer, and they know what they want to achieve from the implementation.

Back in the early 2000s, while I was working as a credit risk analyst for a mobile telecommunications billing company, the company had invested a massive £5 million into building a new data warehouse and reporting environment. Upon go-live it was clear that the project had failed. One of the 3 major reasons for failure was that the business felt the implementation didn't deliver what they wanted, because the business had not been consulted in enough detail. The project team simply misunderstood the business requirements. (The other major reasons were poor data quality and poor query response times.)

Last year I conducted a BI capability exercise as a consultant for another major telecommunications company. We were sponsored by the CFO to interview business users to ascertain what the business wanted from a BI tool prior to investment in new enterprise-wide BI technology.

We arranged focus sessions with a large sample of employees, ranging from technical analysts, to product managers and heads of departments from across the business in order to get a representative view of how different people extract and consume data. We spoke to them about:
  • what they wanted to report on
  • how they wanted to use the tool
  • how they wanted information presented
  • how they wanted information delivered
  • concerns and issues with current BI landscape
  • what they would want in an ideal world
This information is crucial to ensure that you can deliver a strategic BI Implementation which the business will embrace, support and utilise.

2. Ensure that there is correct metadata attached to your data

If the business don't understand the data, they are not going to use it, or they are going to use it incorrectly. Inaccurate metadata, or worse, a total lack of metadata, is a key contributor to poor data quality. Poor data quality is a key contributor to misinformation. Misinformation leads to poor decisions, and poor decisions can impact the bottom line.

You wouldn't attempt to speak a foreign language without first consulting a dictionary, so why should interpreting business data be any different?

What does good metadata look like for a business user?
  • A meaningful name for the data (i.e. the report name, or OLAP cube name)
  • Defined business terms for each field within the data
  • Information relating to the data owner (who owns the report, or the cube)
  • Business rules and any criteria applied to the data, clearly defined
Metadata should be stored in such a way that it is easy to access and view: perhaps kept as a second tab of a report, on a Wiki, or even in a centrally-stored spreadsheet. It should allow a business user to understand exactly what the data is trying to tell us, and in what context.

3. Ensure effective information demand & supply processes exist

When I say 'Information Demand' I am talking about the process by which a business user would request information. This process should be supported by a strong team of analysts who can translate and prioritise business requirements into reports, OLAP cubes etc., ensuring they capture things such as:
  • identification of metrics/dimensions required
  • business rules to be applied
  • presentation (fixed format, dashboard, cube)
  • delivery (frequency, method)
  • known issues (data quality, lack of robust data feed)
As well as being able to understand the business requirements, the analysts should also understand the technical capabilities of the enterprise architecture to aid the delivery of the correct data, defined in the correct way, first time.

When I say 'Information Supply' I am talking about the process of building the report or the cube from the business requirements ascertained by the analyst.

A large problem with a number of BI Implementations is that business users state that "it takes too long to get new data" and that "BI report development takes too long, so I look around for another method of getting the data I need".

Effective Information demand & supply processes - involving strong communication to bridge any gap between the business and IT - are required to maintain a good level of service to the customer, and ensure that these statements do not apply to your BI Implementation.


I'm going to end this post here without touching upon the need for data quality profiling before embarking upon a BI project, because you undertake data profiling as standard for all data-related implementations and migrations... right?

Thursday, 14 January 2010

How to ensure efficient use of a reporting portal

Trying to name this blog presented me with a challenge. Upon registering with Blogspot you need to give your blog a web address. I endured 12 unsuccessful attempts to move into the world before I finally settled into my new home.

During my 12 unsuccessful attempts, only 1 of the blogs I came across was active with a reasonable amount of content. The rest were either blogs that hadn't been updated since 2006, or blogs with 1 or 2 short posts, untouched since the day of their creation.

This got me thinking about Business Information, and in particular about Reports that are published to some kind of Portal. The majority of organisations I have worked with tend to utilise some form of centralised portal that business reports are published to. This could either be a web portal on the company intranet, or a self-service portal that forms part of an enterprise-wide BI tool, such as Business Objects or MicroStrategy.

These portals are an essential component of the BI Landscape, and a great information delivery method to allow the business to serve itself with accurate, timely, insightful information, without the need for further IT intervention, delays, or confusion.

However, what I have seen in many organisations is that, due to rapid growth and a lack of governance, many of these portals have become free-for-all dumping grounds, with little structure or consideration as to whether reports have been validated to contain accurate, timely information. They may contain duplicate versions of the same reports, test reports, or even reports pointing at redundant data sets.

How can the business users know that the report they are viewing is accurate, and contains valid information? Often, they can't.

This can result in a lack of confidence among business users.

How can we, as BI / Data Management practitioners, ensure that the portal method of information delivery breeds confidence and remains an efficient delivery method of information to business?

1. Monitor 'last refreshed' or 'last accessed' date

Most portals will give you the ability to look at dates that reports were last refreshed, or even when they were last accessed. This is beneficial in a number of ways.

Firstly, in the case of 'last refreshed' dates, the team (or person) responsible for the portal should look at introducing proactive monitoring of reports to ensure reports are running as and when expected. In parallel, you should be educating business users to look at the date, and if a report is supposed to be daily yet hasn't been refreshed for 3 days, provide them with an e-mail address or telephone number to raise an issue, perhaps a BI helpdesk that will look to resolve it. It would be worth introducing SLAs (Service Level Agreements) to aid business users. For example, communicate that 'all daily reports will be refreshed by 9am, and all weekly reports will be refreshed by 9am Monday morning'. Communication of any delivery delay or failure is essential in gaining business trust and confidence.
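Such proactive refresh monitoring could be sketched as follows; the report structure and SLA windows are assumptions for illustration:

```python
from datetime import datetime, timedelta

# Hypothetical SLA windows per refresh frequency (assumptions for the sketch).
SLA_WINDOWS = {"daily": timedelta(days=1), "weekly": timedelta(weeks=1)}

def overdue_reports(reports, now=None):
    """Return the names of reports that have missed their refresh SLA."""
    now = now or datetime.now()
    return [
        r["name"] for r in reports
        if now - r["last_refreshed"] > SLA_WINDOWS[r["frequency"]]
    ]
```

The resulting list would drive both the helpdesk alerting and the communication back to business users.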

Secondly, a pragmatic exercise I have previously undertaken is to look at all reports on the portal and temporarily remove any that have not been accessed within the past 3 months. These reports are fenced off into a temporary storage area for a further 3 months, and if no business user complains that their report is missing, the report is then permanently deleted. Due to the ever-changing nature of business, reports are often created for a specific purpose, used heavily for a period of time, and then no longer deemed necessary, or are superseded by a new report. Often these old reports remain on the portal, taking up space and in some cases confusing users.
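This archival exercise could be sketched like so (the 3-month window is approximated as 90 days, and the report structure is an assumption):

```python
from datetime import datetime, timedelta

def partition_by_last_access(reports, months=3, now=None):
    """Split reports into (active, removal_candidates) by last-access date.

    Months are approximated as 30-day blocks; the 3-month default reflects
    the pragmatic threshold described above, not a universal rule.
    """
    now = now or datetime.now()
    cutoff = now - timedelta(days=30 * months)
    active = [r for r in reports if r["last_accessed"] >= cutoff]
    stale = [r for r in reports if r["last_accessed"] < cutoff]
    return active, stale
```

The stale list would be fenced off rather than deleted immediately, giving users a chance to reclaim a report.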

2. Introduce increased governance

The portal may have started with the best intentions in terms of governance, with the initial batch of published reports having defined owners, purpose, scheduling, definitions etc. However, as the portal grew, this governance benchmark may not have been maintained, which is why business users are confused and test reports are being published to the portal.

An exercise of re-introducing governance standards to the portal, together with education on the standards that future reports must meet before being published, will help ease business concerns such as the questions I highlighted in the figure above.

BI analysts should work with business users to ensure governance standards are maintained. For reports, I identified 4 streams that should be addressed in order to ensure effective governance and business understanding.

  • Purpose
    - Identify why report is needed (operational, KPI, KQI)
    - Identify what the business need is
    - Identify how it will be used

  • Ownership
    - Who will own the report?
    - Who is the primary contact?
    - Are they also responsible for potential report issues?

  • Classification
    - Is the report sensitive?
    - Does it require special permissions to view?
    - Is a confidentiality agreement needed?

  • Definition
    - What do the fields on the report mean?
    - Who has aided in defining the fields?
    - Has definition been signed-off as correct?

These 4 streams form the basis of a 'Standards in Report Creation & Publishing' document that could be introduced, and communicated across the BI/Analyst/Business community.

3. Introduce Kite marking

A further step into the world of report governance could be taken in the form of kite marking reports. Kite marking is the process of stamping a report as a recognised source of accurate and approved data. A report would only achieve the kite mark once it satisfied the standards defined in point 2 above; once it is kite marked, business users would know that the report contains information they can be confident will support their business decisions.

4. Allow search of reports by keywords

The majority of portals allow users to search based upon keywords. The reports on the portal should be named, or contain 'meta tags' for searching, so that business users can easily find what they are looking for. Often business users will request a new report to be designed when an existing report on the portal would provide them with the same information; they simply did not know it existed. Educate business users to proactively search for information on the portal prior to raising a request for a new report.
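A simple keyword search over report names and meta tags might look like this sketch (the report structure and tag names are assumptions):

```python
def search_reports(reports, keywords):
    """Return names of reports whose name or meta tags match any keyword
    (case-insensitive substring match)."""
    keywords = {k.lower() for k in keywords}
    hits = []
    for r in reports:
        haystack = {r["name"].lower(), *(t.lower() for t in r.get("tags", []))}
        if any(k in term for k in keywords for term in haystack):
            hits.append(r["name"])
    return hits
```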

Ensure that metadata exists that allows a business user to easily understand, in business terms, what the report shows, and how it should be used. Following on from this, if they have any questions or concerns, you should empower them to communicate with the owners of existing reports by ensuring that contact information for report owners is clearly displayed. Report owners are the 'subject matter experts' for that particular report, and we should ensure a culture exists that enables business users to utilise these subject matter experts to increase knowledge and confidence in information.

5. Ensure accurate metadata is in place

As mentioned in the previous point, it is critical that accurate and concise metadata exists for each report. This will allow a business user to easily understand what a report is showing, and how it should be used.

When I talk about metadata, I am specifically referring to metadata from a business perspective, such as:
  • A meaningful name of the report
  • Defined business terms for each field of a report
  • Information related to report owner
  • Business rules and any criteria applied to the report are clearly defined
These forms of metadata, where applicable, should be captured either on the report, or within a separate business glossary.

Due to the ever-changing nature of business, ensuring accurate metadata can be harder than it seems. Over time definitions and business rules can change. It is therefore essential that a regular routine exercise of maintaining metadata is undertaken. A report with out of date definitions or business rules could lead to data quality issues and poor decisions.

6. Introduce structure to portal

Structure is essential to providing an efficient portal to business users. There is nothing worse than having to scroll through a list of 400 reports to find the report you're looking for.

There are a number of different ways that a portal could be structured to improve efficiency for business users. For instance:
  • Reports could be stored by subject area, such as 'finance', 'sales', 'supply chain'
  • Reports could be stored in a 'daily', 'weekly' or 'monthly' directory structure, depending on how often they have been designed to be refreshed.
  • A combination of both
Think of the portal as a bookshelf, or a library, with the aim of enabling the business user to find the correct information they require in a timely manner.
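The bookshelf idea can be sketched as a simple grouping of reports by subject area and refresh frequency (field names are assumptions for illustration):

```python
from collections import defaultdict

def shelve_reports(reports):
    """Group reports into a subject-area / frequency hierarchy,
    like shelves in a library."""
    shelves = defaultdict(lambda: defaultdict(list))
    for r in reports:
        shelves[r["subject"]][r["frequency"]].append(r["name"])
    return shelves
```

A combined structure like this lets a user browse first by subject area, then by refresh frequency, rather than scrolling a flat list of hundreds of reports.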