Similar to my initial thoughts on my Spreadsheet Nation post, I jumped into this topic of Data Quality without really testing the waters. In this case, I thought I could just jump in, rhyme off some platitudes about Garbage In, Garbage Out (GIGO), and go on my merry way. Instead, what opened up to me was a vast sea and I was a fish out of water. I was standing on shore clueless about what lay beneath the surface.

Malaspina Strait, British Columbia, Canada
Data Quality really is one of those topics that tends to lurk under the surface – elusive to capture. We are talking about “the state of completeness, validity, consistency, timeliness, and accuracy that makes data appropriate for a specific use” (definition courtesy of the Government of British Columbia). Or if you prefer, there’s the Dragnet definition: “Just the facts”. For accountants, we are talking about all that stuff we enter into our systems (or gets generated by other systems) that we need to access later for producing reports and analysis. Data Quality refers to how effectively we can gain access to and generate meaning from these volumes.
A great deal of energy tends to go into our design of ways for inputting data. How much thought has gone into the processes designed for getting the data back out?
According to IDC, a leading technology research firm, very few companies have systems in place to make use of their data, and [they] often struggle to classify data in order to find it again. There’s a great quote on the V3 blog from Benjamin Woo at IDC:
“The key is to take the data and make money from it”
I think that this frames the issue in language we can understand. We incur costs for gathering, processing, and storing data. We may even incur further costs cleansing, reworking, and managing the stores of data. What does the data do for us? Are we developing an asset that creates future value? Or, are we plugging an expense?
As I alluded to with my “fish out of water” comment, the answers to these questions are deeper than can be fathomed in this brief forum. Today, I would like to simply skip a stone across the surface from the safety of shore.
An introduction to the formal world of Data Quality is the real goal for this post. I’m not the expert. These guys are the experts (a couple of them anyways):
- TDWI: The Data Warehousing Institute is where business and technical professionals come together to gain knowledge and skills through education and research programs relating to the Business Intelligence and Data Warehousing Industry. These guys are leaders in the field and have a ton of resources you may find useful.
- IAIDQ: The International Association for Information and Data Quality is a not-for-profit, vendor-neutral professional society of people passionate about improving information and data quality. They have a fantastic glossary of terms you may find very useful!
These two groups provide a jumping-off point. I don’t think, as accountants, we can be expected to become Data Quality experts. The constraints of time and inclination stack up against it, as well they should. But I do think that it’s in our best interest to familiarize ourselves with their world a bit so we can speak intelligently about these matters and gain some measure of insight that can help produce more value from the data we compile.
One quick example
Domain Value Redundancy: A dysfunctional characteristic of an attribute or field in which the same fact of information is represented by more than one value. For example, unit of measure code having domain values of “doz,” “dz,” and “12” may all represent the fact that the unit of measure is “one dozen.” (Larry English)
What input fields in your systems give the user discretion with respect to the input values?
I live in Mount Pleasant.
I live in Mt. Pleasant.
I live in Mt Pleasant.
You can see how, once these various spellings get into the database, it becomes much more difficult to generate aggregate data without going in and mucking around. Getting it right the first time is a key issue, but that’s a topic for another post.
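To make that concrete, here is a minimal sketch in Python of what the "mucking around" might look like after the fact. The abbreviation lookup and the `canonical_place` helper are my own illustrative assumptions, not a feature of any particular accounting system, and a real cleanup would need rules suited to its own data.

```python
import re
from collections import Counter

# Hypothetical lookup of abbreviations to their full form (an assumption
# for this example; a real system would need its own list).
ABBREVIATIONS = {"mt": "mount"}


def canonical_place(raw: str) -> str:
    """Drop periods, lower-case, expand known abbreviations, then title-case."""
    words = re.sub(r"\.", "", raw).lower().split()
    expanded = [ABBREVIATIONS.get(word, word) for word in words]
    return " ".join(expanded).title()


# The three spellings from the example above, as they might sit in a database.
entries = ["Mount Pleasant", "Mt. Pleasant", "Mt Pleasant"]

# Counting the raw values treats each spelling as a different place.
raw_counts = Counter(entries)

# Counting the cleaned values groups them under a single name.
clean_counts = Counter(canonical_place(entry) for entry in entries)

print(len(raw_counts), "distinct raw values")
print(len(clean_counts), "distinct cleaned values")
```

Before cleaning, a count by neighbourhood sees three separate places with one record each. After cleaning, it sees one place with three records, which is the aggregate we actually wanted.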
Parting Shot
Here are some fun facts to leave you with today, just to give you an idea about the nature of the Data Quality professional:
- Wednesday, November 11, 2009 was World Quality Day. I bet you didn’t know that. World Quality Day was established by the United Nations in 1990 to raise awareness of how quality approaches can have a tangible effect on business success.
- Right now, the Data Quality community is engaged in a “Blogging Olympics” dubbed “Three Single Versions of a Shared Version of the Truth”. My favourite post so far is the one arguing that the “single version of the truth” mindset is inherently flawed and should be considered the “one lie strategy”.
These are the guys we need to engage. Enjoy!

November 18, 2009 at 5:38 am, Jim Harris said:
Nice post Geoff,
“Asset or expense?” is a common concern and an important question for data quality.
You are right to say that we cannot expect everyone to become data quality experts, especially with the constraints of time and inclination. And not only is it good for everyone to familiarize themselves with the world of data quality, but we data quality practitioners need to do a better job of sharing the basics as well as learning more about your worlds. Producing more value from data takes a collaborative effort from all involved.
The resources that you listed are excellent. I would add Data Quality Pro to the list:
http://www.dataqualitypro.com/
It is the leading data quality online magazine and independent community resource dedicated to helping data quality professionals and all those interested in learning more about data quality.
Best Regards,
Jim
P.S. Thanks for the mentions and the links
November 18, 2009 at 5:57 am, Geoff Devereux said:
Wow! Fantastic comment Jim, thanks!
Perhaps accounting and data quality are heading for a meeting of the minds.
To our readers, listen to Jim, he knows what he's talking about.
You can find more of his insights at: http://www.ocdqblog.com/about-ocdq/
OCDQ = Obsessive Compulsive Data Quality
Thanks again, Geoff
November 18, 2009 at 3:41 pm, Dylan Jones said:
Hi Geoff,
Great post, and I see Jim has beaten me to be the first commenter (as usual!), so thanks for sharing the link to Data Quality Pro, Jim. I really encourage everyone to begin with Jim's http://www.ocdqblog.com as there are excellent, practical articles to be found there.
I think it's fantastic that you've started a dialogue on this topic and expressed some of the issues from your side of the fence and in terms your readers will understand and value.
Totally agree with Jim's comments, we all have to do far more to simplify the best-practices in our discipline so that they become accessible to all.
The reality is that the rules of data quality are quite simple. They often appear far more complicated and long-winded when they find their way into books and academic papers.
If you take your case above of "Domain Value Redundancy", once you've listed some examples, it is really clear how this rule can be violated and the sorts of impacts that can occur.
So, I've taken an action away to publish a series of simple examples of what data quality means for the newcomer, huge thanks for being the catalyst for this and let's keep the dialogue going.
Best regards
Dylan Jones
Editor – Data Quality Pro
November 19, 2009 at 5:28 am, Geoff Devereux said:
Thanks for the positive reinforcement and the comment!
There is a ton to talk about in this area and translating for the end-user will be critical.
Cheers
November 18, 2009 at 6:18 pm, Charles Blyth said:
Great post Geoff,
I came into the Data Governance world from Accounting, having become so very frustrated with the quality of the data that I had to rely on. The great old adage of "If you want something done right, do it yourself!"
The concept of 'making money from your data' is a key factor in the success of any Data Governance program. It is common practice today to speak of data as an asset in your organisation; businesses now need to take that further and start using that asset to create wealth rather than support wealth creation.
Thanks for the link to our Olympic blog bout, even though you voted for the wrong person.
Cheers
Charles
November 19, 2009 at 5:24 am, Geoff Devereux said:
Sorry I couldn't vote for your post. I was drawn in by the "one lie" stance. The real surprise for me was that the discussion (whether a single version of the truth is possible) was actually taking place. I hadn't heard the argument before, so it's interesting.
There's definitely some simpatico though now that I know you have the accounting background. I find, as accountants, we deal in data all the time, but don't always recognize it.
Thanks for the comment!
November 18, 2009 at 7:21 pm, John Owens said:
Thank you Geoff
Very nicely put with some apt examples.
A major problem is that businesses do not know what data they should capture and why. So fundamental and yet so misunderstood.
This is because data analysis and modeling is most often done in isolation from business modeling. The key business model – the Function Modeling – will enable a business to know exactly what data to gather, what the structure of this data will be, why it is being captured and how it will be used.
For more on Function Modeling go to http://www.integrated-modeling-method.com
Would also recommend http://www.dataqualitypro.com/
I would argue that there is no need for every business person to become a data quality expert in order to expect high quality information from system providers. They would not expect to become telecoms experts before they can be provided with a quality phone system.
They might, perhaps, learn some of the basics to avoid getting bamboozled by jargon. Hint: If they do learn some basics they really must avoid getting into "jargon fests" with others. Otherwise, they will all be speaking a foreign language to each other that none of them understand, yet all convinced that they are fluent!
Remember, if the "experts" cannot describe it in plain English, they will be unable to deliver it!
John Owens
November 19, 2009 at 5:19 am, Geoff Devereux said:
Jargon Fests, I love it. I think we're all guilty of that at times. You might like my Language Barriers post.
Thanks for the comment!