The Management of Oil Industry Exploration & Production Data


by Steve Hawtin
21 Jul 2015

The dragon of data quality

Last week I suggested that it is important to keep both a readable version and an executable version of all of your quality tests. Those, like me, who have a naturally cynical bias will have spotted that I was a bit vague about exactly what sorts of things a "quality test" should contain. Partially this was because the expert users should be allowed to dictate what form the intent statement takes. If they want to incorporate new ideas they should be allowed to do so; as data handling experts, our role is to ensure we understand (and, of course, document) any novel concepts employed. Suppose an expert user wishes to add a test stating "all toves that are brillig should have a gimble value of more than 3". That English sentence might be completely clear to the right type of practitioner, but the rest of us will need a bit of explanation. As I said, the need to allow domain experts the final say is absolute, but on its own it is a feeble excuse for vagueness. As data handling experts we should be able to articulate the majority of the concepts that are encountered in lists of quality tests before interviewing our first technical data expert.
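To make the "readable plus executable" idea concrete, here is a minimal sketch in Python of one way to keep the two versions side by side. Everything in it is an assumption for illustration: the field names "state" and "gimble" stand in for whatever the domain expert tells us "brillig" and "gimble" actually mean in the data model.

from dataclasses import dataclass
from typing import Callable

@dataclass
class QualityTest:
    """A quality test keeps its human-readable intent next to its executable check."""
    intent: str                    # the statement exactly as agreed with the expert
    check: Callable[[dict], bool]  # executable version, applied to a single record

# The expert's rule, verbatim, plus our executable reading of it.
brillig_gimble = QualityTest(
    intent="All toves that are brillig should have a gimble value of more than 3",
    check=lambda tove: tove.get("state") != "brillig" or tove.get("gimble", 0) > 3,
)

# Running the executable version over a batch of (hypothetical) tove records:
toves = [
    {"name": "T-1", "state": "brillig", "gimble": 5},  # passes
    {"name": "T-2", "state": "brillig", "gimble": 2},  # fails
    {"name": "T-3", "state": "mimsy",   "gimble": 1},  # passes (rule does not apply)
]
failures = [t["name"] for t in toves if not brillig_gimble.check(t)]
print(failures)  # ['T-2']

The point of the pairing is that when the expert refines the English sentence, the intent and the check are stored together, so the two cannot drift apart unnoticed.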

[Figure: The Jabberwocky, illustrated by John Tenniel]

So, what would I suggest needs clarification before starting a "Data Quality" initiative? The requirement to agree the "dimensions of quality" is an obvious starting point: completeness, uniqueness, consistency, timeliness, measurement accuracy and reasonableness, for example, would be a realistic list of dimensions to focus on. The exact list can be debated, as long as everyone has agreed it before we start writing rules.

We also need some concept of the different data groups, both so we agree where to place the relevant quality tests and what to call the business objects. It's no good striving to detail what the names of wellbores should look like if we haven't first agreed what a "wellbore" is. We need a rough idea of what the data is actually for, that is, the business reason for keeping it. After all, the goal here is to have data that is "fit for purpose"; spending extra time or resources refining values beyond what that purpose demands is a waste.

Understanding exactly where the data is to be found is important too: it's hard to execute tests if we don't have a path to what is being tested. There's also the concept of the "role" that each data location plays. Some data may be scratch information, used for trying out ideas, while other data may be the "official", "approved", "corporate" version and hence subject to more restrictions (and, one hopes, more trustworthy as a result). Having an agreed list of "data location types" is an essential step in controlling quality.

Once we've found the data, and we know how much we should be able to rely on it, we can look at its structure (which elements are present, what attributes they have and so on). One key type of test checks conformance with predefined lists of valid values (usually called "reference lists"). Finally there is the concept of the "profile", some kind of description of different "ways of working": perhaps asset teams use different software, or national affiliates need to employ distinct business processes to satisfy their regulators. Often there will be a "corporate profile" that sets quality tests across the whole organisation, with additional supplementary rules that apply in particular locations; the sketch below illustrates the last two of these ideas.
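As a rough illustration, the following sketch pairs a reference-list conformance check with a two-layer profile. The reference values, attribute names and rules are all invented for the example; a real corporate profile would be agreed with the business, not hard-coded like this.

# A reference-list test (consistency dimension) plus layered profiles.
CORPORATE_WELLBORE_STATUS = {"planned", "drilling", "suspended", "abandoned"}

def conforms_to_reference(value: str, reference: set) -> bool:
    """Consistency-dimension test: the value must appear in the agreed reference list."""
    return value in reference

# Profiles: the corporate profile applies everywhere; a national affiliate
# adds supplementary rules without removing the corporate ones.
corporate_profile = {
    "wellbore.status": [lambda v: conforms_to_reference(v, CORPORATE_WELLBORE_STATUS)],
}
affiliate_profile = {
    # a hypothetical local restriction imposed by a regulator
    "wellbore.status": [lambda v: v != "suspended"],
}

def active_rules(attribute: str) -> list:
    """The rules for an attribute are the corporate ones plus any local additions."""
    return corporate_profile.get(attribute, []) + affiliate_profile.get(attribute, [])

value = "suspended"
results = [rule(value) for rule in active_rules("wellbore.status")]
print(results)  # [True, False]: passes the corporate list, fails the local rule

The design choice worth noting is that local profiles only ever add rules; they never relax the corporate baseline, which keeps the "official" version of the data subject to at least the organisation-wide tests.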

You need quite a bit of preparation to be properly equipped to face the dragon of data quality.

