In the last edition of Technology Update, we commenced a series of articles on Data Analytics and outlined how big data can be used to identify areas of cost reduction or productivity gains. Our article ‘Collaboration through data’ outlined how data is used by various groups within a Resource operation. We were surprised to find in our industry survey that each group operates and maintains its own system, which leads to inefficiencies when endeavouring to solve production issues.
All this data but what to do with it?
We learnt that collaboration was limited; however, collaboration through data is only one part of the toolset required to draw conclusions for measured business decisions. To get the most out of the data collected, it must be ‘classified’ so as to further enhance data analytics … vital analytics that can drive business intelligence. This article focuses on data classification as an additional tool to help increase operational efficiencies.
There’s data … and then there’s DATA
Any statistician will tell you that there are only two types of data: 1) categorical; and 2) numerical. This is important because it determines the level of detail that can be applied to a measurement. For example, categorical data, such as eye colour and gender, can only be sorted into groups. The only measurement detail that can be given to gender, for example, is the category itself (male or female) and the number in each. Nothing else can be quantified. On the other hand, numerical data can be sliced and diced into very specific quanta of information. For example, age and height are types of numerical data, and when put together with a gender dataset they can be used to infer the average weight of a population. The same logic can apply to Resource data. Combining categorical data (eg pump type) with numerical data (eg RPM) can be used to infer other information, such as performance. Gradually a picture (population) can be constructed, and within that population a ‘frame’ can be formed to infer a statistic (or statistics) about the population, like pump life. This is called inferential statistics, which leads to predictive analysis.
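As a rough illustration of combining the two data types, the sketch below groups a numerical reading (RPM) by a categorical label (pump type) and averages each group. The pump types, the RPM values and the function name `mean_by_category` are all made up for the example; they are not taken from any real site.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical readings: (pump type, RPM). Values are illustrative only.
readings = [
    ("liquid ring", 1450),
    ("liquid ring", 1470),
    ("rotary vane", 1750),
    ("rotary vane", 1730),
    ("liquid ring", 1460),
]


def mean_by_category(rows):
    """Group a numerical value by its categorical label and average each group."""
    groups = defaultdict(list)
    for category, value in rows:
        groups[category].append(value)
    return {category: mean(values) for category, values in groups.items()}


rpm_by_type = mean_by_category(readings)
for pump_type, average_rpm in rpm_by_type.items():
    print(f"{pump_type}: average {average_rpm} RPM")
```

The categorical field can only define the groups; all of the measurement detail comes from the numerical field inside each group.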
Classifying data in a Resources context
Classifying data provides the end user with an additional tool set so that richer insights can be obtained about a piece of equipment, a process or a service. Since data comes in a few flavours and from many sources, a clear strategy is required to classify data so that it can be incorporated into a business analytics model. There is a saying that goes, ‘let the data guide your thinking and not your thinking guide the data’; but there is a caveat. To get the right information out of data, the data must be classified correctly, which can then lead to a clear understanding of the situation. And, of course, you need someone with data analytics skills who can translate the findings into everyday language for the business. Understanding where the data comes from gives an indication of how to use and display it. Figure 1 shows some data types and sources.
Now that we know what kind of data is available to collect, we can start to put it to better use. Here is a typical industry scenario to set the scene. In one database, a company collects information about a Resource collection site, such as when it was built, how long it has been in operation and the equipment at the site. In a completely separate database, information is collected about the equipment, such as the number of faults, type of faults, run time and so on. In another database, information is collected about gas flow, pressure, temperature etc. Now suppose this information were correlated (in reality it is not): what additional information could be obtained about the equipment, process or service? Let’s say a gas Wellhead was installed on 15/8/2013 and has been operational for 9,720 hours (405 days). During that time it has been maintained twice, and the rotor in the acme liquid ring vacuum pump was replaced twice. Over the life of the Wellhead it produced 480 TJ (about 1.2 TJ per day), with a loss of 10 days of production (12 TJ).
Applying the concepts
In the above scenario the Wellhead data would sit across three databases:
- Asset – categorical;
- Gas flow – numerical; and
- Maintenance – categorical and numerical.
Now let’s ask some questions.
- How many times has the acme liquid ring vacuum pump been replaced at other Wellheads?
- What has been the production loss at other Wellheads with the same acme liquid ring vacuum pump?
- What is the average TJ per day at other Well sites with similar pressure that use the acme liquid ring vacuum pump?
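Once the three sources share a common key, the questions above become simple queries. The following is a minimal sketch only, using an in-memory SQLite database. The table and column names, the Wellhead IDs (W1 to W4), the pressure unit (kPa), the tolerance used for ‘similar pressure’ and every figure for the other Wellheads are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical schema: one table per source system, keyed by Wellhead ID.
    CREATE TABLE asset (wellhead_id TEXT PRIMARY KEY, pump_model TEXT);
    CREATE TABLE gas_flow (wellhead_id TEXT, avg_tj_per_day REAL,
                           pressure_kpa REAL, lost_tj REAL);
    CREATE TABLE maintenance (wellhead_id TEXT, component TEXT,
                              replacements INTEGER);
""")

# Illustrative rows only. W1 stands in for the Wellhead in the scenario.
conn.executemany("INSERT INTO asset VALUES (?, ?)", [
    ("W1", "Acme LRVP"), ("W2", "Acme LRVP"),
    ("W3", "Other pump"), ("W4", "Acme LRVP"),
])
conn.executemany("INSERT INTO gas_flow VALUES (?, ?, ?, ?)", [
    ("W1", 1.2, 500.0, 12.0), ("W2", 1.0, 510.0, 8.0),
    ("W3", 1.5, 505.0, 0.0), ("W4", 2.0, 900.0, 5.0),
])
conn.executemany("INSERT INTO maintenance VALUES (?, ?, ?)", [
    ("W1", "rotor", 2), ("W2", "rotor", 3),
    ("W3", "rotor", 0), ("W4", "rotor", 1),
])

PUMP, SITE = "Acme LRVP", "W1"
SITE_PRESSURE, TOLERANCE = 500.0, 25.0  # assumed meaning of 'similar pressure'

# Question 1: replacements at other Wellheads with the same pump.
other_replacements = conn.execute(
    """SELECT SUM(m.replacements) FROM asset a
       JOIN maintenance m ON m.wellhead_id = a.wellhead_id
       WHERE a.pump_model = ? AND a.wellhead_id != ?""",
    (PUMP, SITE),
).fetchone()[0]

# Question 2: production loss at other Wellheads with the same pump.
other_lost_tj = conn.execute(
    """SELECT SUM(g.lost_tj) FROM asset a
       JOIN gas_flow g ON g.wellhead_id = a.wellhead_id
       WHERE a.pump_model = ? AND a.wellhead_id != ?""",
    (PUMP, SITE),
).fetchone()[0]

# Question 3: average TJ per day at other sites with similar pressure
# and the same pump.
similar_avg_tj = conn.execute(
    """SELECT AVG(g.avg_tj_per_day) FROM asset a
       JOIN gas_flow g ON g.wellhead_id = a.wellhead_id
       WHERE a.pump_model = ? AND a.wellhead_id != ?
         AND ABS(g.pressure_kpa - ?) <= ?""",
    (PUMP, SITE, SITE_PRESSURE, TOLERANCE),
).fetchone()[0]

print(other_replacements, other_lost_tj, similar_avg_tj)
```

Each query filters on a categorical field (pump model) and summarises a numerical one, which is why both kinds of data need to be classified and linked before the questions can be answered.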
Knowing where the data comes from, classifying it and statistically combining it will highlight any abnormalities within the Wellhead population.
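One simple way to highlight an abnormality, sketched below, is to measure how far each Wellhead sits from the population average in units of standard deviation (a z-score). The replacement counts, the Wellhead IDs and the cut-off of two standard deviations are assumptions chosen for the example; a real population may call for a different method or threshold.

```python
from statistics import mean, stdev

# Hypothetical rotor replacements per Wellhead (illustrative values only).
replacements = {
    "W1": 2, "W2": 1, "W3": 2, "W4": 1,
    "W5": 2, "W6": 9, "W7": 1, "W8": 2,
}


def flag_abnormal(values, threshold=2.0):
    """Return the z-score of every item lying more than `threshold`
    sample standard deviations away from the population mean."""
    data = list(values.values())
    mu = mean(data)
    sigma = stdev(data)
    if sigma == 0:
        return {}  # every item is identical, so nothing stands out
    return {
        name: round((value - mu) / sigma, 2)
        for name, value in values.items()
        if abs(value - mu) > threshold * sigma
    }


abnormal = flag_abnormal(replacements)
print(abnormal)
```

With these made-up figures only W6 is flagged, because its nine replacements sit well outside the spread of the other Wellheads.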
Interpreting and visualising the results
Consider that in the above scenario this type of data is collected thousands of times over, so a picture of what is normal can be built for the Well population at the level of a Well site, area and region. Figure 2 below is an example of finding normality (or not). A picture like this can be built from any numerical data source, eg gas production per Wellhead or maintenance frequency on specific items. Unfortunately, this example shows that pump failures are the norm. If that were the case, Acme pumps E and D would have to be reconsidered as a product line, or is it a bad batch of Acme pumps? A similar comparison can be made across categorical groups by using a statistical technique called ANOVA (analysis of variance), which tests whether a numerical measure, such as run time before failure, differs between categories, such as pump model. An ANOVA can find variations between groups, which points to what has changed and where to look for why. Using other statistical tools, in-depth analysis can determine other issues such as common problems within a population, and this can be fed into maintenance schedules.
And finally the steps
Understanding data trends is a skill that should be valued, and it starts with basic statistics. Statistical analysis is just the beginning; being able to translate the results into a business context is the key. Data analysis is a process, and there are key steps in obtaining business intelligence that supports sound decision-making. They are:
- Consolidate (collate) the data
- Categorise the data
- Analyse the data
- Apply the results
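The four steps can be sketched end to end as follows. This is only an outline under stated assumptions: the three source extracts, the field names, the pump labels (‘Pump X’, ‘Pump Y’) and all values are hypothetical, and the ‘analysis’ is deliberately reduced to a group average.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical extracts from three separate systems, keyed by Wellhead ID.
asset = {"W1": {"pump_model": "Pump X"}, "W2": {"pump_model": "Pump Y"},
         "W3": {"pump_model": "Pump X"}}
gas_flow = {"W1": {"lost_tj": 12.0}, "W2": {"lost_tj": 3.0},
            "W3": {"lost_tj": 10.0}}
maintenance = {"W1": {"rotor_replacements": 2}, "W2": {"rotor_replacements": 0},
               "W3": {"rotor_replacements": 3}}


def consolidate(*sources):
    """Step 1: collate all sources into one record per Wellhead."""
    merged = defaultdict(dict)
    for source in sources:
        for wellhead_id, fields in source.items():
            merged[wellhead_id].update(fields)
    return dict(merged)


def categorise(record):
    """Step 2: label each field as numerical or categorical by its value type."""
    return {
        name: "numerical" if isinstance(value, (int, float)) else "categorical"
        for name, value in record.items()
    }


def analyse(records, group_field, value_field):
    """Step 3: average a numerical field within each categorical group."""
    groups = defaultdict(list)
    for record in records.values():
        groups[record[group_field]].append(record[value_field])
    return {group: mean(values) for group, values in groups.items()}


def apply_results(averages):
    """Step 4: rank the groups, worst first, so action can be prioritised."""
    return sorted(averages, key=averages.get, reverse=True)


records = consolidate(asset, gas_flow, maintenance)
field_types = categorise(records["W1"])
average_loss = analyse(records, "pump_model", "lost_tj")
worst_first = apply_results(average_loss)
print(field_types)
print(average_loss, worst_first)
```

Keeping each step as its own function mirrors the list above: the analysis cannot run until the data has been collated, and it cannot group sensibly until each field has been classified.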
By combining and accurately categorising the data, you can quickly start to pinpoint underperforming production segments (Wellheads) and problematic equipment. The analysis of data can lead to benchmarking and trending, and trending leads to prediction.