The oil and gas industry is increasingly using data to make better decisions on a daily basis. From reservoir characterization to drilling operations, big data analytics is gaining importance in the industry. More and more companies are embracing this new way of extracting knowledge from downhole sensors to make better decisions and minimize non-productive time (NPT).
Rigs provide massive amounts of data to help drilling engineers optimize drilling efficiency and minimize NPT and its associated costs. Learning to use data efficiently to improve drilling is a discipline that requires a combination of different skill sets, including oilfield experience, statistics, programming, and effective communication. This interview explores this emerging discipline: the opportunities, the challenges, and what young professionals (YPs) need to know to have a rewarding career in drilling data analytics.
To understand what’s involved in the daily life of an analytics engineer, the skill sets required for the job, and how to make this transition, the TWA HR Discussion team interviewed Peter Kowalchuk, senior product manager, Halliburton Digital Solutions, about his experiences working with data discovery, visualization, interpretation, analytics, and decision making.
Peter Kowalchuk has more than 20 years of experience in the oil and gas industry, from field positions in both wireline and logging while drilling, to operations management, support, and research & development. He has participated in a variety of projects that span the complete data life cycle of the well construction process. In recent years, he has concentrated on producing and developing solutions using the latest techniques in data analysis and digital solutions. His areas of interest are data aggregation, data flow, machine learning and algorithms, and workflow design. Kowalchuk holds a bachelor’s degree in electronic engineering, an MBA from Texas A&M University, and a certification in data analytics from Cornell University.
How would you define data analytics?
Answering specific engineering and/or business questions using measured sample data from a population we are trying to describe.
What does a “day in the life” of an analytics engineer entail?
It really depends on where in the analytics process the person is working. It is true that much of the total time devoted to analytics is spent wrangling data: sourcing, gathering, aggregating, cleansing. But there are other aspects that are equally important: distilling the question to answer, determining the right technique to use, designing algorithms and building models, programming algorithms, testing the models, and, sometimes most important of all, explaining and selling the model.
The terms “data analytics” and “machine learning” cover a broad range of activities. How transferable are skills within these disciplines?
Machine learning is a toolbox we use to execute data analytics. It is part of what more broadly is known as Artificial Intelligence, which also includes other sets of tools such as Knowledge Systems. What tightly links machine learning and data analytics is that in general terms they both live in the high data-volume space. So being able to see a story, a pattern for example, in a dataset is probably the most common skill between the two.
The oil and gas industry is noted for being slow to adopt recent technologies. The popularity of Big Data, data analytics, and machine learning are all relatively recent developments that YPs may not have had exposure to in the industry. Where can a YP turn to learn more about them and develop these skills outside of the classroom?
Many of the basics behind this wave of analytics focus have been around for several decades, but some have been enabled by recent advances in processing power and data storage. So it really depends on where the desired focus is. If the desire is to learn more about the algorithms that run many of the analytical solutions, a good starting point is learning basic statistics. There is a plethora of material online, but any textbook on the subject is also a good place to begin. Look into sampling, distributions, hypothesis testing, and regressions.
The next step is to look at machine learning techniques, such as clustering, classification, and non-linear regression. Again, online material is abundant. But what I would really suggest, before going too far in depth, is thinking about a true real-world problem that needs an answer, and then looking for a method that best addresses that issue. There is no better way to learn than having a problem to solve.
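As a small illustration of the "basic statistics first" advice, the sketch below fits an ordinary least-squares line in plain Python. It is only a minimal example under stated assumptions: the function names `fit_line` and `r_squared` are hypothetical, and the weight-on-bit and rate-of-penetration numbers are made-up values chosen for illustration, not field data.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    x_bar, y_bar = mean(xs), mean(ys)
    # Sum of squared deviations of x, and cross-deviations of x and y
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def r_squared(xs, ys, slope, intercept):
    """Fraction of the variance in ys explained by the fitted line."""
    y_bar = mean(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


# Hypothetical, made-up sample values (illustration only, not field data)
weight_on_bit = [10.0, 15.0, 20.0, 25.0, 30.0]
rate_of_penetration = [22.0, 31.0, 41.0, 52.0, 60.0]

slope, intercept = fit_line(weight_on_bit, rate_of_penetration)
fit_quality = r_squared(weight_on_bit, rate_of_penetration, slope, intercept)
print(f"slope={slope:.3f} intercept={intercept:.3f} R^2={fit_quality:.3f}")
```

Writing the fit by hand, rather than calling a library, keeps the link to the textbook formulas visible; in practice a statistics package would normally be used instead.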
What are some of the greatest challenges facing the adoption of analytics and machine learning in the oil and gas industry? How can YPs equip themselves to address them?
Black Box Syndrome plus way too much talk about the subject. It is really amazing how many people are talking about the subject, but I do believe it is really hard at the moment to find substance among all the chatter. It seems that for any problem there is a solution that turns out to be a black box with a secret algorithm. A few years ago we used to “have an app for that,” now we “have an algorithm for that.” I think eventually the dust will settle, and the world will move on to the next big thing… maybe blockchain. Once that happens, true adoption of analytics will become mainstream in our industry, the same as in any other industry. We are simply in the first phase of adoption; we need a bit more maturity, and it will come. Also keep in mind that our industry isn’t as slow an adopter as many suggest. Many of the techniques being talked about in the mainstream have been used in subsurface disciplines for many, many years. Can anyone build an earth model without using statistical analysis? Is there such a thing as a petrophysicist who has never run a regression analysis? What machine learning calls unsupervised clustering, a geologist might call a lithofacies model.
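To make the clustering and lithofacies comparison concrete, here is a minimal sketch of k-means (Lloyd's algorithm) in plain Python. It is an illustration under assumptions, not a description of any particular lithofacies workflow: the function name `kmeans`, the two normalized log-style features, and all sample values are hypothetical and made up for this example.

```python
import math


def kmeans(points, centroids, max_iter=100):
    """Lloyd's algorithm: assign points to the nearest centroid, then update centroids."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: index of the closest centroid for every point
        labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members
        new_centroids = []
        for k, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centroids.append(
                    tuple(sum(dim) / len(members) for dim in zip(*members))
                )
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid in place
        if new_centroids == centroids:
            break  # converged: centroids no longer move
        centroids = new_centroids
    return labels, centroids


# Hypothetical samples: (normalized gamma ray, normalized density), made-up values
samples = [
    (0.10, 0.20), (0.15, 0.25), (0.12, 0.18),  # low readings, e.g. a sand-like group
    (0.80, 0.85), (0.85, 0.90), (0.78, 0.88),  # high readings, e.g. a shale-like group
]

# Deterministic start: seed one centroid in each apparent group
labels, centers = kmeans(samples, [samples[0], samples[3]])
print(labels)
print(centers)
```

The features are assumed to be pre-normalized to a common scale; with raw log units, the measurement with the largest numeric range would dominate the distance calculation. The cluster labels carry no geological meaning on their own, so a subject matter expert still has to interpret them.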
If a YP is interested in moving their career in a data analytics related direction, what opportunities should they be looking for to gain relevant skills and experience?
Find a hard problem to solve that has no direct solution using knowledge systems, and work on building a model for it. This can be in almost any area of the industry. If there is one thing the mainstream has got right, it is that analytics can be used in many fields. That is the essence of it. Being able to use general models to solve specific problems is one of many benefits of this discipline.
Are skills related to data analytics transferable between petroleum engineering disciplines? Would a drilling analytics engineer make a good reservoir analytics engineer and vice versa?
For sure; applying analytics in one discipline or another is very interchangeable. Learning the underlying science for each discipline is the main hurdle. With this in mind, I wouldn’t recommend that data scientists blindly apply their knowledge to each individual field, but rather that they pair up with a subject matter expert [SME]. So if one is an SME in drilling, the roles of analyst and field expert can be played by the same person, but if the problem to solve is now subsurface, we should seek that expertise before embarking on that journey.
If a YP develops skills on the side (e.g., self-taught programming) but doesn’t have formal qualifications/job responsibilities involving them, how can he/she demonstrate their competencies, especially to prospective employers?
You need to be able to prove results. Use analytics, programming, and other self-taught skills to solve problems in your current role, and then highlight results when seeking the new position. Nobody really needs a PhD degree to become an analyst; the field is very broad, and there is plenty of room for self-taught experts to contribute.
What do you consider to be some of the most rewarding aspects of work in data analytics?
Finding solutions to problems we haven’t been able to solve. Finding a story behind the data and being able to model the world around us. No different than with knowledge systems and traditional science, but here we have the extra reward of being able to use techniques in very different fields to achieve results. Analytics is about having a toolbox with a limited number of techniques to solve very diverse problems. Finding which of our tools can solve the task at hand, and seeing the results, is a truly satisfying feeling and a true morale booster. It makes us think we can solve anything with what we have in our toolbox.
Access to large data sets is a necessary part of data analytics and machine learning. Often these are proprietary and access to them may be limited by their owners. While this may be beneficial to those who own the data, it may be detrimental to the industry as a whole. Are you a proponent of data sharing? If so, how can the industry encourage the sharing of large data sets? If not, why not?
More and more we hear at conferences about the importance of data sharing. I don’t think it is a problem with an easy solution. One just needs to see what’s happening in other industries around data privacy. Even the big players such as Facebook and Google have similar, in their case existential, questions to answer. I don’t think a free world where all the data are shared is possible. There are many legitimate reasons why we should and shouldn’t be open. Answers are often specific to each situation. But this shouldn’t be a showstopper for analytics. It’s just another challenge, which once overcome will bring great rewards to those who decided to tackle it. Folks in the video industry seem to have found data analytics recommendation systems that generate profits for them even though they only work with single-digit percentages of the entire possible dataset. One can argue these leading implementers of data analytics thrive not because they have a lot of data, but because they have found meaningful solutions to problems where data are scarce. I’m sure our industry can do the same.