If you have ever asked a question of Siri on your iPhone, or Google Assistant on your Android, you have some experience with natural-language processing (NLP). If you have ever settled a billing dispute with an automated online assistant, you have some experience with natural-language processing. It is a type of artificial intelligence (AI) that deals with the interactions between computers and human language, analyzing the thousands of words we use at any given time to help us draw insights and make decisions.
The science behind NLP is not new, but in recent years it has become the primary driver of many forms of AI, and companies in a variety of industries are trying to maximize its potential.
How does NLP fit in with oil and gas operations? Speaking at the inaugural SPE ENGenious Symposium in Aberdeen, Philippe Herve, vice president of solutions at industrial AI company SparkCognition, said that NLP-based models can serve as a cognitive output that predicts future operational behaviors and alerts operators to potential hazards in the field, helping them craft more effective safety protocols. The key, however, is in developing those models to make sense of the data.
“The knowledge is here,” Herve said. “The question is how can we capture this knowledge and pass it to people who can build new platforms? That’s the big challenge.”
NLP has existed in its modern form since the early 1950s, longer than the study of artificial intelligence, but Herve said the recent machine learning boom has brought it to the forefront of business thinking. This is partly because machine learning centers on writing algorithms that can learn beyond their initial programming, rather than being constrained by the rules coded into them.
Instead of trying to hand-code every rule of human language, programmers feed text into a machine learning program and let it figure out the rules for itself, using probabilistic models to determine word usage and context. To improve the model, the programmer can feed it more text and allow it to learn as a human might.
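To make the idea concrete, here is a minimal sketch of one of the simplest probabilistic language models, a bigram model that estimates which word is likely to follow another purely from counts in the text it has seen. It is an illustration only, not a technique attributed to Herve or SparkCognition; the function names and the sample sentences are invented for this example.

```python
from collections import Counter, defaultdict


def train_bigram_model(sentences, counts=None):
    """Count which word follows which. Pass an existing `counts` to keep learning."""
    if counts is None:
        counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        # Pair each word with the word that comes right after it.
        for current_word, next_word in zip(words, words[1:]):
            counts[current_word][next_word] += 1
    return counts


def next_word_probabilities(counts, word):
    """Turn raw follow-on counts for `word` into probabilities."""
    followers = counts.get(word.lower(), Counter())
    total = sum(followers.values())
    if total == 0:
        return {}  # the model has never seen this word
    return {w: c / total for w, c in followers.items()}


# Invented sample text standing in for field reports.
corpus = [
    "the pump pressure dropped",
    "the pump failed overnight",
    "the valve pressure dropped",
]
model = train_bigram_model(corpus)
before = next_word_probabilities(model, "pump")

# Feeding the model more text shifts its estimates; no rule was rewritten by hand.
model = train_bigram_model(["the pump pressure spiked"], counts=model)
after = next_word_probabilities(model, "pump")

print(before)
print(after)
```

No grammar rule appears anywhere in the code. The model's view of what follows "pump" comes entirely from the sentences it was given, and it changes as soon as more text arrives. Production NLP systems use far richer models, but the learn-from-text principle is the same.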
This function of NLP programs helps companies organize unstructured data, or data that are not organized in a predefined manner. Existing analytics tools can handle structured data because such data typically contain only numbers and categories, which can easily be represented in a spreadsheet. However, structured data do not represent the bulk of data produced in the field: Herve cited a 2014 study from IDC indicating that only 22% of all data are documented well enough to be analyzed, and that only 5% of all data produced from the field are actually analyzed.
Unstructured data can consist of things like PDFs, books, journals, process logs, sensor data, public records, audio, video, and a host of other sources primarily meant for human use. Herve called this information the “lifeblood of organizations,” because if leveraged properly, it could provide greater insights into an operator’s business. From an oilfield safety perspective, this could mean having a system to analyze a database of near-accidents that happen on a particular project site, making it easier for operators to notice trends and plan for them.
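As a rough illustration of the near-accident scenario, the sketch below tallies hazard categories across free-text incident reports. It is a deliberately crude keyword match standing in for a trained NLP model, and everything in it is hypothetical: the hazard categories, the trigger phrases, and the sample reports were made up for this example.

```python
from collections import Counter

# Hypothetical hazard vocabulary. A real system would learn categories from
# the reports themselves rather than rely on a hand-written list.
HAZARD_KEYWORDS = {
    "dropped object": ["dropped", "fell from"],
    "slip or trip": ["slipped", "tripped"],
    "gas release": ["gas leak", "h2s"],
}


def tag_report(text):
    """Return the set of hazard categories whose trigger phrases appear in the text."""
    lowered = text.lower()
    return {
        hazard
        for hazard, phrases in HAZARD_KEYWORDS.items()
        if any(phrase in lowered for phrase in phrases)
    }


def hazard_trends(reports):
    """Count how many reports mention each hazard category."""
    trends = Counter()
    for report in reports:
        trends.update(tag_report(report))
    return trends


# Invented near-miss reports for a single site.
reports = [
    "Wrench dropped from the derrick, no one below.",
    "Technician slipped on wet grating near the pump skid.",
    "Hammer fell from scaffold during shift change.",
    "Minor gas leak detected at wellhead flange.",
]
trends = hazard_trends(reports)

for hazard, count in trends.most_common():
    print(f"{hazard}: {count}")
```

Even this simple tally turns a pile of narrative reports into a ranked list an operator can act on. Keyword matching misses synonyms, misspellings, and context, which is the gap that learned language models are meant to close.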
Herve said analyzing unstructured data at scale can be difficult for companies because it often means incurring significant expense to scale or improve the processes needed to structure their data.
“Finding meaning in unstructured data is really complicated because you have all of these silos,” Herve said. “This is one of the challenges in many places. It’s very expensive to try to move [data] from unstructured to structured. It’s much cheaper to try to make something out of what we have.”
Herve mentioned his company’s DeepNLP as an example of an NLP platform that is designed to enable automated workflows of unstructured data, maximizing visibility into organizational processes to drive decision making. The platform can extract lists, summaries, tables, and images that provide optimal answers to queries based on document-specific semantics.