Big-Data Analytics for Predictive Maintenance Modeling: Challenges and Opportunities

Optimization of maintenance costs is among operators’ main concerns in the search for operational efficiency, safety, and asset availability. The ability to predict critical failures emerges as a key factor, especially when reducing logistics costs is mandatory.

Big-data analytics can allow a better understanding of a production system’s abnormal behavior. This knowledge is essential for the adoption of a proactive maintenance approach, leading to a shift toward condition-based maintenance (CBM). CBM focuses on performing interventions on the basis of actual and future states of a system determined by monitoring underlying deterioration processes. One of the building blocks of CBM design and implementation is the prognostic approach, which aims to detect, classify, and predict critical failures. This paper presents approaches for constructing a prognostic system.


Experience has shown that significant benefits can be achieved when major maintenance interventions (overhauls, usually performed periodically) can be postponed on the basis of conclusions from the use of degradation models. Such an approach can be complex, but its results may reduce maintenance and logistics costs while keeping availability within required levels.

Traditional approaches to reliability estimation are based on the distribution of event records of a population of identical units. Many parametric failure models, such as Poisson, exponential, Weibull, and log-normal distributions, have been used to model machine reliability. This project attempts to create an integrated solution using big data and analytics techniques to implement a CBM standard procedure for the target problem.
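As a minimal illustration of the traditional parametric approach mentioned above, the sketch below evaluates a two-parameter Weibull reliability function, R(t) = exp(-(t/scale)^shape), and its hazard rate. The shape and scale values are made-up placeholders, not figures from the paper.

```python
import math

def weibull_reliability(t_hours, shape, scale_hours):
    """Probability that a unit survives beyond t_hours (two-parameter Weibull)."""
    if t_hours < 0:
        raise ValueError("time must be non-negative")
    return math.exp(-((t_hours / scale_hours) ** shape))

def weibull_hazard(t_hours, shape, scale_hours):
    """Instantaneous failure rate for t_hours > 0."""
    return (shape / scale_hours) * (t_hours / scale_hours) ** (shape - 1)

# Hypothetical parameters: shape > 1 models wear-out (hazard grows with age).
SHAPE, SCALE_HOURS = 1.5, 8000.0
for t in (1000.0, 4000.0, 8000.0):
    print(t, round(weibull_reliability(t, SHAPE, SCALE_HOURS), 3))
```

With shape equal to 1 the model reduces to the exponential distribution, which has a constant hazard. Such population-level models do not use the condition of an individual machine, which is the gap CBM aims to fill.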

The V-Shaped Method

The International Organization for Standardization’s standards for condition monitoring and diagnostics of machines offer good guidance to establish a CBM standard procedure, especially for the target problem of turbogenerator failures in a floating production, storage, and offloading (FPSO) vessel.

Diagnostics can be described as a procedure of reasoning to interpret the health condition of machinery by use of data acquired during its operation. It has a vital role in decision making for both operation and maintenance. In addition, diagnostic procedures should be adjusted according to potential failures (on the basis of their likelihood and severity) that could occur in a machine. The principle is shown in Fig. 1. The V-shaped array represents the high-level and low-level concerns.

Fig. 1: Condition monitoring (CM) and diagnostics (D) cycle.


Condition monitoring for offshore installations is certainly a challenge, especially when it comes to data quality and analysis. Once the critical functions have been identified, the critical components, failure modes, and degradation mechanisms can be identified in turn.

Machine/Process. The system under study belongs to the main power-generation system of an FPSO unit operating in the Campos Basin. It has four turbogenerators, each consisting of an aeroderivative gas turbine driving an electric generator. The complete paper focuses mainly on the gas-turbine engines.

Components. From a maintenance perspective, it is of interest for the operator to have a functional tree representing the machine. This tree should be composed mostly of the maintainable parts, and, from the component breakdown, one should list all possible failure modes and their respective causes and degradation mechanisms.

Then, the criticality of each of the failure modes should be assessed through expert judgment or from historical data on the basis of significance and probability of occurrence.
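One simple way to picture this assessment is a ranking by the product of a severity score and a likelihood score. The sketch below assumes hypothetical 1-to-5 scales and invented component and failure-mode names; the paper does not specify a scoring scheme.

```python
from dataclasses import dataclass

@dataclass
class FailureMode:
    component: str
    mode: str
    severity: int    # hypothetical scale: 1 (negligible) to 5 (critical)
    likelihood: int  # hypothetical scale: 1 (rare) to 5 (frequent)

    @property
    def criticality(self) -> int:
        # Assumed scoring rule: significance times probability of occurrence.
        return self.severity * self.likelihood

def rank_by_criticality(modes):
    """Return failure modes ordered from most to least critical."""
    return sorted(modes, key=lambda m: m.criticality, reverse=True)

# Invented entries for illustration only.
modes = [
    FailureMode("igniter", "fails to spark", severity=2, likelihood=2),
    FailureMode("fuel valve", "sticks during startup", severity=4, likelihood=3),
    FailureMode("lube-oil pump", "low discharge pressure", severity=3, likelihood=2),
]
ranked = rank_by_criticality(modes)
for m in ranked:
    print(m.criticality, m.component, "-", m.mode)
```

In practice the scores would come from expert judgment or from failure statistics in the historical records.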

Symptoms Modeling, Descriptors, and Measurements. In the modeling of symptoms, the operator must rely on the expertise within the organization with respect to a particular asset.

Descriptors can be obtained from condition-monitoring systems, either directly or after processing of the measurements. Descriptors have one big advantage over measurements: Their selectivity helps to increase the accuracy of the diagnostics significantly.

Data from sensors were stored, processed, and analyzed in order to identify correlations between parameters that best explain the events (e.g., through principal-component analysis); patterns of behavior related to major occurrences; whether additional variables should be included in the monitoring set; and what else can be considered when correlating variables with their failure modes or critical components and subsystems.
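To make the principal-component idea concrete, the following sketch finds the first principal component of a small sensor matrix using power iteration on the sample covariance matrix. It is a pure-Python illustration on synthetic data, not the tooling used in the paper, and it assumes at least one sensor actually varies.

```python
import math

def first_principal_component(rows, iterations=200):
    """Return (direction, explained_variance_ratio) of the first component.

    rows: list of observations, each a list of sensor values.
    """
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix of the sensors.
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Power iteration converges to the eigenvector with the largest eigenvalue.
    v = [1.0 / (j + 1) for j in range(d)]
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d))
                     for i in range(d))
    total_variance = sum(cov[i][i] for i in range(d))
    return v, eigenvalue / total_variance

# Synthetic readings: sensor B tracks sensor A closely, sensor C is unrelated.
rows = [[t, 2.0 * t + 0.1 * (-1) ** t, 0.5 * (-1) ** t] for t in range(20)]
direction, ratio = first_principal_component(rows)
print([round(x, 3) for x in direction], round(ratio, 3))
```

A first component that loads heavily on two sensors and explains most of the variance suggests those sensors carry largely redundant information, which is the kind of correlation the analysis looks for.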

In general, all collected data can be subdivided into two groups: (1) events, data that include information on what actually happened, what caused the event, and what was done, and (2) condition monitoring (CM), measurements related to the health state of the machine.

Typically, event data collection requires manual data entry, whereas CM data are now collected automatically by sensors.
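The two groups of data could be represented along the following lines. The field names, the sensor tag format, and the helper function are hypothetical and chosen only to mirror the description above.

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass(frozen=True)
class EventRecord:
    """Manually entered record: what happened, what caused it, what was done."""
    timestamp: datetime
    what_happened: str
    cause: str
    action_taken: str

@dataclass(frozen=True)
class ConditionSample:
    """Automatically collected sensor reading related to machine health."""
    timestamp: datetime
    sensor_tag: str
    value: float

def samples_before(event, samples, sensor_tag):
    """Return readings of one sensor taken before the event, oldest first."""
    selected = [s for s in samples
                if s.sensor_tag == sensor_tag and s.timestamp < event.timestamp]
    return sorted(selected, key=lambda s: s.timestamp)
```

Linking the two groups by time stamp, as `samples_before` does, is what allows sensor behavior to be examined in the period leading up to a recorded event.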

Processing and Recognition. Data processing should start with data filtration and cleaning because the collected data (especially those entered manually) may contain errors. The most common types of errors are those caused by human factors and those caused by faulty or malfunctioning sensors.
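A minimal sketch of such cleaning is shown below. It assumes two simple checks that are not taken from the paper: a plausibility range for each sensor and a test for a sensor stuck at one value. The limits and run length are placeholders.

```python
import math

def clean_readings(values, low, high):
    """Replace missing, non-finite, or implausible readings with None."""
    cleaned = []
    for v in values:
        ok = (
            isinstance(v, (int, float))
            and math.isfinite(v)
            and low <= v <= high
        )
        cleaned.append(float(v) if ok else None)
    return cleaned

def is_stuck(values, min_run=5):
    """Flag a sensor that repeats the exact same value min_run times in a row."""
    run = 1
    for prev, cur in zip(values, values[1:]):
        run = run + 1 if cur is not None and cur == prev else 1
        if run >= min_run:
            return True
    return False

# Hypothetical temperature readings with an assumed valid range of 0 to 1000.
raw = [10.0, None, float("nan"), 9999.0, 12]
print(clean_readings(raw, low=0.0, high=1000.0))
```

Marking bad readings as missing, rather than deleting them, keeps the time axis intact for the later resampling and windowing steps.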

The following step is data analysis. Several models, algorithms, and methods are available for data analysis, depending on the type of data collected.

Diagnosis and Prognosis. The final step in all CBM approaches is making decisions. The diagnostics of machine failures is basically a procedure of mapping the information obtained in the measurement space or features in the feature space to machine failures in the failure-mode space.

Prognostics is a complex task. In general, it is divided into two main types. The first predicts the time left until a machine or component fails and is called “remaining useful life.” The second predicts the probability that a machine will operate without failure up to some future time, such as the next inspection.
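For the remaining-useful-life type of prognosis, one of the simplest possible estimates fits a straight line to a degradation indicator and extrapolates it to a failure threshold. The sketch below assumes an indicator that rises toward a known threshold; the indicator, the threshold, and the linear trend are all illustrative assumptions rather than the paper's method.

```python
def remaining_useful_life(times, indicator, failure_threshold):
    """Extrapolate a least-squares linear trend to the failure threshold.

    Returns the time remaining after the last observation, or None if the
    indicator is not trending upward toward the threshold.
    """
    n = len(times)
    t_mean = sum(times) / n
    y_mean = sum(indicator) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(times, indicator))
    slope = sxy / sxx
    if slope <= 0:
        return None
    intercept = y_mean - slope * t_mean
    t_fail = (failure_threshold - intercept) / slope
    return max(0.0, t_fail - times[-1])

# Hypothetical indicator sampled once per day, with an assumed threshold of 10.
print(remaining_useful_life([0, 1, 2, 3], [1.0, 2.0, 3.0, 4.0], 10.0))
```

Real degradation is rarely linear, so an estimate like this is best treated as a baseline against which richer degradation models can be compared.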

Research Hypothesis: The Challenge

One can observe that most failures are related to machine startups. This kind of event is a hidden failure and is difficult to predict. For the main failure event considered in the complete paper, one critical component is a valve that must behave precisely during startup.

Considering that the action (countermeasure) for those events is the replacement of the valve, a question was raised for the research team: “If the valve’s abnormal behavior could be detected during the run, could a predictive model be developed to assign a probability of failure at the next startup?”

Working with this notion, the team extracted sensor data from the industrial repository in order to train offline classifiers.

Classification Processes

Database Preprocessing. This step includes the removal of all major outliers and adjustment of the sampling frequency to a single, uniform value (1 sample/min).
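The preprocessing step might look like the sketch below, which averages raw readings into one value per minute and then drops points far from the median. The outlier rule (a threshold on the scaled median absolute deviation) is an assumption; the paper does not state how major outliers were defined.

```python
from collections import defaultdict
from statistics import median

def resample_per_minute(samples):
    """Average (epoch_seconds, value) pairs into one value per minute."""
    buckets = defaultdict(list)
    for ts, value in samples:
        buckets[int(ts // 60) * 60].append(value)
    return [(minute, sum(vals) / len(vals))
            for minute, vals in sorted(buckets.items())]

def remove_major_outliers(series, threshold=5.0):
    """Drop points more than `threshold` scaled MADs from the median.

    The threshold of 5.0 is a placeholder, not a value from the paper.
    """
    values = [v for _, v in series]
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return list(series)
    return [(t, v) for t, v in series
            if abs(v - med) / (1.4826 * mad) <= threshold]

raw = [(0, 1.0), (30, 3.0), (60, 5.0), (125, 7.0)]
print(resample_per_minute(raw))
```

Minutes with no readings are simply absent from the output here; whether to interpolate them is a separate design decision.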

Event Annotation. In this stage, the time stamps of all normal stops (NSs) and machine failures (MFs) are determined. Removal of repeated NS and MF occurrences (in less than a given time interval) is also performed. For all validated NS or MF situations, a 24-hour interval is identified during which the machine operated without interruption before the stop that originated the associated event.
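The annotation logic can be sketched as follows. The minimum gap between events is a placeholder because the paper does not give the interval, and the list of machine start times is assumed to be available from the event records.

```python
from bisect import bisect_left
from datetime import datetime, timedelta

def drop_repeated_events(events, min_gap=timedelta(hours=1)):
    """Keep an event only if it occurs at least min_gap after the last kept one.

    events: list of (timestamp, label) pairs, label being "NS" or "MF".
    """
    kept = []
    for ts, label in sorted(events):
        if not kept or ts - kept[-1][0] >= min_gap:
            kept.append((ts, label))
    return kept

def annotate_windows(events, start_times, window=timedelta(hours=24)):
    """Return (window_start, stop_time, label) for each stop that was
    preceded by at least `window` of uninterrupted operation."""
    starts = sorted(start_times)
    annotated = []
    for stop, label in events:
        i = bisect_left(starts, stop)
        if i == 0:
            continue  # no recorded start before this stop
        if stop - starts[i - 1] >= window:
            annotated.append((stop - window, stop, label))
    return annotated
```

Stops that follow less than 24 hours of running are discarded in this sketch because they cannot supply a full window for feature extraction.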

Feature Extraction. For all NS and MF events, the machine operation within the 24-hour interval identified in the preceding step is characterized by meaningful features that should act as the classifier input.
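The paper summary does not list the features used. As an assumed example, the sketch below summarizes one sensor's 24-hour window with a few common statistics plus a linear trend.

```python
from statistics import mean, pstdev

def extract_features(values):
    """Summarize one sensor's window of evenly spaced readings.

    The feature set (mean, spread, extremes, trend) is an illustrative
    choice, not the one reported in the paper.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two readings")
    avg = mean(values)
    # Least-squares slope against the sample index (units per sample).
    x_mean = (n - 1) / 2
    sxy = sum((i - x_mean) * (v - avg) for i, v in enumerate(values))
    sxx = sum((i - x_mean) ** 2 for i in range(n))
    return {
        "mean": avg,
        "std": pstdev(values),
        "min": min(values),
        "max": max(values),
        "slope": sxy / sxx,
    }

print(extract_features([1.0, 2.0, 3.0, 4.0, 5.0]))
```

At 1 sample/min a 24-hour window holds up to 1,440 readings per sensor, so condensing each window into a handful of numbers keeps the classifier input small.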

Classifier Training. Using the features extracted in the preceding stage, the chosen classifier is trained, following the event labels defined in the second step.
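The classifiers evaluated are described in the complete paper and are not named here. As a stand-in, the sketch below trains a small logistic-regression model by gradient descent on invented feature vectors, with label 1 for a machine failure (MF) and 0 for a normal stop (NS).

```python
import math

def predict_proba(w, b, x):
    """Estimated probability that feature vector x belongs to class 1 (MF)."""
    z = b + sum(wj * xj for wj, xj in zip(w, x))
    if z >= 0:  # numerically stable sigmoid
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def train_logistic(features, labels, lr=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n, d = len(features), len(features[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for x, y in zip(features, labels):
            err = predict_proba(w, b, x) - y
            for j in range(d):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

# Invented two-feature examples: first three are NS (0), last three are MF (1).
X = [[0.1, 0.2], [0.2, 0.1], [0.3, 0.3], [1.0, 0.9], [0.9, 1.1], [1.2, 1.0]]
y = [0, 0, 0, 1, 1, 1]
w, b = train_logistic(X, y)
print(round(predict_proba(w, b, [1.1, 1.0]), 3))
```

The output of `predict_proba` corresponds to the probability of failure at the next startup posed in the research question. With real data, performance would have to be measured on events held out from training.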

The result of this four-step procedure is a smart algorithm capable of identifying a faulty operation the next time the machine is started, on the basis of the features extracted during the 24 hours before the machine stop.

Final Considerations and Future Work

Starting from a challenging research proposition, such as to model and predict hidden failures, this study discusses the modeling of machine failures by the use of a big-data analytics approach with different classifiers.

In the search for a smart algorithm capable of identifying a faulty operation the next time the machine is started, on the basis of features extracted before the most recent machine stop, some classifiers were tested in a series of experiments. The results of those experiments are presented in the complete paper.

Among the problems encountered in this research were data-collection difficulties in terms of database standardization.

Once useful predictive models have been constructed, another problem arises in decision making, because the decision process must incorporate the models’ predictions. Future work will therefore consider more than one model, resulting in a voting system able to provide reasoning for decisions.
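Because the voting system is proposed as future work, its design is not described. One possible reading is a simple majority vote that also reports which models support the decision, as in the hypothetical sketch below.

```python
def majority_vote(predictions):
    """Combine binary predictions (model name -> 0 or 1) by simple majority.

    Returns (decision, share_voting_failure, supporting_models). A tie
    resolves to 0 (no failure predicted), which is an arbitrary choice.
    """
    if not predictions:
        raise ValueError("need at least one model prediction")
    share = sum(predictions.values()) / len(predictions)
    decision = 1 if share > 0.5 else 0
    supporters = sorted(name for name, vote in predictions.items()
                        if vote == decision)
    return decision, share, supporters

# Model names are placeholders, not the classifiers tested in the paper.
print(majority_vote({"model_a": 1, "model_b": 1, "model_c": 0}))
```

Returning the supporting models alongside the decision gives maintenance staff a starting point for the reasoning behind a recommendation.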

This article, written by Special Publications Editor Adam Wilson, contains highlights of paper OTC 26275, “Big Data Analytics for Predictive Maintenance Modeling: Challenges and Opportunities,” by I.H.F. Santos, M.M. Machado, E.E. Russo, D.M. Manguinho, V.T. Almeida, R.C. Wo, M. Bahia, and D.J.S. Constantino, Petrobras; D. Salomone, M.L. Pesce, C. Souza, and A.C. Oliveira, EMC—Brazil Research Center; and A. Lima, J. Gois, L.G. Tavares, T. Prego, S. Netto, and E. Silva, PEE-COPPE/UFRJ, prepared for the 2015 Offshore Technology Conference Brasil, Rio de Janeiro, 27–29 October. The paper has not been peer reviewed. Copyright 2015 Offshore Technology Conference. Reproduced by permission.