
Fouling-Prediction Model Uses Machine Learning

The machine-learning techniques applied aim to deliver a prediction model based on both simulation and real-time field data. The model tracks and monitors system key performance indicators.


New water treatment facilities in the Gulf of Mexico include a seawater sulfate removal unit (SRU) to mitigate reservoir souring and scaling. Current industrial practice relies only on pressure drop and regular cleaning intervals to schedule SRU maintenance. Without the capability to predict fouling from process conditions, this approach may reduce membrane life through overly frequent cleaning or allow severe membrane fouling to develop. The machine-learning techniques applied in the complete paper aim to deliver a prediction model based on both simulation and real-time field data. The model tracks and monitors system key performance indicators (KPIs).

KPI Model Establishment

Before KPI models were established in this study, key model input and output parameters were identified. The input parameters were values that could be measured directly, including feed temperature, feed and permeate total dissolved solids (TDS), and feed and reject pressures. The output parameters were predicted membrane pressures, fouling factors, and permeate sulfate concentrations. For membrane fouling monitoring, the current industrial practice mostly focuses on differential pressure across membrane elements or pressure vessels. However, this practice overlooks the effect of other parameters on pressure drop such as temperature, which may lead to incorrect decisions. In this study, fouling factor was used instead to reflect an accurate picture of membrane fouling.

Fouling factor is a projection of membrane fouling over time and typically has a value between 0 and 1. A fouling factor of 1 indicates that the aged membrane has the same fouling profile as a new membrane, while a smaller fouling factor means more-severe fouling. In the SRU process, the major fouling mechanisms causing flux reduction include particle plugging and biofouling. The fouling factor covers all fouling mechanisms that may reduce membrane permeate flux.

The first step was to obtain SRU data for the prediction model. Because the SRU had not yet been installed, a membrane-simulation tool was applied to generate membrane-performance data under various operating conditions. The tool was supplied by vendors and is governed by the solvent (water) and solute (dissolved inorganic ions) mass-transport equations.

The next step was to use simulation data for model selection, training, and validation. Several models were evaluated using training data, and the model scores were compared. Metrics were used to determine the model that best fit the data employed; in addition, extensive validation and sensitivity analyses were used to understand how well the model generalizes.
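The selection step described above can be sketched in Scikit-learn. This is a minimal illustration under stated assumptions, not the authors' workflow: the data are random stand-ins for simulator output, the target function is invented, and the two candidate models are arbitrary choices, because the synopsis does not name the models that were evaluated.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Hypothetical stand-in for simulator output. Columns are illustrative only:
# feed temperature (deg C), feed TDS (mg/L), feed pressure (bar).
rng = np.random.default_rng(0)
X = rng.uniform([5.0, 30000.0, 15.0], [35.0, 40000.0, 30.0], size=(500, 3))

# Invented smooth target in (0, 1) standing in for a simulated fouling factor.
y = 1.0 / (1.0 + np.exp(-(0.05 * X[:, 0] + 0.1 * X[:, 2] - 3.0)))

# Hold out data the models never see during fitting.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Candidate models (assumed; chosen only to show the comparison pattern).
candidates = {
    "linear": LinearRegression(),
    "random_forest": RandomForestRegressor(n_estimators=50, random_state=0),
}

scores = {}
for name, model in candidates.items():
    model.fit(X_train, y_train)
    # For regressors, score() returns the coefficient of determination (R^2).
    scores[name] = model.score(X_test, y_test)

best_name = max(scores, key=scores.get)
print(scores, best_name)
```

Scoring on held-out rows rather than on the training rows is a choice made here so that the comparison says something about generalization and not only about fit.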

Definitions provided in this synopsis are taken from sources documented in the complete paper.

  •  Model Score or Coefficient of Determination: “It represents the proportion of variance that has been explained by the independent variables in the model. It provides an indication of goodness of fit and therefore a measure of how well unseen samples are likely to be predicted by the model, through the proportion of explained variance.”
  •  Mean Square Error: “The mean_squared_error function computes mean square error, a risk metric corresponding to the expected value of the squared (quadratic) error or loss.”
  •  Mean Absolute Error: “The mean_absolute_error function computes mean absolute error, a risk metric corresponding to the expected value of the absolute error loss or l1-norm loss.”
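The three metrics defined above can be written out directly. The sketch below uses plain Python instead of the Scikit-learn functions so that the arithmetic is visible. The function names and the four sample fouling-factor values are made up for illustration.

```python
def mse(y_true, y_pred):
    """Mean square error: average of the squared residuals."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mae(y_true, y_pred):
    """Mean absolute error: average of the absolute residuals."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - (residual SS / total SS)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Illustrative values only: "true" vs. predicted fouling factors.
y_true = [0.9, 0.8, 0.7, 0.6]
y_pred = [0.85, 0.8, 0.75, 0.6]

mse_value = mse(y_true, y_pred)       # approximately 0.00125
mae_value = mae(y_true, y_pred)       # approximately 0.025
r2_value = r_squared(y_true, y_pred)  # approximately 0.9
```

A score of 1 means the predictions explain all of the variance in the true values; the two error metrics are in the (squared) units of the predicted quantity.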

Machine-learning concepts were applied in this study to expand the data set and make the model more robust so that it can adapt to more variations.

  •  Supervised Learning: “Supervised learning, also known as supervised machine learning, is a subcategory of machine learning and artificial intelligence. It is defined by its use of labeled data sets to train algorithms that classify data or predict outcomes accurately.”
  •  Unsupervised Learning: “Unsupervised learning uses unlabeled data. From that data, it discovers patterns that help solve for clustering or association problems. This is particularly useful when subject matter experts are unsure of common properties within a data set.”
  •  Model Generalization: “Refers to your model’s ability to adapt properly to new, previously unseen data, drawn from the same distribution as the one used to create the model.”
  • Data Augmentation: “In data analysis, these are techniques used to increase the amount of data by adding slightly modified copies of already existing data or newly created synthetic data from existing data.”
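One common form of data augmentation for tabular process data is to append slightly perturbed copies of existing rows. The sketch below shows that idea with NumPy. The synopsis does not say which augmentation technique the authors used, so the jitter approach, the function name augment_with_jitter, and the noise level are all assumptions.

```python
import numpy as np


def augment_with_jitter(X, y, n_copies=3, rel_noise=0.01, seed=0):
    """Return X and y with n_copies noisy copies of every row appended.

    Noise is Gaussian, scaled per feature to rel_noise times that
    feature's standard deviation. Labels are reused unchanged, which
    assumes small input perturbations do not change the target.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    scale = rel_noise * X.std(axis=0)  # one noise level per column
    copies = [
        X + rng.normal(0.0, 1.0, size=X.shape) * scale
        for _ in range(n_copies)
    ]
    X_aug = np.vstack([X, *copies])
    y_aug = np.concatenate([y] * (n_copies + 1))
    return X_aug, y_aug


# Illustrative rows: feed temperature (deg C) and feed pressure (bar).
X_small = [[20.0, 18.0], [25.0, 20.0], [30.0, 22.0], [35.0, 24.0]]
y_small = [0.95, 0.90, 0.85, 0.80]
X_big, y_big = augment_with_jitter(X_small, y_small)
```

The original rows are kept at the top of the augmented arrays, so the untouched data remain available for comparison.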

Once the model was chosen and built, a separate data set (other than the training data) was used to validate it. After the SRU is online and producing real-time data, the real data will be used to adjust model parameters. Anomaly detection was developed to understand deviations in membrane performance and to screen out data outliers for improved decision making.

  • Novelty Detection With Local Outlier Factor: “The local outlier factor algorithm is an unsupervised anomaly-detection method which computes the local density deviation of a given data point with respect to its neighbors. It considers as outliers the samples that have a substantially lower density than their neighbors.”
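Scikit-learn exposes this algorithm as LocalOutlierFactor, and setting novelty=True lets a detector fitted on baseline data label new observations. The sketch below is only an illustration: the two features, the baseline distribution, and the n_neighbors value are assumptions, and the synopsis does not describe how the authors configured their detectors.

```python
import numpy as np
from sklearn.neighbors import LocalOutlierFactor

# Hypothetical baseline: two normalized membrane parameters clustered
# around 1.0 (for example, differential pressure and permeate TDS).
rng = np.random.default_rng(0)
baseline = rng.normal(loc=[1.0, 1.0], scale=0.05, size=(300, 2))

# novelty=True is required to call predict() on data not seen in fit().
detector = LocalOutlierFactor(n_neighbors=20, novelty=True)
detector.fit(baseline)

# One reading close to the baseline and one far outside it.
new_points = np.array([[1.01, 0.99], [1.60, 1.50]])
labels = detector.predict(new_points)  # +1 = inlier, -1 = outlier
print(labels)
```

A label of -1 corresponds to the kind of alert described in the text: the new reading sits in a region much less dense than the baseline used for training.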

The authors of the complete paper have identified the following opportunities in which these models can be used:

  • Real-time membrane monitoring: These models provide additional information that helps the operator make the right decisions in terms of frequency for clean in place (CIP), identify rare events linked to membrane integrity or membrane performance, and back up critical analyzers such as sulfate content analyzers.
  • Operator training simulator (OTS) model integration: Membrane models are vendor-specific; therefore, no off-the-shelf models exist that can be used in OTS simulation software. Applying these models can help to develop OTS models that are more accurate and that respond to membrane parameters instead of simulated differential pressure.

Modeling and Process Results

The authors used synthetic data from simulation, which provided significant advantages. First, the models could be developed before facility startup; second, the data set could cover a broader operational range. Data-augmentation techniques proved effective in improving model performance and in enabling the use of more-complex model frameworks. Key model parameters were identified using feature engineering, and an instrumentation gap analysis was performed to determine whether all of the instrumentation needed was included in the process design.

Figs. 1a and 1b show the effect of applying data-augmentation techniques. A substantial improvement in terms of generalization and accuracy of the model is evident. Additionally, data augmentation allows more-complex models to be used.

Fig. 1—(a) Fouling factor without data augmentation; (b) fouling factor with data augmentation.

Several models were tested, and extensive validation was performed. Fouling-factor model performance for the second-pass SRU was good, allowing a reliable prediction of target variables. The models were developed using Scikit-learn and the Python programming language.

The SRU prediction model was built successfully by selecting the best-scoring of the models evaluated and was further validated using a separate validation data set. The model used measurable process parameters (i.e., pressure, temperature, and conductivity) to predict process data that are more difficult to measure, and it was able to predict key operating parameters and the membrane-fouling profile.

This tool gives operators enhanced visibility into membrane performance to support decision making. It also can serve as a backup measurement for sulfate analysis, providing further insight into real-time data and reducing the risk of reservoir souring. One key advantage of the predictive model is that it provides a much faster measurement than the analyzer.

Machine-learning models have demonstrated very good performance in predicting variables that cannot be measured directly, such as fouling factor. Defining a critical fouling factor for the operation of SRU membranes makes it possible to determine the optimal frequency for CIP, which affects membrane performance and life. Too-frequent CIP will result in permanent chemical damage to the membranes, while lack of CIP compromises membrane operation.
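The way a critical fouling factor could drive a CIP decision can be sketched as a simple rule. Everything specific here is an assumption: the function name cip_recommended, the threshold of 0.80, and the requirement of several consecutive low predictions are illustrative, and the synopsis does not give the authors' criterion.

```python
def cip_recommended(fouling_factors, critical_ff, window=3):
    """Return True when the last `window` predictions are all below critical_ff.

    Requiring several consecutive low values (an assumed rule) avoids
    triggering a cleaning on a single noisy prediction.
    """
    if len(fouling_factors) < window:
        return False
    return all(ff < critical_ff for ff in fouling_factors[-window:])


# Hypothetical predicted fouling factors over time (1 = like a new membrane).
steady_decline = [0.95, 0.90, 0.79, 0.78, 0.77]
single_dip = [0.95, 0.79, 0.85, 0.78]

decline_flag = cip_recommended(steady_decline, critical_ff=0.80)  # True
dip_flag = cip_recommended(single_dip, critical_ff=0.80)          # False
```

A rule of this kind ties cleaning to predicted membrane condition instead of a fixed calendar interval, which is the trade-off between too-frequent CIP and too little CIP that the text describes.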

Anomaly detection is applied to SRU membranes to identify rare events. The authors have developed two models targeting performance and membrane integrity separately. Anomaly detection will alert the operator when membrane parameters are outside the baseline used for model training.

OTS Model Development

Machine-learning models for SRU membranes can predict KPIs that will enable the operator to create significant cost savings by optimizing membrane performance and extending membrane lifetime. Proper training based on membrane parameters will help the operator to make better decisions in the field.

As outlined in the complete paper, after use of the OTS and retraining, the fouling-factor and permeate-sulfate predictive models track the real values very well.


The authors have developed an SRU membrane-performance model using synthetic data with machine-learning techniques. The model is being implemented as SRU installation and commissioning progresses. Conclusions of this study include the following:

  • Machine-learning models have proved effective in modeling sulfate-removal membranes.
  • Synthetic data can be used to develop models in advance and to incorporate a broader range of operation into the model.
  • Data-augmentation techniques have had a positive effect on model accuracy, model generalization, and the ability to use more-complex models.
  • KPIs provide the operator augmented information that helps decision making in the field.
  • SRU models can be integrated successfully into the OTS, providing training benefits for the operator and testing environments for developed KPIs.

This article, written by JPT Technology Editor Chris Carpenter, contains highlights of paper SPE 206173, “Offshore Water-Treatment KPIs Using Machine-Learning Techniques,” by Lauren Flores, Martin Morles, and Cheng Chen, Chevron, prepared for the 2021 SPE Annual Technical Conference and Exhibition, originally scheduled to be held in Dubai, 21–23 September. The paper has not been peer reviewed.
