Scientists Automate Core Box Image Recognition

Researchers from Skoltech have trained a neural network to recognize rock samples in core box images efficiently. The process has sped up analysis by up to 20 times and made it possible to automate the description of rock samples.

July 11, 2022

Data Science and Digital Engineering

One of the routine tasks of geological research is the description of rock samples. In many cases, the extracted rock core is stacked in boxes, and scientists photograph the boxes or individual core columns during the core study. The description is compiled manually by filling out spreadsheets or geological journals. The standard analysis procedure involves manually extracting the columns from photographs of boxes in a graphical editor, which is a time-consuming process.

To automate the process, scientists used machine learning. Traditional computer-vision algorithms, however, perform this task poorly because of the limited amount of data and the large differences between images (e.g., when a core column differs in color or texture from adjacent ones, or when columns were photographed under different conditions). Such differences significantly affect the performance of machine-learning algorithms, which require large data sets describing all possible variants. As a result, time must be spent retraining the model.

To solve this problem, Skoltech scientists used deep convolutional neural networks, which are artificial neural networks similar in structure to the visual cortex of animals. To train the neural network, the scientists used augmentation, adding modified copies of core box photos to increase the amount of data. Synthetic images were created with a modified CutMix algorithm. The CutMix algorithm creates a new image from a pair of existing ones by randomly cutting out a piece of one image and inserting it into another. Because the scientists were interested specifically in recognizing rock columns, they adapted this method to a core image template, cutting and swapping pieces only from the areas where the core was located.
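The template-restricted CutMix idea described above can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `template_cutmix`, the `core_mask` argument (a boolean mask standing in for the core image template), and the `min_frac`/`max_frac` patch-size parameters are all assumptions made for illustration. The sketch only shows the general principle of pasting a random patch from one image into another while touching only the pixels where core is located.

```python
import numpy as np


def template_cutmix(img_a, img_b, core_mask, rng, min_frac=0.2, max_frac=0.6):
    """Paste a random rectangle of img_b into img_a, limited to the core area.

    img_a, img_b : arrays of shape (H, W, C) holding two core box photos.
    core_mask    : boolean array (H, W), True where the (hypothetical)
                   template says core is located.
    Returns the mixed image and the boolean mask of replaced pixels.
    """
    if img_a.shape != img_b.shape or img_a.shape[:2] != core_mask.shape:
        raise ValueError("images and mask must share height and width")

    ys, xs = np.nonzero(core_mask)
    if ys.size == 0:
        # No core area in the template: nothing to swap.
        return img_a.copy(), np.zeros_like(core_mask)

    # Bounding box of the core area defined by the template.
    y0, y1 = ys.min(), ys.max() + 1
    x0, x1 = xs.min(), xs.max() + 1

    # Random patch size, as a fraction of the bounding box (an assumption).
    ph = max(1, int((y1 - y0) * rng.uniform(min_frac, max_frac)))
    pw = max(1, int((x1 - x0) * rng.uniform(min_frac, max_frac)))

    # Random patch position inside the bounding box.
    top = rng.integers(y0, y1 - ph + 1)
    left = rng.integers(x0, x1 - pw + 1)

    # Keep only the patch pixels that also belong to the core area.
    patch = np.zeros_like(core_mask)
    patch[top:top + ph, left:left + pw] = True
    patch &= core_mask

    mixed = img_a.copy()
    mixed[patch] = img_b[patch]
    return mixed, patch


# Usage with synthetic data: two flat "photos" and a rectangular core region.
rng = np.random.default_rng(0)
box_a = np.zeros((60, 100, 3), dtype=np.uint8)
box_b = np.full((60, 100, 3), 255, dtype=np.uint8)
template = np.zeros((60, 100), dtype=bool)
template[10:50, 20:80] = True  # assumed location of the core in the box

augmented, replaced = template_cutmix(box_a, box_b, template, rng)
```

In this sketch, pixels outside the template (the box walls and background) are never modified, so the box boundaries stay the same while the rock inside varies.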

Details of the method are described in a paper published in Computers & Geosciences.

“Core boxes photographed in the same field may be visually very similar, but the rocks may differ. If rock from another box is virtually placed in the same box, the network can confuse the core area with the box boundaries due to the similarity in color,” said lead author Evgeny Baraboshkin. “Augmentation helps the network to focus on other characteristics besides color and shape, such as structure and texture.”

In their study, the scientists described and tested the new method and compared the efficiency of the algorithm trained on the original data alone with that of the algorithm trained on the original data mixed with augmented data. It turned out that, thanks to augmentation, the algorithm learns to detect rock columns efficiently and accurately in most new images. This automated approach speeds up the processing of one core box by up to 20 times. In addition, the method made it possible to automatically determine the depths corresponding to each column. Previously, this required measuring with a ruler.

“Interestingly, when we added augmented data into the usual data set, the neural network learned to recognize pieces of paper with inscriptions on the columns, although, in the original data set, they were also labeled as core,” Baraboshkin said. “The algorithm detected an error in the initial markup and avoided it in the future.”

The scientists introduced the method as one of the stages of analysis in the DeepCore system, a software product they created for automatic core description from images. After extracting columns from images, the program determines the layer boundaries and rock types. At the same time, users can still adjust the results: if necessary, an expert can add types of rock or change layer boundaries.