AI/machine learning

Curiosity Meets Comprehension: How Energy LLM Is Bridging Knowledge Gaps

The energy-focused LLM project by Aramco Americas, SPE, and i2k Connect has entered the testing phase and is on track for licensing to operators later this year.

Aramco, SPE, and i2k Connect leaders launched the ELLM at the Aramco Agora House during CERAWeek by S&P Global in Houston.
Source: Aramco

An energy-specific large language model (LLM) has entered the testing phase, bringing the democratization of industry knowledge one step closer to reality.

Aramco Americas, the Society of Petroleum Engineers (SPE), and i2k Connect have been working to preserve decades of experience, make it easier to search and understand, and remove the friction between curiosity and comprehension, Umar A. Al-Nahdi, petroleum engineering application services department director at Aramco, said during an 11 March event announcing the project has reached the testing phase.

“By leveraging generative AI (artificial intelligence) aligned with cutting-edge large language modeling and language processing, this model aims to make energy knowledge more accessible than ever before. It seeks to empower engineers and industry professionals to extract insights from data sets in seconds, breaking down barriers to learning and accelerating decision making,” he said during the event held at Aramco’s Agora House at CERAWeek by S&P Global.

The testing phase comes about a year and a half after the three organizations signed a memorandum of understanding (MOU) at SPE’s 2023 Annual Technical Conference and Exhibition to explore the development of an AI tool that could answer technical questions specific to the energy industry.

Al-Nahdi believes AI success “requires collaboration, not isolation,” a principle he said lies at the heart of the achievement.

“By partnering with SPE and leveraging the expertise of i2k Connect, we created a model that will benefit the entire energy sector, not just one company,” he said. “Together we are not launching a model. We are unlocking a new era of learning, inspiration, and progress for the energy sector.”

The ELLM Project

Aramco Americas financed the project, and SPE provided content for training from the OnePetro database, which is considered the definitive resource on the upstream oil and gas industry.

SPE manages the multisociety library and contributes about 50% of the technical content maintained in it. Houston-based software company i2k Connect executed the training program, fine-tuning Meta’s Llama 3.3 foundation model into an energy-specific LLM. The resulting Energy Large Language Model (ELLM) has been donated to SPE, and Aramco Americas is licensed to use it.

John Boden, senior vice president of corporate development at i2k Connect, said during the event that various sizes of the ELLM have been fine-tuned, ranging from a 1-billion-parameter version to a 70-billion-parameter version. He noted that the 8-billion-parameter model, which can run on a laptop, and the 70-billion-parameter version, which runs on a graphics processing unit (GPU) server, likely hit the sweet spot for applications in the energy industry.
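The hardware split Boden describes follows directly from the memory a model’s weights occupy. A minimal back-of-the-envelope sketch, using illustrative figures rather than measurements of the ELLM itself, shows why an 8-billion-parameter model fits on a laptop while a 70-billion-parameter one needs a GPU server:

```python
# Approximate memory needed just to hold model weights, ignoring
# activation and KV-cache overhead. Figures are illustrative.

def weight_memory_gb(n_params_billion: float, bytes_per_param: float) -> float:
    """Weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return n_params_billion * 1e9 * bytes_per_param / 1e9

for size in (1, 8, 70):
    fp16 = weight_memory_gb(size, 2)    # 16-bit weights
    q4 = weight_memory_gb(size, 0.5)    # 4-bit quantization
    print(f"{size}B params: ~{fp16:.0f} GB at fp16, ~{q4:.0f} GB at 4-bit")
```

At 16-bit precision, 8 billion parameters need roughly 16 GB, within reach of a well-equipped laptop once quantized, while 70 billion parameters need about 140 GB, hence the server-class GPUs.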

2025 SPE President Olivier Houzé said during the event that SPE plans to license the model to other energy companies, which could add their own content and further fine-tune the ELLM, resulting in an enhanced version dubbed ELLM+.

“This is something that has been on the fast track,” he said.

After getting the ball rolling on the project through the MOU in October 2023, the partners signed the contract in October 2024. By January 2025, ELLM v. 1.0 was released, with internal testing following.

“Now the testing is being extended to the (SPE) membership at large,” Houzé said.

Boden said the current testing opens the ELLM up to assessment by a broader audience to evaluate its answers to questions in terms of content and structure. SPE members can test the ELLM after logging into their SPE account and choosing one of eight disciplines to test. Testing pits the ELLM answers against the foundation Llama 3.3 model, he said.

Later this year, Houzé said, the 70-billion-parameter version is expected to be licensed to operators.

And while the ELLM can answer questions, it does not yet cite its source material, he noted.

“In order to deploy a solution that would be useful to our members, we will need to complement that with a retrieval-augmented generation (RAG) capability that would reference papers relevant to the question,” Houzé said.

He added that the energy RAG (ERAG) could be available either this year or next. Additionally, there’s a possibility of deploying an 8-billion-parameter version of the ELLM to SPE members in 2026 or 2027.
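The retrieval step Houzé describes can be sketched in miniature: score a corpus of papers against the user’s question, then prepend the best matches to the prompt so the model can ground and reference its answer. The sketch below uses simple keyword overlap on two hypothetical abstracts; a production system such as the planned ERAG would use vector embeddings over the OnePetro library instead, so everything here is illustrative.

```python
import re

def tokenize(text: str) -> set[str]:
    """Lowercase word tokens, punctuation stripped."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(question: str, corpus: dict[str, str], k: int = 2) -> list[str]:
    """Titles of the k abstracts sharing the most words with the question."""
    q = tokenize(question)
    ranked = sorted(corpus, key=lambda t: len(q & tokenize(corpus[t])), reverse=True)
    return ranked[:k]

def build_prompt(question: str, corpus: dict[str, str]) -> str:
    """Assemble a prompt that grounds the answer in retrieved references."""
    context = "\n".join(f"[{t}] {corpus[t]}" for t in retrieve(question, corpus))
    return f"Answer using these references:\n{context}\n\nQuestion: {question}"

# Hypothetical corpus entries for demonstration only.
corpus = {
    "Sand Control Methods": "gravel pack and screen selection for sand control in wells",
    "Waterflood Design": "pattern selection and injection rates for waterflood projects",
}
print(build_prompt("How do I select screens for sand control?", corpus))
```

Because the retrieved titles travel with the context, the generated answer can point back to the papers it drew on, which is precisely the capability the ELLM currently lacks on its own.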

Al-Nahdi said, “This is just the beginning of the journey that will redefine how we interact with knowledge.”