AI/machine learning

The SME Paradox: Where Do Experts Come From When AI Compresses the Apprenticeship?

This commentary by the chair of the SPE Data Science and Engineering Analytics Technical Section examines how AI is reshaping petroleum engineering careers, highlighting growing risks to entry‑level training, judgment development, and the future pipeline of subject-matter experts in high‑consequence industries.

The answer is not anti-AI. It is more demanding than that. As AI makes technical work faster, broader, and more accessible, the differentiating skill in petroleum engineering will shift upward.
Source: Userba011d64_201/Getty Images.

I have been struggling with a question since the advent of generative AI and, more specifically, agentic AI. I see the potential of AI, especially agentic AI, to automate repetitive or boring tasks while qualified humans act on the output. I genuinely believe in this possibility and, to some extent, that human work will evolve along with it. However, those same repetitive or boring tasks are useful in training the next generation of engineers for the real world; for example, running a reservoir simulation, history matching a model, or performing decline curve analysis teaches young engineers how to spot problems, catch spurious or missing data, and make judgments with less data.
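
To make that concrete, here is a minimal Python sketch, illustrative only and not drawn from any production workflow, of the kind of entry-level task in question: fitting an Arps hyperbolic decline to noisy monthly rates and flagging a suspect data point by its residual. Working through this by hand is exactly where a young engineer learns what spurious data looks like.

```python
import numpy as np
from scipy.optimize import curve_fit

def arps_hyperbolic(t, qi, di, b):
    """Arps hyperbolic decline: q(t) = qi / (1 + b*di*t)**(1/b)."""
    return qi / (1.0 + b * di * t) ** (1.0 / b)

# Synthetic data: 36 months of rates with noise and one bad allocation month.
rng = np.random.default_rng(42)
t = np.arange(36.0)                                   # months on production
q_true = arps_hyperbolic(t, qi=1000.0, di=0.10, b=0.8)
q_obs = q_true * (1.0 + 0.05 * rng.standard_normal(t.size))
q_obs[20] *= 0.4                                      # spurious point: misallocated volumes

# Fit the model; bounds keep b in the physically meaningful 0 < b <= 2 range.
popt, _ = curve_fit(arps_hyperbolic, t, q_obs,
                    p0=[900.0, 0.05, 0.5],
                    bounds=([1.0, 1e-4, 0.01], [1e5, 1.0, 2.0]))

# The judgment step: flag months whose residuals fall far outside the noise band.
residuals = (q_obs - arps_hyperbolic(t, *popt)) / arps_hyperbolic(t, *popt)
suspect = np.where(np.abs(residuals) > 3 * np.std(residuals))[0]
print(f"fitted qi={popt[0]:.0f}, Di={popt[1]:.3f}/mo, b={popt[2]:.2f}")
print("months to re-check against field data:", suspect.tolist())
```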

I started with a simple question and ended up somewhere I did not expect: If AI compresses the entry-level work that is used to train engineers, and if our industry still needs subject-matter experts (SMEs) to challenge, validate, and ultimately own high-consequence decisions, where do those experts come from 10 years from now?

That question matters because upstream oil and gas is being squeezed from both ends. At the top, the industry continues to face an aging workforce, renewed competition for experienced talent, and persistent concern about knowledge transfer. At the bottom, broader labor-market evidence suggests AI is beginning to hit early-career work first in the most exposed occupations, as reported in the World Economic Forum’s Future of Jobs Report 2025. Those studies are not specific to petroleum engineering, but the implication is hard to ignore: if apprenticeship work is compressed at the same time expertise is thinning, the pipeline to future SMEs does not merely get narrower. It gets structurally weaker.

That is the paradox.

We keep saying we need humans in the loop. But that answer is incomplete. The harder questions are what kind of humans, developed through what sequence of experiences, and whether they will still be capable of being in the loop when AI does more and more of the technical work.

The Double Squeeze

The first squeeze is demographic. Recent reporting in the energy sector continues to point to a familiar problem: an aging workforce, workforce mobility, and a renewed struggle to attract and retain the next generation of technical talent. This is not a new concern, but it is becoming more consequential precisely when technical work itself is starting to change.

The second squeeze is technological. The World Economic Forum’s Future of Jobs Report 2025 found that 40% of employers expect workforce reductions where AI can automate tasks, even as they also expect large-scale reskilling needs. Stanford Digital Economy Lab researchers found that employment declines since late 2022 have been concentrated among workers ages 22 to 25 in the most AI-exposed occupations.

Again, this is not direct evidence that petroleum engineering entry roles are disappearing at the same rate or in the same pattern. But it is strong enough to force the question. If AI compresses the lower end of the technical ladder, and if the upper end of the ladder is already under pressure, then the industry cannot assume expertise will reproduce itself the way it did before.

That assumption is where the real risk begins.

The Recursion: What Do Humans Actually Do?

The standard response is simple: “We need humans in the loop.” This is true, but doing what, exactly? One useful way to think about the shift is as a five-layer progression. This is not a rigid maturity model. It is a way to describe how human value changes as AI moves deeper into technical workflows.

Layer 1: Humans tell AI what to do. AI does the work.

This is where many organizations already are or are trying to get to. AI-assisted tools can accelerate technical analysis, draft technical plans, summarize subsurface information, speed up model setup, and generate plausible recommendations across workflows by running multiple tools. Humans still tell AI what to do and only the task execution is automated.

Layer 2: AI generates analytical output. Humans learn to check AI instead of merely using it.

This is the critical layer, and it is where the future either holds or breaks. AI generates automatic recommendations and outputs. Engineers validate outputs against physics, heuristics, analogues, operating constraints, and field reality. They ask whether the answer violates bounds, whether the assumptions actually hold, whether the confidence is earned or just well formatted. This is where the habit of challenge forms or, more importantly, does not.

Layer 3: AI flags what it cannot reliably validate.

As systems mature, they will likely become better at exposing their own blind spots: missing analogues, unconstrained assumptions, out-of-distribution conditions, weak boundary conditions, hidden uncertainty. But that only matters if humans have been trained not to sleepwalk past warnings. Research also suggests that AI should not review its own work. In that scenario, multi-model workflows will emerge: one AI provides the recommendation, and other AIs review it.
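
As a purely hypothetical sketch of that pattern, the Python below routes a draft recommendation through independent reviewers and forces human sign-off. The model calls are canned stand-ins, since the point here is the workflow shape, not any particular provider or API.

```python
from typing import Callable

def recommend_with_review(task: str,
                          drafter: Callable[[str], str],
                          reviewers: dict[str, Callable[[str], str]]) -> dict:
    """One model drafts; independent models critique; a human must sign off."""
    draft = drafter(f"Recommend an approach for: {task}")
    critiques = {name: review(draft) for name, review in reviewers.items()}
    # Nothing is auto-accepted: the human sees the draft and every critique.
    return {"draft": draft, "critiques": critiques, "requires_human_signoff": True}

# Canned stand-ins so the sketch runs; a real system would wire each of these
# to a different model so no AI reviews its own work.
drafter = lambda prompt: "Infill drilling recommended based on decline trends."
reviewers = {
    "reviewer-1": lambda d: "Check: decline fit assumes boundary-dominated flow.",
    "reviewer-2": lambda d: "Check: no analogue wells cited; uncertainty unbounded.",
}
print(recommend_with_review("low-rate well", drafter, reviewers))
```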

Layer 4: Humans decide under uncertainty.

At this point the work is technically strong, much of the checking is done, and the uncertainty is visible. The remaining task is judgment. Do we proceed? Delay? Spend more? Redesign? Accept the risk? No model can fully absorb that burden, because it is not just technical. It is organizational, financial, and moral.

Layer 5: Humans bear consequences.

This is the floor that does not move. In real life, someone still owns the reserves number, the design basis, the operating envelope, the capital decision. AI can inform. It can accelerate. It can even warn. But it does not bear the consequence of being wrong.

The important point is not whether these layers unfold neatly in sequence. They will not. The important point is that if layer two is weak, everything above it becomes fragile. If engineers lose the reflex to question, then better tools do not create better judgment. They simply create faster confidence.

The Cognitive Surrender Problem

That is why this is bigger than a staffing issue. The real danger is not only that AI changes the pipeline. The danger is that the humans who remain lose the very habit that makes them worth keeping.

Microsoft Research has framed this as a problem of appropriate reliance: people need to accept correct AI output and reject incorrect output, yet overreliance is common and often hard to detect. Its research on knowledge workers’ use of generative AI suggests that cognitive effort is shifting away from direct production and toward oversight, verification, and judgment. At the same time, recent studies, including a 2025 study published by Carnegie Mellon University, continue to show that large language models can be overconfident when wrong and can present hallucinated outputs with high certainty.

Source: Figure created by the author.

Put those together and the failure mode becomes obvious: overconfident systems meet increasingly uncritical users.

In upstream settings, that failure may not arrive as a spectacular single collapse. More often it will arrive quietly. A forecast looks reasonable. A model output is internally clean. A recommendation is accepted because it is fast, polished, and usually right. Then the hidden flaw reveals itself later, when the waterflood underperforms, the operating window tightens, the contingency was never truly bounded, or the capital has already been committed.

The most dangerous thing about this pattern is that it trains itself. Every time an engineer accepts an AI output without really interrogating it, the threshold for questioning the next one gets a little higher. The discipline of challenge atrophies by repetition. That is not a software problem. That is a developmental problem.

The Compression Paradox

AI is compressing time to technical productivity. It can help younger engineers move faster, work broader, set up analyses sooner, and access methods that once required much more experience to apply. That is real, and it is valuable.

But it does not compress time to judgment in the same way.

You can speed up technical execution. You cannot fully speed up the experience of owning ambiguous decisions with incomplete information, conflicting incentives, operating consequences, and real accountability. You cannot shortcut the process of learning when a clean answer is still the wrong answer. You cannot automate the internal calibration that comes from being right sometimes, wrong sometimes, and accountable every time.

That is the paradox. The industry may soon produce technically productive engineers earlier than before, while still failing to produce judgment-ready engineers on the same timeline.

For decades, technical repetition and judgment development were coupled. AI may break that coupling. If it does, then the core leadership task is no longer just technical training. It is the deliberate design of consequence-bearing experience.

Beyond the T-Shape: Response Surface Careers

Most workforce conversations still default to “T-shaped skills”: depth in one area, breadth across others.

That idea still has value, but AI has weakened its explanatory power. Breadth is getting cheaper. A capable engineer with good tools can move across reservoir, drilling, facilities, economics, and data much more fluidly than before. The scarce thing is no longer just what you know. It is what kinds of situations you have personally navigated and owned.

This does not mean specialization stops mattering. David Epstein’s Range makes a useful distinction here: in complex, uncertain environments, broad exposure often improves pattern recognition, transfer, and adaptability. But that breadth is most powerful when it sits on top of real depth. The problem is not specialization itself. The problem is mistaking narrow repetition for sufficient development in a world where AI increasingly makes narrow technical execution easier.

A better metaphor may be one petroleum engineers already understand: response surface methodology.

In experimental design, response surface methodology maps behavior across a multidimensional factor space. Where you have data, you can interpolate. Where you do not, you are extrapolating.
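
As a minimal sketch of that distinction, the Python below fits a second-order response surface to a small designed sample and flags queries that fall outside the sampled region. The simple bounding-box test stands in for a proper convex-hull check, and all names and data are illustrative.

```python
import numpy as np

# Two design factors sampled only in a limited region of the space.
rng = np.random.default_rng(0)
X = rng.uniform(low=[0.0, 0.0], high=[1.0, 1.0], size=(30, 2))
y = (3.0 + 2.0 * X[:, 0] - 1.5 * X[:, 1]
     + X[:, 0] * X[:, 1] + 0.1 * rng.standard_normal(30))

def quadratic_design(X):
    """Full second-order design matrix: 1, x1, x2, x1*x2, x1^2, x2^2."""
    x1, x2 = X[:, 0], X[:, 1]
    return np.column_stack([np.ones(len(X)), x1, x2, x1 * x2, x1**2, x2**2])

coef, *_ = np.linalg.lstsq(quadratic_design(X), y, rcond=None)

def predict(X_new, X_train=X):
    """Predict, and flag extrapolation outside the sampled bounding box."""
    y_hat = quadratic_design(X_new) @ coef
    inside = np.all((X_new >= X_train.min(axis=0)) &
                    (X_new <= X_train.max(axis=0)), axis=1)
    return y_hat, inside

queries = np.array([[0.5, 0.5],    # interior: interpolation
                    [2.0, 2.0]])   # far outside the data: extrapolation
y_hat, inside = predict(queries)
for q, yh, ok in zip(queries, y_hat, inside):
    print(q, f"prediction={yh:.2f}", "interpolating" if ok else "EXTRAPOLATING")
```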

A career works the same way. Every meaningful, high-consequence decision you have personally owned is a data point on your surface. Over time, that surface gets built across at least four dimensions.

1. Domain diversity: the range of technical contexts you have actually worked across
2. Stake magnitude: the size of the consequences attached to your decisions
3. Ambiguity level: how incomplete, novel, or uncertain the situation was
4. Human complexity: how many partners, disciplines, regulators, and conflicting interests were involved
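
As a purely hypothetical sketch (this is not the logic of any particular tool), the Python below treats each owned decision as a point on these four axes and reports where coverage is thin. The scoring scales and example decisions are invented for illustration.

```python
from dataclasses import dataclass

@dataclass
class Decision:
    domain: str            # technical context, e.g., "reservoir", "facilities"
    stake: int             # 1 (routine) .. 5 (field-development scale)
    ambiguity: int         # 1 (well-posed) .. 5 (novel, data-poor)
    human_complexity: int  # 1 (solo) .. 5 (partners, regulators, conflicts)

career = [
    Decision("reservoir", 2, 2, 1),
    Decision("reservoir", 3, 2, 2),
    Decision("reservoir", 3, 3, 2),
    Decision("drilling", 2, 1, 1),
]

domains = {d.domain for d in career}
print(f"domain diversity: {len(domains)} ({', '.join(sorted(domains))})")
for axis in ("stake", "ambiguity", "human_complexity"):
    values = [getattr(d, axis) for d in career]
    print(f"{axis}: max reached {max(values)} of 5, "
          f"levels covered {sorted(set(values))}")
# Repetition in one corner (many reservoir decisions at stake 2-3) deepens the
# surface there but leaves high-stake, high-ambiguity regions unexplored.
```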

A T-shape measures what you can reach. A response surface measures where your judgment is actually grounded.

The goal is not to replace specialization with generalism. It is to develop specialists whose judgment has been widened by range. And as any engineer knows, diverse data points matter more than repeated points from the same narrow corner of the design space.

To make this framework practical rather than purely conceptual, I built a free public tool that analyzes a resume or exported LinkedIn profile through these same dimensions and generates a career-surface report. It is not meant to define a career with false precision. It is meant to make the pattern visible: where your experience is broad, where it is narrow, and where judgment may be grounded more in repetition than in true consequence range. The tool is available at career.pushpesh.ai. For best results, paste your resume into the box, upload it as a PDF, or upload your LinkedIn profile saved as a PDF (LinkedIn’s desktop site lets you save your profile as a PDF).

Simulated Surfaces vs. Experimental Surfaces

But even the four axes listed above miss something important.

Every engineer understands the difference between a response surface built from simulated data and one built from actual experiments. Simulated surfaces are smooth, coherent, and elegant. Experimental surfaces are messier, but they contain the artifacts, noise, interactions, and boundary conditions that the model never fully captured.

Careers are similar. An engineer with field and operations exposure builds an experimental surface. They have seen not only the model, but the exception. Not only the procedure, but the workaround. Not only the recommendation, but the hesitation before execution. They know what the system feels like when reality starts to drift away from the neat representation on the screen. That matters because physical experience creates resistance to false confidence. It gives engineers something to push against when AI output looks polished but feels incomplete.

An engineer whose experience is mostly screen-based can still be broad, smart, and technically excellent. But if their surface is built mostly from representations rather than operations, it may be smoother than reality. And smoother than reality is exactly how organizations get surprised.

Field experience is not just an early-career checkbox. It is part of what makes later judgment trustworthy.

What Students and Professionals Should Do Now

1. Do not confuse AI-assisted speed with real readiness. Use AI aggressively to learn faster, but do not outsource the struggle that builds judgment. When AI gives you an answer, force yourself to ask what assumptions it made, what data it may be missing, and what field reality could invalidate it.

2. Build your response surface deliberately, not accidentally. Do not optimize only for the next promotion or the most convenient role. Seek experiences that widen at least one of the four dimensions: domain diversity, stake magnitude, ambiguity level, or human complexity. If every role looks similar, your surface may be getting deeper but not wider.

3. Get closer to reality than the screen. If you are early in your career, field and operations exposure is not optional seasoning. It is part of what makes later judgment trustworthy. Volunteer for startup support, operations reviews, troubleshooting, site visits, or commissioning work. Build a physical reference frame before AI makes screen-based confidence too easy.

4. Keep one area of real depth. Breadth matters more in an AI-enabled world, but depth still matters. You need at least one domain where your understanding goes below the interface, below the workflow, and below the recommendation. That depth gives you the confidence to challenge what looks polished but is wrong.

What Managers/People Leaders Should Do Now

1. Map your team’s response surface. Where is your organization’s judgment actually grounded? In how many basins, asset types, operating modes, regulatory environments, and consequence settings have your people carried real responsibility? Where are the blind spots? How many of your strongest technical people have meaningful field exposure rather than only screen exposure?

2. Track decision quality, not just outcomes. Engineering organizations need a disciplined way to review high-consequence decisions, especially where AI-assisted analysis was involved. The question is not only whether the outcome was good. The question is whether the reasoning was sound, whether the assumptions were challenged, whether uncertainty was recognized, and whether someone caught what the model missed.

3. Rotate your best people deliberately. Not just across functions, but across realities. Send screen-based experts into operations. Give data scientists exposure to field context. Put engineers closer to the consequences of the assumptions they make. Judgment grows where abstraction meets reality.

The Real Skill of 2036

We started with a technical question: Where do SMEs come from if AI compresses the apprenticeship?

The answer is not anti-AI. It demands more than that. As AI makes technical work faster, broader, and more accessible, the differentiating skill in petroleum engineering will shift upward. The value will not lie in producing answers quickly, but in knowing when that answer is wrong, incomplete, misapplied, or not yet safe to act on.

That is the skill AI does not make free. The real skill of 2036 is not just technical fluency, but accountable judgment built through operational exposure, ambiguity, consequence, and the repeated discipline of challenge.

The question for leaders is simple. Are you building organizations that widen your people’s response surfaces and anchor them in reality? Or are you optimizing for the very skills AI is in the process of commoditizing, while the one capability that keeps engineers irreplaceable quietly atrophies every time they click “accept” without really reading?