Inherent Flaws in Risk Matrices May Preclude Them From Being Best Practices
Risk matrices (RMs) are among the more commonly used tools for risk prioritization and management in the oil and gas industry. RMs are recommended by several influential standardization bodies, and a literature search found more than 100 papers that document the application of RMs in a risk-management context. This paper illustrates and discusses inherent flaws in RMs and their potential effect on risk prioritization and mitigation, addressing several previously undocumented RM flaws.
In the oil and gas industry, risk-intensive decisions are made daily. In their attempt to implement a sound and effective risk-management culture, many companies use RMs and specify this in “best practice” documents. Furthermore, RMs are recommended in numerous international and national standards such as those from the International Organization for Standardization (ISO); NORSOK, the Norwegian standards organization; and the American Petroleum Institute (API). The popularity of RMs has been attributed in part to their visual appeal, which is claimed to improve communications.
Despite these claimed advantages, the authors were unable to find instances of published scientific studies demonstrating that RMs improve risk-management decisions. However, several studies indicate the opposite—that RMs are conceptually and fundamentally flawed.
The complete paper summarizes the known flaws of RMs, identifies several previously undiscussed problems with RMs, and illustrates that these shortcomings can be seen in SPE papers that either demonstrate or recommend the use of RMs.
An RM is a graphical presentation of the likelihood, or probability, of an outcome and the consequence should that outcome occur. Consequences are often defined in monetary terms. RMs, as their name implies, tend to be focused on outcomes that could result in a loss rather than a gain. The purported objective of the RM is to prioritize risks and risk-mitigation actions.
Within the context of RMs, “risk” is defined as consequence multiplied by its probability, which yields the expected downside consequence or the expected loss.
The consequences and probabilities in an RM are expressed as a range. For example, the first consequence (loss) category might be <USD 100,000, the second might be USD 100,000–250,000, and so on. The first probability range might be ≤1%, the second might be between 1 and 5%, and so on. A verbal label and a score are also assigned to each range. (Some RMs use these instead of a quantitative range.) For example, probabilities from 10 to 20% might be labeled as “seldom” and assigned a score of 4. Probabilities greater than 40% might be termed “likely” and given a score of 6. Consequences (losses) from USD 5 million to 20 million might be termed “severe” and given a score of 5; losses above USD 20 million might be labeled as “catastrophic” and given a score of 6.
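The range-to-label mapping just described can be sketched as a simple lookup. The boundaries for "seldom" (10 to 20%, score 4) and "likely" (greater than 40%, score 6) come from the example above; the intermediate labels and boundaries are filler assumptions for illustration only.

```python
# Map a probability to an RM category label and score. Only the
# "seldom" and "likely" entries come from the text; the remaining
# labels and boundaries are hypothetical.

def categorize(value, boundaries, labels, scores):
    """Return (label, score) for the first range containing value."""
    for upper, label, score in zip(boundaries, labels, scores):
        if value <= upper:
            return label, score
    return labels[-1], scores[-1]  # above the last boundary

prob_bounds = [0.01, 0.05, 0.10, 0.20, 0.40]
prob_labels = ["remote", "unlikely", "occasional", "seldom", "frequent", "likely"]
prob_scores = [1, 2, 3, 4, 5, 6]

print(categorize(0.15, prob_bounds, prob_labels, prob_scores))  # ('seldom', 4)
print(categorize(0.55, prob_bounds, prob_labels, prob_scores))  # ('likely', 6)
```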
Such an RM would treat losses of USD 50 billion (on the scale of BP’s losses stemming from the Macondo blowout) or USD 20 million in the same way, despite the difference of three orders of magnitude. Because there is no scientific method of designing the ranges used in an RM, many practitioners simply use the ranges specified in their company’s best-practice documents.
The cells in RMs are generally colored green, yellow, and red. Green means “acceptable.” Yellow stands for “monitor, reduce if possible.” Red is “unacceptable, mitigation required.” Previous work has detailed the way in which the colors must be assigned if one seeks consistency in the ranking of risks. Most of the papers examined failed to assign colors in a logically consistent way. For example, some of the cells designated as red were “less risky” than some of the cells that were designated as yellow.
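The consistency check described above can be sketched as follows: if cells are ordered by their quantitative risk (here approximated as midpoint probability times midpoint loss), the colors should never regress from red back to yellow or green. The example cells are hypothetical.

```python
# Check whether an RM's coloring is consistent with quantitative risk:
# sorted by expected loss (midpoint probability x midpoint loss), the
# color rank must never decrease. Cell values are hypothetical.

def consistent_colors(cells):
    """cells: list of (prob_midpoint, loss_midpoint, color)."""
    rank = {"green": 0, "yellow": 1, "red": 2}
    risks = sorted((p * c, rank[color]) for p, c, color in cells)
    ordered = [color_rank for _, color_rank in risks]
    return all(a <= b for a, b in zip(ordered, ordered[1:]))

# A hypothetical pair of cells in which the yellow cell is riskier
# than the red one -- the inconsistency described in the text.
cells = [
    (0.05, 10e6, "red"),     # expected loss 0.5 million
    (0.30, 3e6,  "yellow"),  # expected loss 0.9 million
]
print(consistent_colors(cells))  # False
```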
Current Industry Practices
In order to use the RM for risk prioritization and communication, several steps must be carried out.
Step 1—Define Risk Criteria. This step determines the size of the RM and its number of colors. Although there is no technical reason for it, RMs are generally square. The most common size is five rows by five columns (i.e., a 5×5 matrix), but some companies use a 3×3 matrix and others an 8×8 matrix. Some companies choose to include more colors than the standard red, yellow, and green in their RMs.
Step 2—Define Risk Events. This step identifies the risk events. For example, drilling a particular hole section is an event for which one could identify all the possible downside outcomes.
Step 3—Consequence Estimation and Probability Assessment. This step estimates the consequence range of each outcome identified in Step 2 and assigns probabilities to each outcome. For example, the outcome of severe losses is registered, and the expected financial consequence is estimated to be from USD 1 million to 5 million. The chance of this occurring is estimated to be 40%.
Step 4—Risk Profile. This step positions each identified downside outcome in a cell in the RM.
Step 5—Rank and Prioritize. This step ranks and prioritizes the outcomes according to their risk score. Most companies use a risk-management policy in which all outcomes in the red area are unacceptable and thus must be mitigated.
The results of Steps 2 through 5 are often collectively called a “risk register,” and the information required is usually collected in a joint meeting with the key stakeholders from the operating company, service companies, partners, and others.
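Steps 2 through 5 can be sketched as a minimal risk register ranked by risk score (probability score times consequence score). The events, scores, and color thresholds are hypothetical, and real RMs color individual cells rather than score bands, so this is a simplification.

```python
# A minimal sketch of Steps 2 through 5: hypothetical risk events are
# scored, ranked by risk score, and colored by illustrative thresholds.

events = [
    # (event, probability score, consequence score) -- all hypothetical
    ("Stuck pipe while drilling the 12 1/4-in. section", 4, 4),
    ("Shallow-gas influx", 2, 6),
    ("Mud losses in a depleted zone", 3, 2),
]

def color(risk_score):
    if risk_score >= 15:
        return "red"      # unacceptable, mitigation required
    if risk_score >= 8:
        return "yellow"   # monitor, reduce if possible
    return "green"        # acceptable

register = sorted(
    ((event, p * c, color(p * c)) for event, p, c in events),
    key=lambda row: row[1],
    reverse=True,
)
for event, score, col in register:
    print(f"{score:>3}  {col:<7} {event}")
```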
Among the standards that are commonly used in the oil and gas industry are those of API, NORSOK, and ISO. All of these standards recommend RMs as an element of risk management.
API. API recommends RMs most notably for its risk-based-inspection technology. Risk-based inspection is a method to optimize inspection planning by generating a risk ranking for equipment and processes and, thus, a prioritization for inspecting the right equipment at the right time. API RP 581 specifies how to calculate the likelihoods and consequences to be used in the RMs.
NORSOK. The NORSOK standards were developed by the Norwegian petroleum industry to “ensure adequate safety, value adding, and cost effectiveness for petroleum industry developments and operations. Furthermore, NORSOK standards are as far as possible intended to replace oil company specifications and serve as references in the authority’s regulations.” NORSOK recommends the use of RMs for most of their risk-analysis illustrations.
ISO. The ISO standards influence risk-management practices not only in the oil and gas industry but in many others. In ISO 31000, the RM is known as a probability/consequence matrix. ISO 31000 also includes a table that summarizes the applicability of tools used for risk assessment.
Deficiencies of RMs
Several flaws are inherent to RMs. Some of them can be corrected, while others seem more problematic.
To locate papers that address or demonstrate the use of RMs, the authors searched the OnePetro database with the terms “risk matrix” and “risk matrices.” This returned 527 papers. Then, 120 papers published before 2000 were removed to focus the study on current practice. Of the remaining 407 papers, those that promote the use of RMs as a best practice and actually demonstrate RMs in the paper were selected, leaving 68 papers. Papers that presented the same example were then eliminated. In total, 30 papers were considered, covering a variety of practice areas (e.g., health/safety/environment, hazard analysis, inspection).
Known Deficiencies of RMs
For a discussion of known deficiencies of RMs—risk-acceptance inconsistency, range compression, centering bias, and category-definition bias—please see the complete paper.
Identification of Previously Unrecognized Deficiencies
This section discusses three RM flaws that had not been identified previously.
Ranking Is Arbitrary. Ranking Reversal. Lacking standards for how to use scores in RMs, two common practices have evolved: ascending scores, in which a higher score indicates a higher probability or more serious consequence, and descending scores, in which a lower score indicates a higher probability or more serious consequence.
Both ascending and descending scoring systems have been cited in the literature. In the 30 papers surveyed, five use the descending scoring system and the rest use ascending. This behavior demonstrates that RM rankings are arbitrary; whether something is ranked first or last, for example, depends on whether one creates an increasing or a decreasing scale.
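The reversal can be demonstrated with two hypothetical outcomes scored on a 5×5 RM. With ascending scores, a higher product means higher risk; with descending scores (here computed as 6 minus the ascending score), a lower product means higher risk. The two conventions disagree about which outcome is riskiest.

```python
# Ranking reversal between ascending and descending scoring on a
# hypothetical 5x5 RM. The outcome scores are illustrative only.

outcomes = {"A": (5, 2), "B": (4, 3)}  # (probability score, consequence score)

asc = {k: p * c for k, (p, c) in outcomes.items()}
desc = {k: (6 - p) * (6 - c) for k, (p, c) in outcomes.items()}

riskiest_asc = max(asc, key=asc.get)     # highest product is riskiest
riskiest_desc = min(desc, key=desc.get)  # lowest product is riskiest

print(asc)   # {'A': 10, 'B': 12}
print(desc)  # {'A': 4, 'B': 6}
print(riskiest_asc, riskiest_desc)  # the two conventions disagree
```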
Instability Because of Categorization. RMs categorize consequence and probability values, yet there are no well-established rules for how to conduct the categorization. One author recommended testing different categories because no single category breakdown is suitable for every consequence variable and probability within a given situation.
Following this recommendation, the authors attempted to find the best categories for an RM by examining the sensitivity of the risk ranking to changes in category definitions. To simplify this analysis, they introduced a multiplier n that determines the range of each category. The ranges of the first category were held fixed for both consequence and probability; for the categories that are not at the endpoints of the axes, n determines the start and end values of the range. The multiplier can then be varied to observe its effect on the risk ranking under both ascending and descending scores.
Except where consequence is in ascending order, the risk prioritization is a function of n. This is problematic because the resulting risk ranking is unstable in the sense that a small change in the choice of ranges can lead to a large change in risk prioritization. Thus, it is shown again that the guidance provided by RMs is arbitrary.
Relative Distance Is Distorted. Lie Factor. The difference in how risk is portrayed in the RM vs. the expected values can be quantified by use of the lie factor (LF).
The LF was devised to describe graphical representations of data that deviate from the principle that “the representation of numbers, as physically measured on the surface of the graphic itself, should be directly proportional to the quantities represented.” This maxim seems intuitive, yet it is difficult to apply to data that follow an exponential relationship, for example. Such cases often use log plots, in which the same transformation is applied to all the data. However, RMs distort the information they convey at different rates within the same graphic.
None of the 30 papers reviewed included enough quantitative information for the LF to be calculated. The authors defined the LF for an RM as the average of the LFs for all categories. An alternative might be to define it as the maximum LF for any category.
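One way to compute such an average LF for a single RM axis is sketched below, following Tufte's idea that LF equals the effect shown in the graphic divided by the effect in the data; the paper's exact formulation may differ, and the scores and category midpoints used here are hypothetical.

```python
# Sketch of an average lie factor (LF) for one axis of an RM. The
# "shown" effect of moving up one category is the relative change in
# score; the data effect is the relative change in category midpoints.
# Scores and midpoints are hypothetical.

def lie_factor(scores, midpoints):
    """Average LF over adjacent category pairs on one axis."""
    lfs = []
    for i in range(len(scores) - 1):
        shown = (scores[i + 1] - scores[i]) / scores[i]
        actual = (midpoints[i + 1] - midpoints[i]) / midpoints[i]
        lfs.append(shown / actual)
    return sum(lfs) / len(lfs)

# Scores 1-5 over consequence midpoints growing tenfold per category.
scores = [1, 2, 3, 4, 5]
midpoints = [50e3, 500e3, 5e6, 50e6, 500e6]

print(round(lie_factor(scores, midpoints), 3))  # far below 1: strong distortion
```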
Many proponents of RMs extol their visual appeal and the resulting alignment and clarity in understanding and communication. However, the commonly used scoring system distorts the scales and removes the proportionality of the input data. How can a method that distorts the information underlying an engineering decision in nonuniform and uncontrolled ways be argued to be an industry best practice? The burden rests squarely on those who recommend such methods to prove that these obvious inconsistencies improve decision making, as is often claimed, rather than impair it.
This article, written by Special Publications Editor Adam Wilson, contains highlights of paper SPE 166269, “The Risk of Using Risk Matrices,” by Philip Thomas, SPE, and Reidar B. Bratvold, SPE, University of Stavanger, and J. Eric Bickel, SPE, The University of Texas at Austin, prepared for the 2013 SPE Annual Technical Conference and Exhibition, New Orleans, 30 September–2 October. The paper was peer reviewed and published in the February 2014 Oil and Gas Facilities, p. 56.