Better Safety Performance Measures Can Lead to Change by Improving Conversations

This paper examines the use of injury rates as a key performance indicator (KPI). It argues that, as a KPI, injury-frequency rate is no longer a valid measure.

Fig. 1—The Swiss Cheese model.

For the last 40 years, the oil and gas industry has measured safety performance using injury-frequency rates. Industry thinking is based on the premise that, if we do not have injuries, then we are safe and, if we have injuries, we are not safe. This paper examines the fallacy of that premise and the use of injury rates as a key performance indicator (KPI). It argues that, as a KPI, injury-frequency rate is no longer a valid measure.

The Current Situation

As a KPI, injury-frequency rate has served the industry well. It has driven ownership of safety performance as a line responsibility, allowed senior executives to hold managers accountable for performance, forced leaders to notice injuries, and driven many improvements.

A graph showing performance over a 2-year period would be discussed at management meetings, reasons argued, and actions given to business unit leaders. The data could create a discussion along the lines of “Overall performance is clearly going in the wrong direction. We all need to be concerned.” Pointing to one cause would be difficult, and many theories would be put forward on the basis of this data.

Why Measuring Injury Rates Is Misleading

Research into accident causation has revealed much in the last 30 years. Earlier work resulted in the Swiss Cheese Model (Fig. 1 above), the Generic Error Modeling System (GEMS) (Fig. 2), and the Tripod Model of Accident Causation (Fig. 3). Two software-based products, Tripod Beta and Bow Tie analysis, have been produced from these models; both are now mainstream.

Fig. 2—GEMS model.


Fig. 3—The Tripod Model of Accident Causation.


Safety leaders no longer think that people are the only cause of accidents (i.e., stupid people doing stupid things). They understand that errors and violations are the product of systemic causes. Accidents happen because barriers fail. Barriers fail because of people’s action or inaction. People are generally trying to do a good job, but they are influenced by their environment. That working environment is created by the way the business is managed.

Accidents are complex events with multiple causes. Controls that fail can be a long distance from, and not related to, individuals who are injured. Normally, more than one control needs to fail before someone is injured. Often, those controls are put in place by different people at different times—an operator isolates equipment, a supervisor checks the isolation, and a technician works on the equipment.

Challenging the premise that the presence or absence of injury measures safety suggests a different view: a truer measure of safety is the presence and reliability of barriers (i.e., organizations that are equally safe would have equally reliable barriers). Clearly, this is an oversimplification, but it follows that, if the presence or absence of injury were a reliable way to measure safety, then business units with equally reliable barriers should have similar injury rates.

Comparing Simulations With the Real World

Injury-rate simulation models are very simple and do not take into account different reporting cultures or different risk levels. Nonetheless, they do demonstrate that, all other things being equal, the random effect indicated by the Swiss Cheese Model gives considerably different injury rates when only a few data points (typically fewer than 50 injuries) exist.

For instance, a UK business unit recently had one lost-time injury (LTI) in a year while completing 7 million man-hours, putting it in the top 25% of its peer group when benchmarking. The next year, it had 10 LTIs and was in the bottom 25% of its peer group in benchmarking exercises. Was it really 10 times more dangerous?
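The year-to-year swing described above is what pure chance produces at small counts. A minimal Poisson-process sketch makes this concrete (the rate of four expected LTIs per year is an illustrative assumption, not a figure from the paper): with an unchanged underlying risk, the simulated annual LTI count still scatters widely from seed to seed.

```python
import random

def simulate_lti_count(expected_ltis_per_year: float, seed: int) -> int:
    """Count injuries in one simulated year by summing exponential
    inter-arrival times until the year is used up -- the standard way
    to sample a Poisson process."""
    rng = random.Random(seed)
    t, count = 0.0, 0
    while True:
        t += rng.expovariate(expected_ltis_per_year)  # time to next injury
        if t > 1.0:  # past the end of the year
            return count
        count += 1

# Identical underlying risk every "year"; only the random draw differs.
counts = [simulate_lti_count(4.0, seed) for seed in range(10)]
print(counts)
print("min:", min(counts), "max:", max(counts))
```

With the same true rate, some simulated years look "top quartile" and others "bottom quartile", which is the paper's point: at low counts, benchmarking position largely reflects noise.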

That personal-injury-frequency rates are no longer a good indicator of overall safety performance should be a cause for celebration rather than concern. In the early 1900s, fatality rates were an indicator of safety; later, injury rates were used. The challenge is to enable decision-makers to see the limitations of lagging indicators.

How Injury Rates Create the Wrong Type of Conversations and Behaviors

Many examples exist of a run of injury-free days ending with a spate of injuries. For example, one organization had a truly dedicated managing director who was committed to safety. The company had achieved 3.4 million man-hours without an LTI, which, at the time, was a world-class achievement. One day, a worker dropped a heavy gate bolt on his foot on the wrong side of the steel toecap. It bruised his foot. His foot was still sore the next day, so he went to see the company doctor. The doctor told him to rest for 2 days while the bruising went down. This ended the world-class LTI achievement. The director was furious, and he called the chief medical officer, demanding, “Which of your stupid doctors gave him the time off?” The doctor said, in all honesty, “I thought he injured himself at home. I would have sent him back to work if I had realized it was going to be an LTI.”

This is just one example of how using injury rates can drive the wrong behavior from even the most committed and dedicated leaders.

What To Use Instead of Injury Rates

If injury rates can no longer be used, then what should be used? Many people will suggest observation rates, near-miss rates, and first-aid cases, but all of these measure outcomes. Safety is the outcome of a well-managed process; therefore, companies should look to measure the processes rather than the outcomes.

Research into accident causation provides a window of opportunity; both the Tripod Model of Accident Causation and the Bow Tie model provide a wealth of options to measure indicators that will create positive and valuable conversations at any level in the organization.

Measuring Other Events. Many organizations have tried to measure near-misses or incident potential. Both have value, but both can also have unintended consequences when used as KPIs. Setting targets for near-miss reporting in one organization encouraged more reporting but resulted in a competition between sites to enter the most reports in the system. The workers saw this as a worthless game, while the managers felt they were changing the culture. KPIs for event reporting can have value, but what is measured and how leaders discuss the KPI will drive the culture—and not always in the right direction.

Measuring Barrier Reliability. During a rebuild of Reading rail station in the UK, construction leaders around a table identified two to five controls for each major risk that they considered critical. They then set up an inspection system and reviewed barrier-reliability data as a positive KPI in management meetings. It changed the type of conversation around that table.

Measuring the Type of Human Failure. Moving back down the accident-causation chain, one could identify the type of human failure responsible and classify that against the GEMS model of slips, lapses, and types of violations. That could be recorded in the reporting system and used to drive the conversation in management meetings. Imagine a management conversation that revolved around the idea that most process-safety incidents have slips and lapses rather than violations of procedure as the immediate causes of barrier failure.

Measuring Activities To Maintain Barriers. Bow Tie analysis provides a pictorial view of controls and the activities necessary to keep those controls in place. This provides many opportunities to develop KPIs that measure the process rather than the outcome, such as whether safety-critical maintenance has been performed on time and the number of audits performed. Examples of this can be found in process-safety proactive measuring. For instance, one organization measured safety-critical maintenance inspections as part of its integrity program. Three classifications were used: overdue inspections (due but not completed), deferred inspections (due but then deferred following an assessment), and completed inspections. The initial response was a significant transfer from overdue inspections to deferred inspections. At first, the conversations that the indicator generated were about the validity of the indicator and not the structural issues that were preventing maintenance from being conducted. This example demonstrates the importance of engaging the key stakeholders before introducing an indicator to drive positive conversations.
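The three-way classification described above is straightforward to turn into a process KPI. A minimal sketch follows; the equipment names and proportions are illustrative, not data from the paper.

```python
from collections import Counter
from dataclasses import dataclass

@dataclass
class Inspection:
    item: str
    status: str  # "completed", "overdue", or "deferred"

def inspection_kpi(inspections):
    """Tally safety-critical inspections into the three classifications:
    completed, overdue (due but not done), and deferred (due but
    postponed following an assessment). Returns each share of the total."""
    counts = Counter(i.status for i in inspections)
    total = sum(counts.values())
    return {s: counts.get(s, 0) / total
            for s in ("completed", "overdue", "deferred")}

# Hypothetical inspection register for one reporting period.
sample = [
    Inspection("relief valve PSV-101", "completed"),
    Inspection("gas detector GD-07", "deferred"),
    Inspection("ESD valve XV-220", "completed"),
    Inspection("firewater pump P-31", "overdue"),
]
kpi = inspection_kpi(sample)
print(kpi)  # {'completed': 0.5, 'overdue': 0.25, 'deferred': 0.25}
```

Tracking the deferred share separately is what exposed the behavior the paper describes: a shrinking "overdue" number can simply mean work is being reclassified rather than done, which is exactly the conversation the indicator should provoke.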

Measuring Management Behaviors and Activities. During the UK railway upgrade mentioned earlier, a bonus structure was implemented on the basis of injury rates. In the first 2 years, the bonuses were never achieved. The bonus structure was changed to measure two critical management activities: safety tours and personal leadership of incident reviews. The project saw a drop in injuries, from approximately 80 per year to approximately 20, and that number stayed low for the remaining 5 years of the project, well below the targets set in the initial bonus structure. Did the changes made cause the decrease, or was it coincidence? No one can prove it either way, but the tone of the conversations between project managers and contractors around safety changed from “you better not have any more injuries” to “what is happening here and what are the people on the ground saying?” The conversations were deeper and more constructive, cooperative, and thoughtful.

This article, written by Special Publications Editor Adam Wilson, contains highlights of paper SPE 190663, “Building Better Performance Measures for Better Conversations To Provoke Change,” by A.D. Gower-Jones, W.T. Peuscher, J. Groeneweg, SPE, S. King, and M. Taylor, Tripod Foundation, prepared for the 2018 SPE International Conference on Health, Safety, Security, Environment, and Social Responsibility, Abu Dhabi, 16–18 April. The paper has not been peer reviewed.