Common Misconceptions and Pitfalls in Alarm Management

Katherine Persac, ProSys

To go along with an article we contributed to in this month’s Control Talk column in Control Global magazine, I thought I would elaborate on some common alarm management mistakes. There is a lot of pressure to “do something” about alarm management: from your boss, OSHA, corporate edicts, and maybe just because it’s a good idea. People fall into a lot of misconceptions and pitfalls because of this pressure. I borrowed a little info from presentations and papers written by Les Jensen and Darwin Logerot, two alarm management experts whom I have the privilege to know and call colleagues.

1. Making alarm reduction the goal

There is an increasingly popular notion that the most practical and acceptable solution to the problems of alarms is to minimize the number of them. This approach most likely has its genesis in the recognition of undesirable alarm floods during process upsets. The full implementation of this solution is indeed simple – zero alarms. This may just be a poor choice of words, since no one can believe that such a solution is valid. However, the mindset is potentially damaging.

Realistically, the optimization of alarms on most existing DCSs will indeed result in fewer configured alarms. But there is a significant difference between a result and an objective. Alarm management is not a question of how many alarms are configured. It is a question of the quality of the active alarms. The quality of an alarm is a measure of its relevance. If an alarm is not relevant, then the quality is negative. It may be possible to make the alarm more relevant through dynamic management techniques. Otherwise, the alarm could be a candidate to be eliminated, at least for the process state or state transition being analyzed. But any relevant alarm must never be eliminated in an effort to reduce the number of configured alarms.

An optimal alarm management solution requires an acknowledgement of the dynamic nature of the problem and an understanding of what an alarm should be. A proper alarm management process must lead to alarms that are relevant to the current situation.

All too often, the results of the alarm management process are reported and evaluated in terms of valueless statistics. There are innumerable examples similar to “50% of the alarms were eliminated through the alarm management design process.” But what is one to infer from such a statistic? Is it appropriate to conclude that the next process incident, whatever it may be, will result in an alarm flood only half as large as would otherwise be expected? Even if true, would it really be a significant improvement for an incident related alarm flood to be reduced from 500 to 250 alarms?
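To put the 500-versus-250 question in numbers, here is a minimal Python sketch. It assumes a hypothetical incident in which the alarms arrive one second apart, and it uses the commonly cited benchmark of roughly 10 alarms in 10 minutes as the point where a flood begins. The timing, the data, and the names `FLOOD_THRESHOLD` and `peak_alarms_in_window` are all assumptions for illustration, not figures from a real event.

```python
FLOOD_THRESHOLD = 10   # assumed benchmark: alarms per 10-minute window
WINDOW_S = 600         # 10 minutes, in seconds


def peak_alarms_in_window(times_s, window_s=WINDOW_S):
    """Return the largest number of alarms inside any span shorter than window_s."""
    times = sorted(times_s)
    peak = 0
    left = 0
    for right, t in enumerate(times):
        # Slide the left edge forward until the span is shorter than window_s.
        while t - times[left] >= window_s:
            left += 1
        peak = max(peak, right - left + 1)
    return peak


# Hypothetical incident: 500 alarms, one per second.
before = list(range(500))
# Hypothetical "50% reduction": every other alarm removed.
after = before[::2]

for label, times in (("before", before), ("after", after)):
    peak = peak_alarms_in_window(times)
    status = "flood" if peak > FLOOD_THRESHOLD else "manageable"
    print(f"{label}: {peak} alarms in the worst 10 minutes ({status})")
```

Under these assumptions both cases are still floods by a wide margin, which is the point: the percentage eliminated says nothing about whether the operator can cope.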

2. Using “check the box” mentality in project execution

Alarm management is a “top down” requirement. Okay, I’m done. I have software that reports my metrics and a database to keep all of my alarm data in. We did a rationalization a few years ago, so I can use that data, or the HAZOP or LOPA data. Next project please!

3. Not realizing the operator is the customer

Having operator involvement in the philosophy and rationalization is key. The person who best knows how the unit runs is the board operator who has been running it for years. Operators need to be involved from the very first meeting and have input into your philosophy document. And remember, your target audience is the operator and not upper management.

4. No consequences for an alarm

Way too often, alarms are configured for conditions that have no consequence if ignored. An alarm must require an action and have a consequence, or it shouldn’t be an alarm.
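The action-and-consequence test can be sketched as a simple screen over rationalization records. This is only an illustration: the `CandidateAlarm` fields, the tag names, and the example entries are hypothetical, and a real rationalization database will have its own structure.

```python
from dataclasses import dataclass


@dataclass
class CandidateAlarm:
    tag: str               # hypothetical point name
    operator_action: str   # what the operator is expected to do
    consequence: str       # what happens if the alarm is ignored


def lacks_justification(alarm):
    """True when the record names no operator action or no consequence."""
    return not alarm.operator_action.strip() or not alarm.consequence.strip()


candidates = [
    CandidateAlarm("LI300_HI", "Reduce feed to the drum", "Liquid carryover to the compressor"),
    CandidateAlarm("PUMP_A_RUNNING", "", ""),                 # status, not an alarm
    CandidateAlarm("TI101_HI", "Check cooling water", ""),    # no consequence recorded
]

to_review = [a.tag for a in candidates if lacks_justification(a)]
print(to_review)
```

Anything that lands in `to_review` is a candidate for removal from the alarm list or for a second look at why it was configured.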

5. Alarming normal events or status messages

One of the greatest errors of point-based alarm management is failing to recognize the distinction between alarms and status information. An alarm, by definition, requires an operator action to prevent a negative consequence. Status information is the antithesis of an alarm. This is true whether the information is analog or digital in nature. Issuing status information as alarms clutters the operator’s alarm interface, distracts his attention, and predisposes him to ignore alarms. This criticism is also valid for many messages.

The alarm interface should be restricted to alarms. The objective is to use the alarm interface as an action item list. An alternative mechanism must be found to present status information. Since status information is characteristically something that is viewed on a demand basis, a schematic display is probably the most appropriate mechanism.

6. Multiple alarms for a single event

It is best to provide only one alarm for any single event. Multiple alarms can lead to confusion about which alarm to address first, and operators waste valuable time silencing the others.

7. Alarm messages not clear or relevant

If the operator doesn’t understand the message, the alarm is useless. For example, does HDR PNL 17LP3n-1B-C mean anything to you? Only if you have worked this board for the last 30 years. In addition, we have seen some point descriptions so long that you can’t read the entire thing, and the pertinent information runs off the screen.

8. Using alarm settings to trigger interlocks/other automatic actions

Alarms and interlocks exist for different reasons, so it’s not generally a good practice to enforce both at the same setting. When they share a setting, the alarm management team can’t change the alarm without changing the interlock, and suppressing the alarm can disable the interlock.
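A rough way to look for this coupling is to compare exported alarm settings against interlock trip settings and flag any tag where the two coincide. The sketch below assumes both sets of settings are available as simple tag-to-value mappings. The function name, tags, and values are hypothetical, and matching values are only a hint that the two may be tied together, not proof.

```python
import math


def shared_settings(alarm_settings, interlock_settings, tol=1e-9):
    """Return the tags whose alarm setting equals the interlock trip setting."""
    shared = []
    for tag, alarm_value in alarm_settings.items():
        trip_value = interlock_settings.get(tag)
        # Skip tags that have no interlock configured at all.
        if trip_value is not None and math.isclose(alarm_value, trip_value, abs_tol=tol):
            shared.append(tag)
    return sorted(shared)


# Hypothetical exports from the alarm database and the interlock list.
alarm_settings = {"LI300": 90.0, "PI202": 10.0, "TI101": 350.0}
interlock_settings = {"LI300": 90.0, "PI202": 12.0}

print(shared_settings(alarm_settings, interlock_settings))
```

Tags on the resulting list deserve a closer look to see whether the alarm gives the operator any time to act before the interlock does.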

9. Only doing bad actors

  • First of all, it’s not recommended by ISA 18.2
  • Alarms are considered singularly, not as a system
  • When does it stop?
  • Never really gets anywhere to solve alarm floods

10. Ignoring dynamic behavior

Alarms cannot be effectively managed from a static perspective. Further, dynamic management cannot be effectively added on after a static management process without seriously compromising any optimization, time or budget goal.

A quality alarm system needs to take dynamics into account, and the reasons are clear. Plants go through a number of different operating states, some of which have much more risk than others. The reason that we have alarms is for an abnormal situation. What is normal and abnormal often changes with the state the plant is in. Reason and data have both shown that a static rationalization is insufficient.
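As a minimal sketch of what “normal changes with the state” means in practice, the example below keeps a separate set of alarm limits for each operating state and evaluates readings against the limits for the current state. The states, tags, limits, and the `active_alarms` function are hypothetical, and a real implementation would live in or alongside the DCS rather than in a script.

```python
# Hypothetical per-state alarm settings: state -> tag -> (low limit, high limit).
# None means "no alarm on that side in this state".
ALARM_LIMITS = {
    "running": {"FEED_FLOW": (20.0, 120.0), "REACTOR_TEMP": (None, 350.0)},
    "shutdown": {"FEED_FLOW": (None, 5.0), "REACTOR_TEMP": (None, 100.0)},
}


def active_alarms(state, readings):
    """Return the alarms that the given readings raise in the given state."""
    alarms = []
    for tag, value in readings.items():
        low, high = ALARM_LIMITS[state].get(tag, (None, None))
        if low is not None and value < low:
            alarms.append(f"{tag} LOW")
        if high is not None and value > high:
            alarms.append(f"{tag} HIGH")
    return alarms


# The same readings are abnormal while running and normal while shut down.
readings = {"FEED_FLOW": 0.0, "REACTOR_TEMP": 40.0}
print(active_alarms("running", readings))
print(active_alarms("shutdown", readings))
```

With a single static configuration, the zero feed flow would alarm in both states, and during a shutdown it would be one more irrelevant entry for the operator to clear.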

The ideal alarm management solution must extend itself beyond the normally static configuration capabilities of the DCS. Pressure to compromise the ideal because of apparent limitations of the DCS is to be resisted. The results of giving in will at best degrade the quality of the alarms and at worst mask valuable information. Objections may be raised that it’s impossible to achieve the ideal. It is not inherently impossible. Limitations of the DCS with regard to any dynamic objectives must be regarded as challenges to be overcome rather than inevitable and insurmountable restrictions. To the extent that they can be overcome on a generic and efficient basis, the alarm management process can capture the full optimization benefits.

11. Eliminating start-up and shutdown from alarm metrics and over-reliance on metrics

We’ve seen operating companies that configure their alarm reports so that start-up and shutdown alarms are not included in the reports that are sent to management. In essence, they meet the metrics for the run state and management doesn’t know any better. This is a result of performing only a static rationalization, and you are only fooling yourselves if you think you are in compliance.
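One way to keep start-up and shutdown visible is to report the alarm rate for each operating state next to the overall rate, rather than filtering states out. The sketch below assumes each alarm record already carries the operating state it occurred in. The counts, hours, and function name are hypothetical.

```python
from collections import Counter


def alarms_per_hour_by_state(alarm_states, hours_in_state):
    """Return the average alarm rate (alarms per hour) for each operating state."""
    counts = Counter(alarm_states)
    return {
        state: counts[state] / hours
        for state, hours in hours_in_state.items()
        if hours > 0
    }


# Hypothetical reporting period: 24 hours running, 4 hours in start-up.
alarm_states = ["run"] * 48 + ["startup"] * 120
hours_in_state = {"run": 24.0, "startup": 4.0}

rates = alarms_per_hour_by_state(alarm_states, hours_in_state)
overall = len(alarm_states) / sum(hours_in_state.values())
print(rates)
print(overall)
```

In this made-up period the run state averages 2 alarms per hour while start-up averages 30, so a report limited to the run state would hide exactly the period when the operator is busiest.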

Metrics must never be used as a basis to compromise an alarm management process. If, at the end of a well-designed, rational process, the results violate some pre-established metric limits, then it is more appropriate to consider reallocation of operator responsibilities. There is an important distinction between a standard of measure used to assess the final results of a process and a justification to compromise that process.

An effective and meaningful assessment technique is to measure the impact of the results on ‘real-world’ performance. This can best be accomplished by comparing the pre-management historical alarm experience for an event against a simulation of the post-management alarm configuration, or vice versa. Where there isn’t any historical experience, dual simulations with both pre- and post-management configurations can be used instead. However, it is generally impractical to use apparently similar historical events from pre- and post-management periods because the differences can be deceptively significant and render the comparison valueless.
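A very simplified version of that comparison is to replay the same recorded process data against both alarm configurations and count how many alarms each one would have annunciated. The sketch below is an assumption-laden illustration: it handles only high limits, ignores deadbands, delays, and priorities, and the tags, samples, and limits are hypothetical.

```python
def count_annunciations(samples, high_limits):
    """Count normal-to-alarm transitions when samples are replayed against high_limits."""
    in_alarm = set()
    count = 0
    for tag, value in samples:
        limit = high_limits.get(tag)
        if limit is None:
            continue  # no alarm configured for this tag
        if value > limit:
            if tag not in in_alarm:
                in_alarm.add(tag)
                count += 1
        else:
            in_alarm.discard(tag)
    return count


# Hypothetical recorded event, in time order: (tag, value).
event = [("TI101", 95), ("TI101", 105), ("TI101", 98), ("TI101", 104), ("PI202", 12)]

pre_management = {"TI101": 100, "PI202": 10}
post_management = {"TI101": 110, "PI202": 10}

print(count_annunciations(event, pre_management))
print(count_annunciations(event, post_management))
```

Because both counts come from the same event, any difference is due to the configuration change alone, which is what makes this kind of comparison more meaningful than lining up two different incidents.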

In conclusion, the best solution to avoid pitfalls is to develop an alarm management process that allows the optimization of the alarms uniquely for each process operating state or state transition. When coupled with the ability to achieve dynamic management of the alarm configuration, the optimum solution can become a reality.

References

  • Leslie Jensen, “The Pitfalls of Alarm Design and Benchmark Analysis” TAMU, 2001

  • Leslie D. Jensen, “Dynamic Alarm Management on an Ethylene Plant” Honeywell Users Group, Nice, France, 1995

  • Leslie D. Jensen, “Improving Alarm Management of Distributed Control Systems” The International Journal of Hydrocarbon Engineering, 1997 September

  • EEMUA, “Alarm Systems – A Guide to Design, Management and Procurement” EEMUA Publication 191; 1999 March

  • Tom Noble, “CPI Strive to Avoid Alarm Overload” CEP, 2001 March

  • Darwin Logerot, “Common Misconceptions and Pitfalls in Alarm Management”, ISA 2014