Tom Nolan and Katherine Persac
After hearing a presentation recently, I started thinking about the popular seven steps for alarm management and about what parts work and what parts don’t. I asked one of our alarm management engineers for his opinion and we came up with the following. I would appreciate hearing any feedback or thoughts you have about the subject.
Step 1: Develop, Adopt and Maintain an Alarm Philosophy
Step 1 is worthy advice. If you have alarms, you should have an alarm philosophy, but don't create your philosophy alone. Get help from alarm management experts who can guide you through the process and help tailor your philosophy to your specific needs. Because the entire alarm management program depends on and is audited against the alarm philosophy, long-term decisions made in the philosophy will affect the entire life cycle of the program.
Step 2: Collect Data and Benchmark Your System
Benchmarking is a great idea, but do not make the mistake of thinking that this is a numbers game. While it is good to know where you started, good alarm management relies on the quality of the alarms, not the number of alarms.
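As a rough illustration of recording "where you started," the Python sketch below counts alarms per fixed time window from a list of alarm timestamps. The 10-minute window, the function name and the sample timestamps are all assumptions made for this example, not values taken from any particular philosophy document or system. The output is a baseline to compare against later, not a verdict on alarm quality.

```python
from collections import Counter
from datetime import datetime, timedelta

def alarms_per_window(timestamps, window=timedelta(minutes=10)):
    """Count alarms in each fixed-width window, keyed by window start time."""
    if not timestamps:
        return {}
    start = min(timestamps)
    counts = Counter()
    for ts in timestamps:
        index = (ts - start) // window  # whole windows elapsed since start
        counts[start + index * window] += 1
    return dict(counts)

# Hypothetical alarm log: four annunciation times (illustrative only)
log = [
    datetime(2024, 1, 1, 8, 0),
    datetime(2024, 1, 1, 8, 3),
    datetime(2024, 1, 1, 8, 9),
    datetime(2024, 1, 1, 8, 25),
]
baseline = alarms_per_window(log)
peak = max(baseline.values())  # busiest window in the sample
```

A low count here says nothing about whether each alarm is legitimate; that judgment comes from rationalization, not from the benchmark.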
Step 3: Perform "Bad Actor" Alarm Resolution
Step 3 can often turn into a game of Whack-a-mole. Remember that all alarms need to be rationalized, and receiving alarms is not a bad thing. If alarms are coming in at a high frequency and are legitimate alarms, that is a good thing. Operators need to know and actions need to be taken. If these are not legitimate alarms, they should be eliminated through rationalization. If they meet the criteria of a good alarm, they need to be there.
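A frequency ranking is typically how bad actors get picked out. The short Python sketch below shows one way to produce such a ranking; the tag names and the event list are hypothetical. Note that it only nominates candidates for review. Whether a frequent alarm is legitimate still has to be decided through rationalization.

```python
from collections import Counter

def most_frequent_alarms(tags, top_n=2):
    """Return the top_n (tag, count) pairs, most frequent first."""
    return Counter(tags).most_common(top_n)

# Hypothetical alarm events, one tag per annunciation (illustrative only)
events = ["LI-200.HI"] * 5 + ["FI-101.LO"] * 2 + ["TI-310.HI"]

candidates = most_frequent_alarms(events)
# Each candidate goes to the rationalization team; a frequent alarm that
# meets the criteria of a good alarm stays in the system.
```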
With that said, stopping after steps 1-3 is not a good idea, even if you do see some reduction in the alarm loading. Again, this is not a numbers game. Numbers can indicate a problem, but they cannot prove that there is no problem.
Step 4: Perform Alarm Documentation and Rationalization
Step 4 is the only real way to ensure the quality of your alarm system. Assembling a team of operators and engineers and leading them through the causes, consequences and required actions of each alarm leads to consensus and buy-in on the validity of an alarm. Using the team's process knowledge to add dynamic management and shelving to the rationalization process is key to achieving ISA 18.2 levels of performance when they are most needed: during startups, shutdowns and upset conditions.
Step 5: Implement Alarm Audit and Enforcement Technology
Make sure to implement step 5 in a way that is effective and is not a significant manpower burden.
Step 6: Implement Real Time Alarm Management
The performance that your alarm system needs to provide is laid out in ISA 18.2. To achieve ISA 18.2 level performance, a static rationalization alone is not adequate. Dynamic alarm management is a necessity, not just a nice-to-have.
Ignoring dynamic behavior will not deliver the needed performance, which should come as no surprise. A static rationalization optimized for the running state, even if it is done well, will undoubtedly produce alarm floods when the plant is out of the normal running state.
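One simple form of dynamic alarm management is to record, for each rationalized alarm, the plant states in which it is meaningful, and to annunciate it only in those states. The Python below is a minimal sketch of that idea, not a description of how any DCS or alarm management product implements it; the class, function, state names and tags are all hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class RationalizedAlarm:
    tag: str
    active_states: frozenset  # plant states in which the alarm is valid

def alarms_to_annunciate(raised_tags, alarms, plant_state):
    """Keep only raised alarms that are valid for the current plant state."""
    valid = {a.tag for a in alarms if plant_state in a.active_states}
    return [tag for tag in raised_tags if tag in valid]

# Hypothetical rationalization results (illustrative only)
alarms = [
    # Low feed flow is abnormal while running, but expected during shutdown
    RationalizedAlarm("FI-101.LO", frozenset({"RUNNING"})),
    # High vessel pressure matters in every state
    RationalizedAlarm("PI-205.HI", frozenset({"RUNNING", "STARTUP", "SHUTDOWN"})),
]

raised = ["FI-101.LO", "PI-205.HI"]
running = alarms_to_annunciate(raised, alarms, "RUNNING")
shutdown = alarms_to_annunciate(raised, alarms, "SHUTDOWN")
```

In this sketch the low-flow alarm reaches the operator while the unit is running but is suppressed during shutdown, while the pressure alarm is presented in both cases. A purely static configuration would present both alarms in every state.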
Step 7: Control and Maintain Your Improved System
Processes will always change with time, condition and feedstock. Instrumentation will fail, be added and be removed. Many things will affect the alarm system over time. Monitoring performance is needed to make sure that hard-won improvements do not degrade over time.
Over the years, we have learned that alarm management is not a numbers game. One alarm that comes in at a bad time and creates confusion that leads to a wrong decision can have devastating consequences. If the alarm is in the DCS, it should meet the criteria of a legitimate alarm. The quality of the alarm is most important, and alarms need to be rationalized.
This is not about removing alarms, but about receiving the right alarms. That may mean removing some, adding others where they are needed, or configuring current alarms in a more efficient way.
A quality alarm system needs to take dynamics into account, and the reasons are clear. Plants go through a number of different operating states, some of which carry much more risk than others. The reason we have alarms is to flag abnormal situations, and what is normal and abnormal often changes with the state the plant is in. Reason and data have both shown that a static rationalization is insufficient.