Identifying failure modes in an oil refinery process unit
Summary
Alkylation increases the high-octane outputs of an oil refinery and hence enhances the value of gasoline products. Alkylation reactions are catalysed by strong acids, usually hydrofluoric acid (HF) or sulphuric acid (H2SO4).
Safety had a high priority for the company in this case. The HF Alkylation Unit of the refinery was subjected to hazard studies on a regular basis, due to the dual hazards of flammable hydrocarbons and highly corrosive HF.
The most recent studies had used a Hazard and Operability (HAZOP) process. This study used Failure Modes, Effects and Criticality Analysis (FMECA) as a complementary approach that might identify additional hazards. The issues raised during the FMECA were recorded in a hazard register, which was reviewed regularly in management meetings to monitor the implementation of the recommended actions.
The study identified 225 hazardous failure modes, but only eight were rated ‘Undesirable’, requiring immediate attention. A further 59 were rated ‘OK with Controls’, to be reviewed and the controls strengthened if necessary. The remainder were either rated ‘Acceptable’ or considered to present only minor risk and so were not rated.
Background: HF Alkylation
The alkylation process in an oil refinery converts low value isobutane and light olefins (propene and butene) into a product (alkylate) that contains a high proportion of higher octane hydrocarbons (Figure 1). The proportion of octanes in the alkylate depends on the kinds of olefins in the feedstock and the operating conditions of the alkylation unit. Alkylate is ideal for blending into gasoline as it has both high octane content and low vapour pressure.
Alkylation reactions are catalysed by strong acids, usually sulphuric acid or hydrofluoric acid (HF). In this refinery, HF was the catalyst. Figure 2 shows a simplified outline of the process.
Figure 1: Typical alkylation reactions
Figure 2: HF alkylation process outline
Background to the case
Background
Accidents in HF Alkylation units have the potential to cause significant harm to people and the environment. The units contain HF, which is extremely corrosive both internally and externally, as well as highly flammable hydrocarbons, so refinery managers require their alkylation units to sustain a very high standard of safety. To maintain confidence that alkylation units remain safe, most refineries conduct regular reviews of potential hazards and, where necessary, implement hazard reduction actions.
In this case the unit had been upgraded recently, and a detailed hazard and operability study (HAZOP) had been conducted as part of the change management process associated with the upgrade. Following an organisational restructure, a new refinery manager committed to performing a technical hazard analysis of the unit’s equipment, procedures and control systems. He thought an approach different from HAZOP would complement the earlier study, and selected failure modes, effects and criticality analysis (FMECA) as the most appropriate option.
Scope and boundary conditions
The scope of the study was the refinery’s HF Alkylation unit, and in particular:
- Those parts of the unit that contained HF acid or were adjacent to them
- Systems that had an operating problem or were suspected of having caused a problem in the past.
All items of equipment in the unit (vessels, control valves, controllers, pumps and so on) had been reviewed and those that were priorities for study had been identified in a Preliminary Hazard Analysis (PHA):
- Seven items were a high priority for detailed review
- Eleven items were rated desirable to study if time and resources allowed
- The items that did not contain HF acid were considered to be of lower risk and similar to other equipment containing hydrocarbons throughout the refinery; they were omitted from this particular study.
The study used the most recent piping and instrument diagrams (P&IDs), previous hazard studies, and the process knowledge of the operations team. It considered the plant in four states: start-up, normal and abnormal operation, and shutdown.
Objectives
The objectives of the FMECA study were to:
- Examine the unit systematically to identify the potential hazards to people, refinery assets, operations and the environment
- Assess the failure modes of technical components of the unit and the methods of control, and decide on appropriate actions and responsibilities that would reduce or eliminate the risks
- Assess the operating processes of the unit and the potential hazards that might arise in operation, when human operators are interacting with the unit, and decide on appropriate actions and responsibilities that would reduce or eliminate the risks
- Document the study results in the form of a Hazard Register that could be used for tracking the implementation of improvement actions and form a checklist for ongoing monitoring and review.
The FMECA process
Overview
A Failure Modes, Effects and Criticality Analysis (FMECA) is a detailed and systematic examination of a process, design or plant to identify and assess the potential hazards of operation and the proposed methods of their control. It is usually undertaken by a group consisting of technical, operations and maintenance personnel who have a direct involvement in the process. Specialists from related operations at other plants are often included as well, to help avoid complacency and add a fresh point of view.
In this case we used a structured, facilitated brain-storming workshop to identify the hazards. We worked with the study team to examine the process equipment systematically, item by item, describe the modes of failure for each item and specify the potential consequences of each of those failure modes.
The study team included:
- Very experienced HF Alkylation unit operators, who collectively had a deep understanding of the plant ‘hardware’, the control systems, the operating console and the operating procedures for start-up, routine operations, shut-down and non-routine conditions such as emergencies
- Maintenance personnel with experience in mechanical, electrical and control system aspects of the unit
- Operations engineers with a detailed knowledge of the process and the technology
- An engineer with extensive operating experience of an HF Alkylation unit in one of the company’s other refineries
- The plant Operations Manager, who had authority to approve modifications to the unit.
Systematic development and analysis of scenarios
For the prioritised scope, we worked through the most recent P&IDs for the HF Alkylation unit systematically, item by item, to ensure that all the relevant equipment was addressed and to reduce the possibility of omitting any important aspects of the unit. We considered different operating modes for each item: start-up, routine and abnormal operations, and shutdown.
For each item, we identified feasible failure modes. For each failure mode, we considered the effect of that failure and developed scenarios to identify the range of possible outcomes. For example (Figure 3), for one vessel:
- When the pressure transmitter fails with an erroneous low reading, a pressure control valve closes, leading to increasing pressure in the vessel
- The high-pressure alarm does not activate, as it is based on the same transmitter that has failed, so the operator is unaware of the high pressure and does not take action to rectify it
- The pressure continues to increase and causes the relief valve to open, releasing flammable gas
- This could lead to a fire, damage to part of the refinery and a significant loss of production.
Figure 3: Example failure mode scenario
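The scenario above turns on a common-cause weakness: the controller and the high-pressure alarm both depend on the same transmitter. The following is a minimal illustrative sketch of that logic, not a model of the actual unit. All names, setpoints and pressure values are hypothetical and chosen only to make the chain of events visible.

```python
from dataclasses import dataclass


@dataclass
class Vessel:
    pressure: float                 # true pressure, arbitrary illustrative units
    control_setpoint: float = 8.0   # hypothetical value
    alarm_setpoint: float = 10.0    # hypothetical value
    relief_setpoint: float = 12.0   # hypothetical value


def step(vessel, transmitter_failed_low):
    """Advance the scenario by one time step and report what happened."""
    # The controller and the alarm both read the SAME transmitter.
    reading = 0.0 if transmitter_failed_low else vessel.pressure
    valve_open = reading > vessel.control_setpoint   # low reading: valve closes
    alarm = reading >= vessel.alarm_setpoint         # low reading: alarm stays silent
    # A closed control valve lets pressure build; an open one lets it fall.
    vessel.pressure += -1.0 if valve_open else 1.0
    relief_open = vessel.pressure >= vessel.relief_setpoint
    if relief_open:
        # The relief valve holds the pressure at its setpoint by releasing gas.
        vessel.pressure = vessel.relief_setpoint
    return {"reading": reading, "valve_open": valve_open,
            "alarm": alarm, "relief_open": relief_open}


def run(transmitter_failed_low, steps=10):
    """Run the scenario from a normal starting pressure."""
    vessel = Vessel(pressure=8.0)
    history = [step(vessel, transmitter_failed_low) for _ in range(steps)]
    return vessel, history


if __name__ == "__main__":
    for failed in (False, True):
        _, history = run(failed)
        print("transmitter failed low:", failed,
              "| alarm raised:", any(h["alarm"] for h in history),
              "| relief opened:", any(h["relief_open"] for h in history))
```

In the failed case the relief valve opens without any alarm having been raised, which is why the study recorded whether each failure would be revealed or unrevealed.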
Scenario priorities
The consequence and likelihood associated with each failure scenario were rated qualitatively to provide an initial priority rating, the criticality, for each failure mode. These were scored based on the rating system summarised in Table 1 and Table 2.
Table 1: Criticality rating
| Criticality | Required response |
| --- | --- |
| Acceptable | Acceptable with the current level of risk and controls |
| OK with Controls | The controls must be reviewed, and strengthened if required and cost effective |
| Undesirable | Action is required to reduce the risk |
| Unacceptable | Immediate action is required to reduce the risk |
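As a rough sketch of how a qualitative rating of this kind can be applied, the code below maps a consequence and a likelihood to one of the four criticality ratings and then to the response listed in Table 1. The responses are taken from Table 1. The consequence and likelihood scales and the score thresholds are hypothetical assumptions made for illustration only; they are not the scales used in the study.

```python
from enum import Enum


class Criticality(Enum):
    ACCEPTABLE = "Acceptable"
    OK_WITH_CONTROLS = "OK with Controls"
    UNDESIRABLE = "Undesirable"
    UNACCEPTABLE = "Unacceptable"


# Required responses, as listed in Table 1.
REQUIRED_RESPONSE = {
    Criticality.ACCEPTABLE:
        "Acceptable with the current level of risk and controls",
    Criticality.OK_WITH_CONTROLS:
        "The controls must be reviewed, and strengthened if required and cost effective",
    Criticality.UNDESIRABLE:
        "Action is required to reduce the risk",
    Criticality.UNACCEPTABLE:
        "Immediate action is required to reduce the risk",
}

# HYPOTHETICAL scales, ordered from least to most severe or likely.
CONSEQUENCE = ["minor", "moderate", "major", "catastrophic"]
LIKELIHOOD = ["rare", "unlikely", "possible", "likely"]


def rate(consequence, likelihood):
    """Combine the two qualitative ratings into a criticality.

    The additive score and its thresholds are assumptions for this sketch.
    """
    score = CONSEQUENCE.index(consequence) + LIKELIHOOD.index(likelihood)
    if score <= 1:
        return Criticality.ACCEPTABLE
    if score <= 3:
        return Criticality.OK_WITH_CONTROLS
    if score == 4:
        return Criticality.UNDESIRABLE
    return Criticality.UNACCEPTABLE


if __name__ == "__main__":
    rating = rate("major", "possible")
    print(rating.value, "->", REQUIRED_RESPONSE[rating])
```

A real study would normally use an agreed risk matrix rather than an additive score; the point here is only that each failure mode ends up with a single rating that determines the response.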
Recording
We recorded our discussions and conclusions in a Hazard Register listing:
- The item and activity, with a brief description
- The failure mode and scenario, and whether the failure would be detected readily (revealed) or remain hidden (unrevealed)
- The current controls and their anticipated effects
- The consequence, likelihood and criticality from Table 1 and Table 2
- Any immediate actions that would lead to improved or safer operations, with the name of the person responsible for each action.
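One way to hold a register with these fields is sketched below. The field names follow the list above, but the structure, the sample entries, their ratings, the actions and the placeholder role names are all hypothetical illustrations rather than content from the actual register.

```python
from dataclasses import dataclass, field

# Worst rating first, so open actions can be listed in priority order.
CRITICALITY_ORDER = ["Unacceptable", "Undesirable", "OK with Controls", "Acceptable"]


@dataclass
class Action:
    description: str
    responsible: str      # person responsible for the action
    done: bool = False


@dataclass
class RegisterEntry:
    item: str
    activity: str
    failure_mode: str
    scenario: str
    revealed: bool        # True if the failure would be detected readily
    current_controls: list[str]
    consequence: str
    likelihood: str
    criticality: str
    actions: list[Action] = field(default_factory=list)


def open_actions(register):
    """Return (criticality, item, action) for unfinished actions, worst first."""
    rows = [(e.criticality, e.item, a)
            for e in register for a in e.actions if not a.done]
    return sorted(rows, key=lambda row: CRITICALITY_ORDER.index(row[0]))


# Illustrative entries only; ratings, actions and names are invented.
register = [
    RegisterEntry(
        item="Manual isolation valve", activity="Isolation for maintenance",
        failure_mode="Operator closes the wrong valve",
        scenario="Process upset and potentially hazardous condition",
        revealed=True, current_controls=["Isolation procedure"],
        consequence="moderate", likelihood="possible",
        criticality="OK with Controls",
        actions=[Action("Review valve labelling", "Operations supervisor (placeholder)"),
                 Action("Brief shift teams", "Shift lead (placeholder)", done=True)],
    ),
    RegisterEntry(
        item="Vessel pressure transmitter", activity="Normal operation",
        failure_mode="Fails with an erroneous low reading",
        scenario="Control valve closes, alarm silent, relief valve opens",
        revealed=False, current_controls=["Relief valve"],
        consequence="major", likelihood="possible",
        criticality="Undesirable",
        actions=[Action("Review alarm independence", "Control engineer (placeholder)")],
    ),
]

if __name__ == "__main__":
    for criticality, item, action in open_actions(register):
        print(f"{criticality}: {item}: {action.description} ({action.responsible})")
```

Listing the unfinished actions in order of criticality gives the kind of view that a management meeting could use to monitor implementation.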
Outcomes
The FMECA identified 225 hazardous failure modes, of which eight were rated as ‘Undesirable’ and requiring immediate attention. Following the FMECA, the refinery management team reviewed them and developed engineering solutions to reduce their criticality. The team performed detailed HAZOP studies of the potential solutions, as a normal part of the refinery’s change management procedure.
A further 59 hazardous failure modes were rated as ‘OK with Controls’, to be reviewed and the controls strengthened if necessary. The remainder of the hazardous failure modes were rated as ‘Acceptable’ or were considered of only minor risk and so were not rated.
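The breakdown of these figures can be checked with a few lines of arithmetic. The counts of 225, eight and 59 are from the study; the remainder and the percentage shares are derived from them, and the variable names are just for illustration.

```python
# Counts reported by the study.
total = 225
undesirable = 8
ok_with_controls = 59

# Everything else was rated 'Acceptable' or left unrated as minor risk.
remainder = total - undesirable - ok_with_controls

counts = {
    "Undesirable": undesirable,
    "OK with Controls": ok_with_controls,
    "Acceptable or not rated": remainder,
}

# Share of all hazardous failure modes, as a percentage to one decimal place.
shares = {name: round(100 * count / total, 1) for name, count in counts.items()}

if __name__ == "__main__":
    for name, count in counts.items():
        print(f"{name}: {count} ({shares[name]}%)")
```

On these numbers fewer than 4% of the failure modes called for immediate attention, while about 70% needed no further action beyond the existing controls.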
Lessons
Two-stage technical analyses
Detailed technical analyses are critical for some purposes, and in particular:
- As an integral part of the design process for new equipment and associated operating and maintenance procedures
- Whenever there are proposed changes or modifications to equipment, components or procedures.
However, detailed analyses such as HAZOP and FMECA need a large amount of effort. Rigorous but less detailed high-level studies, like the Preliminary Hazard Analysis used in this case, can be very cost-effective in initial studies:
- To set priorities for more detailed analysis
- To identify areas of potentially high risk where it is clear, without further analysis, that remedial actions should be implemented soon.
In this case, the Preliminary Hazard Analysis limited the scope of the FMECA, which kept the study to a reasonable time period and avoided placing excessive demands on key personnel.
There are clear trade-offs between detail and effort. When reviewing an oil refinery as a whole (or indeed any complex processing plant), we recommend a two-stage process (Figure 4):
- Start with a high-level analysis to identify risky parts of the refinery, aiming to complete the work relatively quickly, possibly in a few weeks of elapsed time
- Follow up with longer, detailed hazard studies focusing on where they will add most value.
Figure 4: Trade-offs in technical analyses
Different hazard study techniques
There are many hazard study techniques, of which HAZOP and FMECA are but two. A HAZOP is a straightforward technique: because it follows the process flow, it is clear how the node or item being studied fits into the overall system. However, unless each operating mode is identified and studied separately, it can be less effective for abnormal plant conditions: for example when refilling and restarting after a maintenance shutdown, or the ‘hold’ condition during a plant trip. Other techniques used in support of a HAZOP study can identify hazards and failure mechanisms that might not come to light in an initial HAZOP study, as occurred in this case when the FMECA study identified eight failure modes rated ‘Undesirable’.
With expert facilitation, FMECA is a useful and flexible process when a plant must be examined in detail. For example, we have used it effectively in a different assignment to uncover failure mechanisms following a major accident where a combination of new electrical controls and original pneumatic controls was being used to operate a furnace. FMECA is logical and systematic, which enables it to deal with complicated process and control systems, including cases like that furnace where old and new control systems work together.
Participants in a technical analysis
Not everyone involved in a technical analysis needs to know everything about the refinery, but there should be participants who understand in detail each unit to be examined and how it works in different operating conditions, including start-up, normal operations, abnormal operations, shutdown and maintenance. Sometimes an external expert may be invited, to provide additional technical information, experience and fresh ideas.
In this case the participants were all drawn from the operation, maintenance and management of the refinery’s HF Alkylation unit. All had a detailed understanding of how the unit worked and what was required to keep it working effectively. An engineer with extensive operating experience of an HF Alkylation unit in one of the company’s other refineries also participated, to provide an outsider’s perspective.
Human factors
A large percentage of process safety incidents and failures relate to human factors. This fact can be overlooked or given insufficient weight if an analysis is framed primarily in terms of the design and operation of a process, paying less attention to the behaviour of the operators.
The following list, which is not necessarily exhaustive, shows some of the features of human behaviour that can have a bearing on the occurrence of failures and the way operators respond to them:
- Time pressures
- Other causes of stress
- Task design and complexity
- Level of experience and effectiveness of training
- Procedural errors
- Human to technology interfaces
- Fitness for duty (e.g., due to illness or drug-related impairment)
- Process supervision
- Workplace culture
- Effectiveness of communication (e.g., at shift handovers, or about abnormal conditions).
One of the failure modes we considered in this case was operator error: how the operator could mistakenly do something that might cause a problem. For example, the operator could mistakenly close a manual isolation valve to a unit, resulting in a major process upset and a potentially hazardous condition.
When assessing the criticality of this failure mode we had to understand the likelihood of an operator making such a mistake. There are guideline tables that assist in making this kind of estimate. They take into account many factors, including:
- The training and standard of the operators
- The speed of the required response and the pressure the operator may be under
- How easily the item requiring attention can be identified
- Other factors, some of which are noted above.
This ‘textbook’ guideline information was used to provide a starting point for assessing criticality here. Discussions with the operators usually resulted in adjustments to the likelihood, taking account of the specific conditions in this refinery. No two operations will be identical to one another so it is very important to involve plant personnel in the assessment.
An example from this case illustrates both the value of plant-specific knowledge and the complexity of examining human factors. In this refinery’s operating environment, two isolation valves, for different parts of the unit, were located beside one another. It was recognised that, on a wet night shift, the likelihood of a mistake during isolation, confusing one valve with another, would be far higher than the guideline table might suggest. The frequency with which such a situation might arise, how frequently isolation would be required, and the likelihood of an error being made were all considered in the context of this unit, building on the standard tables but adjusting for the plant and the operators’ specific characteristics.
It was important that experienced and knowledgeable operators participated to adjust the standard ratings given in the guidelines and prepare a realistic assessment of the risk.
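The reasoning in the valve example can be expressed as a simple calculation: how often isolation is required, how often the adverse conditions apply, and how much more likely a mistake is under those conditions. The sketch below is an assumption about how such an adjustment might be structured, not the method or guideline tables used in the study, and every number in it is hypothetical.

```python
def expected_errors_per_year(isolations_per_year,
                             base_error_probability,
                             adverse_fraction,
                             adverse_multiplier):
    """Estimate how often a wrong-valve error might occur in a year.

    base_error_probability: per-task value from a guideline table.
    adverse_fraction: share of isolations done in adverse conditions
        (for example, on a wet night shift).
    adverse_multiplier: how much more likely an error is in those
        conditions, as judged by the plant's own operators.
    """
    adverse_probability = min(1.0, base_error_probability * adverse_multiplier)
    per_task = ((1 - adverse_fraction) * base_error_probability
                + adverse_fraction * adverse_probability)
    return isolations_per_year * per_task


if __name__ == "__main__":
    # All values below are hypothetical and for illustration only.
    guideline_only = expected_errors_per_year(12, 0.001, 0.0, 1.0)
    plant_adjusted = expected_errors_per_year(12, 0.001, 0.25, 10.0)
    print("guideline only:", guideline_only)
    print("plant adjusted:", plant_adjusted)
```

With these invented inputs the plant-adjusted estimate is several times the guideline-only figure, which illustrates why the operators’ knowledge of local conditions mattered to the likelihood rating.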
Human factors are difficult to include in an analysis, but they are often a central feature of potential failures of control. It is essential to take them into account. Drawing on the knowledge gained by hands-on experience is critical in any kind of hazard analysis. In this case, that was provided by personnel from the unit under investigation and an external specialist with experience in the same type of process at a different plant.
Engineers and operators understand a plant from different points of view. Engineering expertise is required to understand the technical operation of a plant, its processes, instruments and control systems. Operators, including supervisors and managers, bring a greater understanding of how personnel on a plant behave and will react if unexpected events happen. Drawing on both groups in the analysis decreases the chance of anything being missed and increases the chance of likelihood and consequence assessments being realistic. This, in turn, should provide senior management and decision makers with confidence in the outcome of the analysis and ensure their support for the actions it recommends.
References
API RP 751 (2021), Recommended Practice for Safe Operation of Hydrofluoric Acid Alkylation Units. Fifth edition, American Petroleum Institute, Washington DC.
IEC 60812:2018 Failure mode and effects analysis (FMEA and FMECA). Third edition, International Electrotechnical Commission, Geneva.
IEC 61882:2016 Hazard and operability studies (HAZOP studies) – Application guide. Second edition, International Electrotechnical Commission, Geneva.
- Client: Oil refiner
- Sector: Oil and gas
- Services included: Technical risk analysis and hazard studies