[Sticky] Safety of Society – The Role of the Engineer
During the next 50 years the demands on our energy supplies, transportation facilities, water, food and information supplies, for new medicines, and our need for environmental protection, will increase many-fold. This will require increasing investment in traditional technology, and in new technologies. The demands on the engineer to be able to create these supplies will intensify.
In many cases, the needs can only be met by compromise. Increased demand for water, for example, can probably only be met by increased drawdown on rivers, by reducing the quality specifications for water supplied, and by allocating more land to reservoirs. Solving one problem can in this way often lead to the creation of new problems, which require more systems to solve them.
In order to be able to balance the different requirements on new systems, we have invented methods such as environmental impact analysis and risk analysis. These methods have in many ways been very successful. But they have weaknesses – the methods generally attempt to predict the future, but our capacity to do this is increasingly limited by the scale of our projects and systems, the limits of what we know, and the sensitivity of our success to detail.
One of the biggest challenges will be to create “problem free” projects. At present a significant number of the major systems which we make fail. This is in spite of the increasing reliability of our equipment. A major problem is complexity. The problems arising are becoming mostly ones of errors and limitations in planning, design, and management.
If we are to meet the needs and challenges of the next 50 years, we will need considerably more expertise in multidisciplinary thinking and analysis. We will need to place our forecasting and planning techniques on a sounder basis. And we will need developments in teaching, to show the next generations not only how to avoid our own mistakes, but also their own.
Introduction – The needs and demands of society
Population in the world is projected to level out in 2050 at somewhere around 9 billion persons. There are large uncertainties on this number. Population pressure or increased standard of living may lead to lower birth rates. War and disease may limit population. It would be wise though, to at least start to plan for significant population growth.
It will be impossible to supply this number of persons with even the standard of living of the poorest of the Western countries using existing technology. There will be pressure to develop food technologies which do not require such heavy use of land. It is worth noting that the per capita production of grain actually fell between 1985 and 2000. Technologies which could be anticipated are process plant production of foods, a move to vegetable proteins, and more intensive use of land, closer to market gardening techniques. More efficient use of food can be expected, with fewer losses, better distribution, and greater utilisation of the produced mass. All these techniques will require a greater use of water, and more energy.
The world at present runs primarily on coal, oil, and natural gas. Oil supplies will begin to drop significantly at some stage, anticipated to be about 2020. From my own experience, I can see that increasingly extreme recovery techniques are being used to extract the remaining oil from fields. Further new technologies will be needed to meet increasing demands for energy, especially for transportation.
In many third world countries, city transport is close to disastrous already, with heavy clouds of pollution from badly tuned diesel engines. This is reflected in significantly higher levels of chronic lung ailments. New technology and new systems are already needed to reduce this problem, which will in any case get worse. One of the main difficulties is that the new technologies will need to be competitive in terms of capital investment with the cheap use of old diesel engines. This will occur in any case when diesel fuel becomes too expensive to use, but the actual need is for better technologies which do not limit the possibilities of poor people by making transport simply too expensive.
Water is at present the limiting resource on economies and living standards in many places in the Middle East and Central Asia. Solutions to the problem at present are increasing construction of reservoirs and diversion of rivers. There are severe limits to this. The alternative is desalination, which can in principle supply effectively unlimited amounts of water, but at a severe cost in energy.
The ability to drive private cars, particularly to work in cities, has made modern city life a challenge in many places. In most western cities, and in Japan, problems have been stemmed by construction of extensive advanced highway networks, but it cannot be said that the problem has been solved. Traffic delays are a fact of life in most large cities, and a large part of the population must spend a large fraction of their working day in queues. Solutions are better location of workplaces, and new, high quality public transport systems. These require both investment, and good entrepreneurial sense in order to make systems work.
The need for entrepreneurial and marketing expertise may sound surprising to some engineers, but it should not be. Some projects are decided politically. Not all “political” decisions are failures, but many are. An example of the difficulties of political planning of systems is given by the two new Danish bridges / tunnels, across the Great Belt and between Denmark and Sweden. Both are engineering triumphs, and both had major problems: one with weather interfering with concrete hardening operations, one with flooding of a tunnel. One is a major success, with traffic projections already exceeded by a large margin; the other has too little traffic to satisfy economic needs. Something that planners and engineers need to learn is that people are not machines, and are not nearly as predictable. Just because planners think that people should use a facility does not necessarily mean that they will. We may not need to market communal water supply, but we will need to market electrical private transport, public transport, and new energy sources.
Every project which is a failure, or is even delayed or made more expensive, robs us of opportunities to complete other projects, both by taking scarce capital resources, and by tying up the engineers, planners (and politicians) who are needed to make them.
Table 1 shows some of the developments which we can expect to need in the coming years, and some of the problems which are likely to arise.
How things go wrong
One of the major triumphs of modern engineering is the enormous increase in reliability of equipment. In the 1960s, repairs of main central computers could be expected every day. Today, personal computers, which are much more powerful and have thousands of components, work for years. Televisions used to fail regularly on a yearly basis. Now one can expect most television sets to operate until they are obsolete. Even an established technology, such as the passenger car engine, now has a working lifetime up to four times that of similarly specified engines in the 1960s, and requires less maintenance. In process systems which I have studied in some depth, the failure rate for the best components today is one tenth of that for similarly functioning components in the 1980s (Taylor 2000).
The reason for this enormous improvement in reliability has primarily been the work expended in researching the root causes of component failure, and improvement in materials, production processes, and detailed design of components. Reliability and risk assessment techniques have contributed a little to the overall improvement, by providing a background of reliability knowledge, and an incentive for improvement.
One side effect of this is that component failure, which is the traditional topic of reliability and risk assessment, has increasingly become a minor contributor to the overall pattern of systems failure, and particularly to catastrophic failures. Figures 1 to 3 show statistics for the root causes of 120 accidents occurring in the process industries (data taken from Drogaris, 1990, my analysis). As can be seen, management error or inadequacy, and design error, have come to dominate the causes of accidents.
When we consider other fields of work, and particularly large systems, design error begins to take an even more important role. Failures on space systems, for example, are nearly all dependent on design weaknesses or errors. This is not because the designers are poor, but rather because the traditional causes of failure have been eliminated by quality control, testing, and good component design. Systems design and management problems have come to dominate because other causes have been eliminated.
One interesting aspect of this is that traditional “human error”, by which was meant operator error, is also beginning to diminish in importance. Particularly the use of computers for sequential control, with heavy interlocking, has started to reduce the scope for error in the control room. The effect of this does not show in industry statistics yet, since the change in design practices is relatively new, but the effect can be seen on individual plants and units. Unfortunately, the same technical improvement is not observable for maintenance operations, but the heavy emphasis on process safety management in some industries and countries is showing at least a diminishing frequency of lost time accidents, and of major systems disruption accidents, though not of employee fatalities.
We make errors, but why?
Design engineers and operations engineers are trained to create, and to get things done. The good ones develop a pride in their ability to get systems to work. Managements prize and reward this ability, and such people are promoted as project, system and plant leaders.
One of the abilities which enables good engineers to work is the ability to concentrate and focus, to set goals and then to strive for them. This kind of performance sets a premium on forming a model of how things should work, and then committing all resources to ensuring that projects and systems do work in this way. The emphasis on success is matched by a tendency to suppress thoughts of what might go wrong. While those with experience, and a reflective attitude, may do some contingency planning, it is very common for highly goal oriented engineers to reject all activities which take resources from their main goals or delay projects.
An example of this was the manager of a large gas compressor platform, serving several fields. In an audit, the maintenance procedures were found to be inadequate, with no proper permit to work system. The manager stated that he could not get work done quickly enough when using such systems. He preferred a system of so called “safe working procedures” which he expected all employees to know. He rejected the audit findings. Six years later, a maintenance team made an error, opening a vessel after shutting off the inflow side only. Gas escaped, and the ensuing release killed 11 people. It also put the gas fields out of operation for nearly one year.
Similar problems can be detected in the accounts of the Challenger space shuttle accident, in which the potential for failure of the booster rocket seals was a known design problem, but was “evaluated” to be of secondary importance.
The cases show one feature of such judgements: a reluctance even to allocate resources to investigate potential problems in detail, if it means interference with the primary goal of the organisation, and especially of the project leader. One should remember that it is easy to judge in retrospect, but at the time of the decision, problems are often tentative, and poorly documented with practical data. Engineers and especially project managers are forced to make decisions every day, often with inadequate data, and may easily get into the habit of making “positive decisions” unless negative information is concrete and definite.
This kind of problem is now being solved by many companies, by establishment of fixed procedures for safety assessment or risk analysis. In many companies, hazard and operability analysis is made a precondition for budget approvals, placing safety assessment on the critical path to project approval. From observation, though, this is still an area where there is conflict, even in the companies which insist on proper risk assessment.
The picture is not all bad. Many goal oriented project managers today recognise that good analysis at the start can actually speed their projects, and use the hazop process in particular to ensure communication between different disciplines and project groups. A few of the project managers I have worked with have been very critical of the existing risk assessment procedures, and have worked hard to improve them, striving for the “perfect project”.
System complexity and design communication
One of the problems arising in many modern systems is their complexity. This often means that many people must be involved. It is nearly impossible for everyone in the project to know everything about the rest of the system. Each engineer works with specifications delivered by other engineers. If the task is a standard one, or if the specification is clear, there is no problem. In complex cases, though, it may be impossible to get the specification right first time. Specifications are modified and upgraded through the course of the project, often with little documentation.
This approach is effectively one of “design by testing” or “trial and error”. By the time the system is commissioned, it may work well, but there will probably be weaknesses in the design. These are gradually revealed during operations, in most cases by alert operators who limit the effects. In some cases the results are revealed as an accident or a complete system breakdown.
As an example of this, an instrument engineer was provided with a specification for a temperature alarm on a chlorine vessel. There was no nozzle in which to mount the sensor however, so he placed it on the outlet. The alarm failed to work when flow was stopped, because the sensor did not touch the cold chlorine. Due to a separate failure, the chlorine continued to boil, cooled, and cracked heat exchanger tubes, leading to a chlorine release ….
This case illustrates another feature of the pattern of failures in engineering. Specifications are written to describe what we want, but not why. As a result, the redundancy which could allow colleagues to interpret the specification precisely according to objective or purpose, and even to suggest better solutions, is lost. The use of hazop groups to discuss performance of systems, and to provide a forum for discussion, is one of the more promising ways to overcome the weaknesses.
Another case was the failure of the first Ariane 5 launcher. Here one piece of software had been developed for the Ariane 4 launcher, and approved for space operations. On Ariane 5, the parameters of the system were different, and a gyroscope initialisation routine failed. The guidance system failed, and the launcher was destroyed to prevent accidents. In this case, the approval for the first launcher was correct, but the limitations of the software were not recognised. Nowadays, reuse of software is supposed to be made only after a function and limitations review.
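The published inquiry report on this accident attributed the software failure to an unprotected conversion of a 64-bit floating point value to a 16-bit signed integer, which overflowed on the Ariane 5 trajectory. The Python sketch below shows the kind of explicit range check which turns such a hidden limitation into a visible one. The function name and the test values are illustrative assumptions, and this is not the flight code.

```python
import math

# Range of a signed 16-bit integer.
INT16_MIN, INT16_MAX = -32768, 32767


def to_int16_checked(value: float) -> int:
    """Convert a float to the signed 16-bit range, refusing values that do not fit.

    Hypothetical helper for illustration: the limitation of the conversion
    is stated in the code, instead of being an unwritten assumption about
    the range of the input.
    """
    if not math.isfinite(value):
        raise ValueError(f"non-finite value: {value!r}")
    n = int(value)  # truncates toward zero
    if n < INT16_MIN or n > INT16_MAX:
        raise ValueError(f"{value!r} does not fit in a signed 16-bit integer")
    return n


# Illustrative readings only: one inside the range, one outside it.
for reading in (1234.9, 40000.0):
    try:
        print(reading, "->", to_int16_checked(reading))
    except ValueError as err:
        print("rejected:", err)
```

A check of this kind does not decide what the system should do with an out-of-range value. That remains a design question for the function and limitations review.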
The devil in the details
One of the problems, even on large projects, is that some small detail can wreck a large concept. A rubber seal on a booster rocket of the Challenger space shuttle, one of thousands of such design details, cost a crew their lives, and the space programme several years of progress.
As another example, a tunnel boring machine was designed with an air lock, because it was expected to meet pockets of water. Technicians could access the cutting head of the machine to replace cutting picks and disks, and for general maintenance. The design envisaged that the lock would be closed whenever there was a risk of the drill meeting a pocket of water.
In practice, the drill tended to clog in some types of ground. To get over this, operators were stationed at times at the drilling face and sprayed water on the cutter. This required water hoses, and the water hoses had to pass through the open door. As can be predicted with hindsight, when a pocket of water was met, the tunnel flooded. Fortunately, the operators all escaped, but the project was delayed by six months.
Detail design guidelines for airlocks indicate that there should always be lead throughs alongside them, for hoses, power cables, and attachments for safety lines. There should also be proper and tested procedures for escape. Such sets of detail design guidelines are openly published in some cases, for example for LPG storage design, for oil tank design, for pressure vessels, and for fire protection. Large companies often have their own company standards. Many areas of design are not covered by detailing standards, however, and smaller companies generally do not have access to guidelines for detailing at all. Academic access to detail design information is minimal. As a result, each young engineer must learn afresh the details, often by copying from earlier designs. If the young engineer is lucky, he will have an experienced mentor, who can give some of the background and experience.
Ignorance of physics can kill
Often, the cause of a systems failure or accident is an unusual or unexpected effect, which the engineer has difficulty in understanding. This is particularly so for effects which occur once in a lifetime or less.
An example of this is the hammering effect which can occur when liquefied gas, such as LPG, ammonia, or chlorine, is filled into a pipe. When filling is almost complete, the pressure causes the gas bubble to collapse, and the irresistible momentum of the flow meets the immovable wall of the pipe or vessel. Unfortunately, the wall is in some cases quite movable, by fracture. This happened at the Texas City refinery in 1984, releasing a cloud of LPG, which ignited. The ensuing explosions destroyed about half of the refinery.
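The size of such a hammer pressure can be estimated roughly with the classical Joukowsky relation, pressure rise = density x wave speed x change in flow velocity. The Python sketch below is only an order-of-magnitude illustration. The function name and the input values (roughly those of water in a steel pipe) are assumptions, not data from the incident described above.

```python
def joukowsky_surge_pa(density_kg_m3: float,
                       wave_speed_m_s: float,
                       velocity_change_m_s: float) -> float:
    """Pressure rise in Pa when a liquid column is stopped abruptly.

    Classical Joukowsky relation: dp = rho * a * dv. It ignores pipe
    elasticity details, friction, and gas content, so it is an estimate
    for an idealised sudden stop only.
    """
    return density_kg_m3 * wave_speed_m_s * velocity_change_m_s


# Assumed, illustrative inputs: density 1000 kg/m3, pressure wave speed
# 1200 m/s, and a flow of 2 m/s brought to rest.
surge_pa = joukowsky_surge_pa(1000.0, 1200.0, 2.0)
print(f"surge of about {surge_pa / 1e5:.0f} bar")
```

Even a modest flow velocity gives a surge of tens of bar in this idealised case, added to the normal operating pressure, which is why the effect can fracture a pipe.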
Some examples of physical effects which are not widely known to engineers, and are sometimes overlooked by complete engineering organisations, are:
- Water hammer
- Filling pipes
- Closing valves
- Liquid slugs in gas pipes
- Opening a valve draining a pressurised vessel
- Liquefied gas bubble collapse
- Spurt on filling
- Cavitation
- Flow excited pipe vibration
- Organ pipe resonance
- Vertical two phase flow
- Slug flow
- Vibration transmission via supports
- Vibration overstress
- Interference and fretting
- Evaporation cooling
- Joule effect cooling
- Bump boiling
- Boiling jump
- Crud build up
- Accidental concentration
- Fire induced tank explosion
- Pitting corrosion
- Crevice corrosion
- Erosive corrosion
- Contaminative corrosion
- Galvanic corrosion
- Combustive corrosion
- Corrosion catalysis
- Solid blocking, arching
- Float off
- Foundation undermining
- Settling stress
- Support misadjustment
- Expansion stress
- Pull out
- Vessel ballooning
- Pressure unscrewing
- Cocking, ratchetting
- Stick / slip
- Wheels climbing on rails
The most difficult aspect of these problems is the need not only to know of these effects, but to be familiar with them, and to be able to judge when they will be significant. To my knowledge there is no textbook which deals with all these problems, and in many cases the individual engineer must learn most of this from experience.
The actual teaching of engineering has changed style over the last thirty years, with much more emphasis on mathematical techniques, and less on practical design. As a result, it is noticeable that young engineers often have difficulty in understanding problems, even when they are explained. In many cases I have found it necessary to demonstrate effects by experiment, before understanding is achieved.
Organisation of engineering activities is an important feature of success or failure for projects. The traditional hierarchical, discipline based organisation of instrumentation, power, process, structures, planning etc. fails regularly in modern accelerated projects. Too much must be done in parallel. Any error or misunderstanding in one group affects others, who then have additional correction work, often without accompanying resources. The first casualty of such problems is quality control. An additional problem is that in such organisations, particularly the largest ones, the individual engineer never sees his own product, and the possibility of learning from experience is reduced to a minimum.
The situation is better for smaller production companies with in house project teams, or with allianced engineering companies. These are not the companies, however, who carry out very large projects.
Project oriented organisation is usually much better in achieving goals reliably, provided that all the necessary engineering disciplines can be staffed with suitably experienced people. Matrix organisations, in which each discipline engineer has a framework organisation looking after tools, standards, consolidating experience, and providing continuity provide an improvement on simple project organisation, but are often difficult to manage, with inherent load balancing problems.
One problem which often arises when resource and budget limits are tight, or when senior management focuses on short term profit, is that short term thinking comes to dominate. This is particularly the case in turnkey projects, where the project leader’s objective is often to get the plant working and accepted as cheaply as possible. The combination of a highly qualified and cynical turnkey project leader, and an inexperienced or overworked project manager, is a recipe for disaster during the early years of operation, and a constant drag on systems operation.
As an example, in one audit the maintenance budget was found to have been reduced by 50%, and a complete system overhaul postponed by one year. The manager was promoted to a different plant at the end of the two year period. His successor was left with an ailing plant in acute need of repairs and overhaul. This is one of many examples. In another case, a large ammonia plant was found to be dependent on one 10 HP pump. The normal redundant spare had been “saved” by the turnkey project team. As a result, the plant operated with a continuous ammonia leak, and an area marked as hazardous. Repair was just too expensive, since it required a shutdown costing $0.5 million. (After the audit, a spare pump was installed.)
Risk analysis in engineering
Risk analysis and safety analysis have by now become accepted practices in most industries, with quite varying quality of work. One of the most important effects of this has been to ensure a greater understanding of systems by the engineers, and a better dialogue between them. This tendency will increase, particularly in Europe, as a result of initiatives like the Machinery Directive and the Seveso II Directive, which require risk analysis. One unfortunate tendency, though, is for risk analyses to be carried out by consultants with little contact with the design or operating staff.
To achieve full benefit, in fact the most important benefits, the design and operating staffs need to be intimately involved in the analyses. The role of consultants should at most be facilitator, or alternatively they should be directly involved in the project as part of the team.
The cycle time for risk assessments, particularly those to be transmitted to authorities, has traditionally been several months, in some cases a year or more. The typical maximum time for design decision making in projects varies from days or weeks in the chemical industry, to months in nuclear power and aerospace industries. To be relevant, risk assessment needs to be carried out on a short time scale. Fortunately, with the tools available today, a typical time scale for a risk assessment on a new process plant can be as short as two weeks. This requires experience and continuity in establishing resources however, and it is only a few companies which can achieve this at present.
One danger is the assumption that safety assessment will solve all project problems. Table 2 shows the results of an assessment of completeness of 35 risk assessments made by project teams (Taylor 1996).
In order to overcome this problem, effort is needed in achieving quality. Maximum use of the company’s own experience is essential. One thing which can increase the quality is to require that all accidents which have affected similar plants, and for which accident reports are available, are reviewed by the project team.
The Cassandra effect
Cassandra was a princess of Troy. She was blessed with one gift and two curses. She could foresee the future, but only the bad parts. And no one would believe her.
Risk assessment is an exercise in prediction of accidents and failures. It is doomed to be disbelieved by engineers, planners, and politicians, if the results do not coincide with their objectives. In over 300 projects over 30 years, I have seen projections actualised in 12 cases, with many fatalities. In only one case did the management actually refuse a recommendation.
In the other cases the accidents resulted from delays, “difficulties”, and organisational problems, which delayed improvements until after the accident occurred.
To be useful, the risk assessment must be believed. One technique which is essential is for the analyst to be able to point to earlier accidents which support the analysis. It helps if you have photographs, because these excite many parts of engineering managers' minds, and make them think. The effect is also rapid. Another essential is to make recommendations practical, and to give them priorities. A good rule of thumb is that any committee meeting can at most make five decisions, so if you have more recommendations than this, group them.
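The rule of thumb can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: each recommendation is a (priority, text) pair with priority 1 as the most urgent, and the function name and the example list are invented for illustration. In practice one would more likely group by theme or by responsible department, but the limit on the number of bundles is the same.

```python
import math


def bundle_recommendations(recommendations, max_decisions=5):
    """Split (priority, text) pairs into at most `max_decisions` bundles.

    Priority 1 is assumed to be the most urgent. Items are ranked by
    priority and cut into consecutive bundles of roughly equal size, so
    the first bundle always holds the most urgent items.
    """
    ranked = sorted(recommendations, key=lambda rec: rec[0])
    if not ranked:
        return []
    size = math.ceil(len(ranked) / max_decisions)
    return [ranked[i:i + size] for i in range(0, len(ranked), size)]


# Hypothetical list of 12 recommendations, invented for illustration.
recs = [(1 + i % 3, f"recommendation {i + 1}") for i in range(12)]
bundles = bundle_recommendations(recs)
for number, bundle in enumerate(bundles, start=1):
    print(number, [text for _, text in bundle])
```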
Recipe for problem free projects?
With these observations, is it possible to draw some conclusions for the challenges which will meet us in the 21st century? One thing that is certain is that projects will become larger in scale and more complex. Another is that design and management problems will increasingly dominate. Some lessons which we could perhaps learn from the observations described are:
- Engineering organisation is important; the properties and performance of the organisation need to be studied, and improved.
- The quality of safety and risk assessment, and of quality assurance, needs to be studied.
- Organisations need to build risk assessment intimately into their design and operations. Risk assessment is not a specialist activity, to be carried out by a specialist priesthood; it is needed by every engineer.
- Therefore, all engineers need to be taught the rudiments of risk assessment, safety engineering practices in their own disciplines, and the practice of design review.
Above all, we need to recognise the fallibility of engineering in order to be able to make it less fallible. Remember, engineers and managers are also human.
Taylor, J.R. - 1999 Review of failure rates data for risk assessment, ITSA 1999
Drogaris, G. - 1993 Major Accidents Reporting System. Commission of the European Communities, 1993
Taylor, J.R. - 1996 Quality and completeness of risk assessment, ITSA 1996
Taylor Associates ApS
Erantisvej 5, 4171 Glumsoe, Denmark