Thoughts on Maintenance Safety and Training

One of my favorite expressions comes from the use of a “petard,” a device invented in the sixteenth century. The expression is to “be hoisted by your own petard.” The word is French and refers to a shaped bomb used to breach the walls of a fortified structure. I guess being blown up by your own petard was common enough to make it an expression.

I doubt anyone would disagree that, despite our efforts, most of our mistakes have been made before and will likely be made again. An operational example of such an error might be wire strikes during helicopter operations. During maintenance, failing to properly torque a jam nut or to clear an area of foreign objects falls into this category of errors. We all need to recognize that it is much more difficult to put safety into practice than it is to talk about safety improvements. We must be keenly aware of the need for ways to minimize risks to people and equipment and also ways to provide institutional safety management.

Too often we think of maintenance safety as beginning on the shop floor rather than in the office of the CEO. When the CEO stands in front of everyone and says, “Safety is job No. 1,” and then returns to his or her office with no further impact on policy, a huge part of the safety equation can go unsolved. A safety inspector might be able to check the boxes showing that all the firefighting equipment is in place and properly maintained. But if the insides of the refinery pipes are old and corroded, the resultant mayhem would not be the result of an error made by one person who might have been a little careless and bumped into a weakened pipe, causing the leak that allowed the fire to take place. It would be an institutional failure to deal with the real issue of aging pipes carrying combustible material. Perhaps the CEO thought that pipe replacement could wait for the next fiscal year due to a tight budget.

My point is that we often set ourselves up for potential errors by neglecting to see what the real “root cause” is for the sorts of errors we are likely to make.

SMS and Root Cause

The root cause for most maintenance-related accidents isn’t that someone forgot a step or assembled something incorrectly, even though that is what made the incident known. The root cause is found in why the step was missed or what factors allowed an assembly to be put together incorrectly. We should ask, “Why didn’t someone follow the written procedure?” rather than simply stating it is important to do so. We shouldn’t simply retrain or reprimand someone without understanding why they skipped or missed a step. The days of firing someone for a serious mistake without looking into why it happened should be gone, if they are not already. Only when we get answers to questions like these can we make appropriate changes to how we operate and thereby improve safety.

We need to understand that since there are usually multiple reasons errors are made (the “links in the chain”), if we only deal with the last link in a chain then we leave the other links in place to bite us again. Our maintenance training should include not only how to perform certain tasks but also how to recognize conditions that lead to the first (or second or third) link in a chain of events causing a maintenance error. This is at the heart of what a safety management system (SMS) tries to accomplish. Having an SMS plan in place doesn’t guarantee the desired result of fewer errors. People might still sign off work long after it was completed and inspectors might sometimes still place a stamp without really inspecting the work. Documentation does not equal safety, but we often act as though it does.

System Safety

System safety (the precursor to SMS) has been part of aviation since the mid-1940s. The Department of Defense (DOD) military standard MIL-STD-882E details how a system safety program should be configured for DOD projects. If you read this document and then read an article by Fred A. Manuele (in “Professional Safety,” Oct. 2011) you will understand why the DOD specification only gives you part of the solution. Manuele makes the point that it is a myth that the principal cause of occupational accidents is unsafe acts by individuals. His point is made in saying, “the emphasis is now properly placed on improving the work system, rather than on worker behavior.” The implication is that worker behavior will change for the better if the institution promotes it. System safety, as detailed in MIL-STD-882E, is a data-based engineering approach. It can leave you looking for answers to the question of how to interpret all the data your safety program collects and what to actually do with it. We are always quick to say “be safe” at the end of a message, but we don’t spend enough time telling people HOW to be safe.

Rules and checklists by themselves do not make us safe; they require great effort and attention to detail to make them have the desired effect. It takes much more continuing effort to make an SMS program work than it does to write one. Looking back at errors is the easy part. The past is relatively easy to talk about; it is fixed. The future is blurred and hopelessly complex with many possible outcomes. We pick apart our errors well, but how do we prevent them from happening again? The older I get, the more I use a mini-checklist by asking, “What is the worst that could happen?” when I start or finish a task. Even so, I sometimes find myself saying, “Well, I didn’t see that coming.” For example, I have found that using chainsaws and ladders at the same time is a good time to ask that question (but that’s another story).

B-52 Jack Screw Assemblies

In one of the most interesting aviation maintenance errors I have read about, a B-52 lost both of the inboard flap panels after takeoff. The flap jack screw assemblies had been overhauled and parts were left out. One interesting thing is that in the opinion summary of this report, the root cause is stated to be the fact that an individual failed to install two retaining caps and that the ancillary causes had to do with failure to provide sufficient oversight on various levels. Although it might just be semantics, I would state the root cause as lack of oversight and that failure to install all the parts was the result of this problem. The institution set up the technician to fall into this trap.

Amazingly, nobody got hurt. The aircraft damage came to $1.8 million. No matter what we do, we will find a way to circumvent our precautions. If you take away nothing more from what I’m writing, it should be that no individual and no procedure we devise will keep us free of maintenance errors.

The Unpredictability of Human Behavior

Our efforts might be relentless, but they will never make us 100-percent effective/safe because human behavior is too hard to predict for the narrow stream of events that lead up to a maintenance error. Acknowledging our failure to achieve the goal of 100-percent safety is a first step in trying to see the root causes of errors to come. We know the next maintenance error is around the corner despite our rules and checklists.

Any time a technology changes is a time to be extra careful. In 1935, the Boeing 299 (B-17 prototype) took off on a demonstration flight with the empennage control locks engaged. It crashed and burned because the crew had no checklist. It was perhaps the first aircraft of its size that had control locks that could be disengaged from the cockpit. That is where checklists came from. We still fail to use them as often as we should. Pilots need to know some things by rote because of the brief time they might have to deal with a problem. Mechanics need to be more introspective, almost as if they need to be able to watch themselves work, and try to see how the links in the chain of events get laid out before the step that appears to have caused the error.

It probably isn’t only that I failed to read the book that caused the mistake, but maybe that I was in a rush because the boss had innocently asked how much longer it would take me and I had planned to leave a half hour early today. Often we have our scheduled maintenance timeline laid out to perfection when something unscheduled pops up, requiring a reshuffle of our priorities. The pressure to find some wiggle room can result in one or the other being slighted. This can be especially difficult when the unscheduled issue requires large amounts of resources (which might or might not be immediately available) to investigate and correct.

Here is a George Carlin quotation I like: “When someone is impatient and says, ‘I haven’t got all day,’ I always wonder, how can that be? How can you not have all day?”

We spend a lot of time in human factors training discussing what causes maintenance errors. Examples include being in a rush, complacency, fatigue, failure to use written documentation, lack of communication, normalization of bad procedures and distractions. Thanks to Gordon Dupont and Transport Canada, we have the “Dirty Dozen.” If you train people to recognize these and teach them how to avoid becoming prey to them, it should reduce the errors made. The difficulty is in seeing the opportunities for error coming and knowing that for every Dirty Dozen factor, there are deeper causes that we do not often address. This is because root causes are usually buried in human nature. You can’t just put up a “Dirty Dozen” poster and think, “OK, all we have to do is NOT fall prey to any of these things” and expect that to happen. Knowing that you feel like you are being rushed doesn’t in any way offer relief from feeling that way. It’s one thing to talk about not working through fatigue but quite another for a supervisor to say, “We’re all tired, so let’s go home early and come in late tomorrow because we’ve worked 10 hours a day for the last three days,” or for you to say, “Sorry boss, I’ve been here for eight hours and am too tired to get in the truck and fix a broken machine in the field.” Even if we did do these things, we are often fatigued not because of how long we worked on the job but because of what we do outside of work. The real issue might be how long a commute someone has or whether or not they are getting enough sleep.

We might skip the written documentation not because we are lazy but because we honestly think we know what it says. We never imagine when distracted that it will end with, “Where did I put that tool?” We don’t know we forgot something BECAUSE WE FORGOT IT! I once worked with a pilot who lit off the engine in his Jet Ranger with the blades tied down. He never wanted to do that again, so he put a white sock on his cyclic stick post flight so it would remind him to untie the blades if he hadn’t done so. This worked for him for a while, and then he did it again, with the sock in place. A good pre-flight using a checklist is what he really needed to do.

The Six Emotional States

We use the “Dirty Dozen” as convenient labels for what causes errors, but I think what is behind these labels is where we need to be looking. You probably know of the “seven deadly sins” which are pride, envy, anger, greed, laziness, gluttony and lust. I’m going to borrow them out of context and drop gluttony, lust, anger and greed for this discussion, but I’ll add forgetfulness, deceit and fear. Instead of the seven deadly sins we now have the six sometimes-good, sometimes-bad emotional states: pride, envy, laziness, fear, deceit and forgetfulness. We can control these pretty well except for forgetfulness. In my mind, these are the root causes behind the “Dirty Dozen.” They are the things that allow us to make the choices such as, “I can skip that step in the procedure,” or, “No time for the written checklist, I know it by heart.” The weird part is that some of these can strengthen our ability to produce a safe product or they can work against us. You can be proud of your work (good) or you can have false pride (bad) and think your work is better than it really is.

A coworker once told me about his uncle who had a variety of gasoline-powered yard tools. After each use he’d remove and clean the spark plug, drain the gas, re-paint the muffler, lube wheels and cable controls and so on. Who in their right mind would do this? His uncle’s attitude was, “Who wouldn’t do these things to assure they would operate as they should the next time he needed them?” He was proud that his tools started when he went to use them. My tools don’t always cooperate but I just can’t bring myself to spend the time doing what this fella did. Who is right?

The Carlin quote is so obvious that it hurts. How can we not take all the time we need to make sure we turn out a product that is as safe as possible? Do we ever take all possible steps to ensure safety? Where do we draw the line? Anyone who has spent years maintaining aircraft has made a mistake of omission, forgotten or been distracted from completing a task, failed to use the documentation, thought they knew better or been too embarrassed to ask a question. A small step like calling on a co-worker to check your work is easy to skip because skipping it saves time. Why have someone check to see that you’ve put in four cotter keys (or was it five)?

Having been an inspector, I understand why, when called on to inspect a job, it was more common than I’d previously thought to find something wrong. I believe if our jobs were reversed, the person looking at my work would probably fill the job of inspector as well as I did and find the same sort of discrepancies. I believe it’s similar to why we often remember something when we stop thinking about it. The person in the inspector role knows their job is to find things and is not wrapped up in the detail of performing the task; they look only at how what can be seen and documented conforms to what it should look like. We know it isn’t wise to inspect our own work but don’t spend much time thinking about why that is a truism or what we would need to do to make that less of an issue. In a “perfect” world, inspectors would find nothing needing additional attention. We tend to believe that having one additional set of eyes is enough for most QA functions. Is this ever wrong? Of course it is. This fact also helps us understand that paying attention to detail, although a very large part of the safety equation, will not eliminate serious errors; the belief that it does is another myth often repeated in safety management.

Death and destruction can still happen when the best safety system possible is in place because somebody was busy sending a text message while driving the train. The sterile cockpit concept was implemented by the airline industry because of this sort of scenario. The airlines required a “cultural” change because it was clear that to a large extent, individuals were not capable of paying attention to detail without an industry-wide push and the creation of universally accepted policy, with the understanding that there were consequences if it was not followed strictly. That cultural change had to be initiated and enforced from top management all the way to the cockpit, the same way the development of cockpit resource management (CRM) was not going to happen just because some pilots liked the idea.

Mechanics and Training

Law enforcement officers, medical professionals, firefighters and pilots complete a great deal of training. The cost of this training is absorbed because we know it pays off when the skills are needed. Once an A&P mechanic gets the basic federal certificate, OEM training on the aircraft or simply training on how to do the job safely can become the exception rather than the rule. OEM aircraft factory courses are expensive. I’ve met many mechanics who’ve been maintaining aircraft for 10 or 15 years and who have never been to any OEM schools. Working on makes or models for which there has been no “formal training” is legal in the U.S. but not in many other countries.

Having an OEM instructor next to you showing you how to perform a task is invaluable. Being shown jobs that are performed infrequently is a confidence booster when it is time to do them. On-the-job training is great but there is no gauge for it. It can be better than OEM training in some cases or dismally bad in others. In both cases it would legally qualify a mechanic (in a non-repair station environment) to perform a critical task, or supervise others performing it, AND return an aircraft to service. No test or qualification checklist, no check ride and no oversight other than the person training you is required.

Why isn’t everyone OEM factory trained? Why is it so hard for some organizations, private or public, to justify increasing the amount of initial and recurrent training that is provided to mechanics? Some would say, “If you have an A&P, doesn’t that mean you have been trained?” Think about this: a certificated mechanic may not exercise the privileges of their certificate and rating unless they understand the current instructions of the manufacturer and the maintenance manuals for the specific operation concerned (14 CFR 65.81(b)). Are there any other requirements? Within the last 24 months you must have worked actively for six months using your certificates (14 CFR 65.83). What does that entitle you to do? Section 65.81 also says that a properly rated mechanic can perform or supervise maintenance for which they are rated; however, they may not supervise someone else or return an aircraft to service unless they have satisfactorily performed the work at an earlier date. If they have done it before (there is no time limit on how long ago they may have satisfactorily performed this work) they can perform or supervise it. If they haven’t done it before they can perform it, but in order to supervise it, they would have to perform it by demonstration to the satisfaction of the administrator or to an appropriately rated mechanic or repairman who had done this job before. (Again, there is no time limit as to when they did it last.)
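
To make that chain of conditions easier to follow, here is a minimal sketch in Python of the rules as paraphrased in the paragraph above. The class, field and function names are hypothetical, the logic is only one reading of that paraphrase, and none of it should be taken as the regulation itself or as legal guidance.

```python
from dataclasses import dataclass


@dataclass
class Mechanic:
    """Hypothetical record of one mechanic's standing for one specific task."""
    rated_for_work: bool                    # holds the appropriate rating
    understands_current_instructions: bool  # condition paraphrased from 65.81(b)
    months_active_in_last_24: int           # condition paraphrased from 65.83
    performed_task_before: bool             # at ANY earlier date; no time limit
    demonstrated_task: bool = False         # shown to the administrator or to a
                                            # rated mechanic/repairman with experience


def may_exercise_privileges(m: Mechanic) -> bool:
    """Baseline conditions for using the certificate at all (as paraphrased)."""
    return (
        m.rated_for_work
        and m.understands_current_instructions
        and m.months_active_in_last_24 >= 6
    )


def may_perform(m: Mechanic) -> bool:
    """Performing the task needs only the baseline conditions."""
    return may_exercise_privileges(m)


def may_supervise_or_return_to_service(m: Mechanic) -> bool:
    """Supervising or returning to service also needs prior performance,
    or a demonstration in place of it. Note there is no recency check."""
    return may_exercise_privileges(m) and (
        m.performed_task_before or m.demonstrated_task
    )


# Two hypothetical mechanics: one did the job once long ago, one never has.
veteran = Mechanic(True, True, 24, performed_task_before=True)
newcomer = Mechanic(True, True, 24, performed_task_before=False)

print(may_supervise_or_return_to_service(veteran))   # True
print(may_perform(newcomer))                         # True
print(may_supervise_or_return_to_service(newcomer))  # False
```

Notice what the sketch never asks: how long ago the task was last performed, or whether anyone tested the result. That gap is the point of the paragraph above.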

I’m all in favor of passing along information by “each one teach one” but now we have a potential for bad shortcuts, forgotten cautions or complacency resulting in the performance of the task being something other than what the manufacturer had in mind. This kind of training might be fine for much of what we do. Police, firefighters, airline pilots, nurses and many other occupations are regularly doing recurrent training or requalification (endorsed from the top down) but floor mechanics or field mechanics are often seen as being too busy making things go, and their time can’t be spared or there is no money for it. Besides, aren’t their errors often caught before an aircraft rolls out the door or takes off? We can only hope. Pilots have check airmen who meet written and approved qualifications for each company. There is no equivalent of a check airman for maintenance. There is only the requirement that if you have a repair station, you must have a training manual and a person responsible for training. In small shops the person who pulls the training trigger is likely the owner/CFO of the company and they have many issues on their plates besides knowing that more training is better.

Confidence

Confidence is another thing that everybody needs to have to do a job well. Who is likely to be more confident: someone who was trained on a helicopter by on-the-job training in dribs and drabs and is now being pressured to finish a job, or someone who has had an investment made in them by OEM training and is being told at all levels that being safe and late is better than being on time and not being sure? Why do we ever light the fuse on the petard and hope we’ll be out of the way when it blows? It comes down to money, time and success. If we are airborne law enforcement, we want to be successful and complete the mission. Helicopter emergency medical service (HEMS) crews want to complete the mission and make money doing it. Aerial firefighters want to put out fires and save people and property. All of these occupations are full of “can-do” personalities. That is a good thing when the going gets tough but it is also potentially bad when pride, anger, envy or fear begin to rule our decision-making process.

Fatigue alone might not cause an accident, but being afraid to say no when pressured to get that engine change done is all that might be needed to make the accident investigator point to fatigue as the cause when the real cause was your being reluctant to stand down because of fear of consequences or inconvenience. I offer no immediate solution to this but I think it needs to be discussed more from a “how to fix it” point of view than from a “here’s the problem” perspective. Having a crew watch the latest SMS presentation of the “Dirty Dozen” human failings that can lead to maintenance errors is fine as long as the factors leading up to the errors are presented in real-world representations and offer some way you can actually avoid them. I am tired of hearing a rehashing of these items pawned off as recurrent training. It takes a commitment from management to allow a fatigued or stressed-out employee to have a system in place that allows them to stop working, and yet, we always talk about how fatigue is such a serious problem. Limiting duty time helps, but does not ensure people are rested and focused on the job at hand. Many times we are the only party responsible for coming to work fatigued.

Making sure people use written documentation is doubly difficult because people can read things and interpret them incorrectly or decide on their own that there is a better way and that the person who wrote that procedure must surely never have worked on a real helicopter. Promoting employee assertiveness is a “Dirty Dozen” item but it’s of no value or has a negative impact if management doesn’t listen attentively and respond appropriately. For the most part, failure to use documentation is seen as a time saver because we know how to do a particular job. The root cause for a maintenance error due to this is fear (of taking too much time or of having someone snicker because you are using the book instead of your natural-born talent of knowing everything there is to know) or laziness because you are comfortable and think you know everything you need and reminders are not necessary. It is a fine line, but failure to use written documentation doesn’t cause errors — rather, it is failure to complete the procedure according to what is written that causes the error and we shouldn’t risk not using reminders when the task is complex. We all forget stuff and need to remember that usually we don’t know better than the people who wrote the procedures.

Pressure Situations

Much of what pilots, police, fire and medical personnel must do involves split-second decisions that could mean life or death or damage to an aircraft. The aviation mechanic rarely has to make these kinds of time-critical decisions and this might partly explain why it feels like we can do without the same regimen or intensity of training. For various reasons, though, we make compromises to get something out the door. The pressure to meet a maintenance schedule comes from the top in any organization but we also have to deal with our own traditions and try to make sure that our own desire to “do good” doesn’t make us rush. The buy-in to an SMS program has to start at the highest level in a company and be part of everyday operations for all employees. The difficulty in meeting a deadline starts in the trenches because that is where the work gets done. These two forces have to meet on common ground and be understood by everyone.

“The movement from management to leadership, from fear to participation and from focusing on the self to the other are all elemental parts of the underlying paradigms that inspire the work of the organization,” says Daniel K. Judd, Ph.D. and assistant professor at Brigham Young University.

Heavy stuff, but please think about it the next time someone pokes fun at SMS being yet another layer of paperwork and lip service getting in the way of getting the job done. I know there are more and more converts but there are still plenty of folks who do not believe that implementing SMS is more important than having the document in place. SMS is really just an umbrella for what we should always be doing anyway.

Conclusions

Here are my conclusions and I’m sorry if they are a bit preachy. I have nothing to add to the Dirty Dozen; they are pretty much complete but they are only signposts. You have to take each one and say, “OK, how does this work or not work in my environment?” How we train and what we train for should not just be what is easiest to deal with. Psychological/behavioral issues are much more complicated than the procedures with which you try to address them (checklists, company policies, etc.). I think one important factor that will help ensure safety in the aircraft maintenance environment is to foster personal responsibility for yourself and everyone you work with. Avoid false pride. None of us are maintenance gods. Don’t be afraid to say, “I don’t know but I know where to look it up.” Humility, in my book, is far and away the most important trait I hope I have and that I appreciate in others. Know that you will forget something. Ask for help more often. Although I have stressed training and being taught, when it comes down to it, you have to be willing to learn. For Pete’s sake, use the checklist. Oh, and don’t forget to wear eye protection and clean the drill press after you use it.

Jon Robbins has been an A&P mechanic for 32 years. He holds IA and private pilot and instrument airplane ratings. He supervises the helicopter maintenance program at CAL FIRE. Robbins is also on the Helicopter Maintenance magazine advisory board. His e-mail address is jon.robbins@fire.ca.gov

About D.O.M. Magazine

D.O.M. magazine is the premier magazine for aviation maintenance management professionals. Its management-focused editorial provides information maintenance managers need and want including business best practices, professional development, regulatory, quality management, legal issues and more. The digital version of D.O.M. magazine is available for free on all devices (iOS, Android, and Amazon Kindle).
