Tag Archives: Cause Mapping

Fire at FAA Facility Sparks Flight Havoc

By Kim Smiley 

On Friday, September 26, 2014, air traffic was grounded for hours in the Chicago region following a fire in a Federal Aviation Administration facility in Aurora, Illinois. The snarl of flight issues affected thousands of travelers in the days following the fire as airports struggled to deal with the aftermath of more than 4,000 canceled flights and thousands more delayed.

A Cause Map, a format for performing a visual root cause analysis, can be used to analyze this issue.  To build a Cause Map, the first step is to define the problem by determining how the overall organizational goals are impacted.  In this example, there is a significant customer service impact because thousands of passengers had their travel plans disrupted. The flight cancelations and delays can be considered an impact to the production/schedule goal.  The amount of time and energy needed to address the flight disruptions along with the investigation into the issue would also be impacts to the labor goal.  Once the impacts to the goals are determined, the Cause Map is built by asking “why” questions and visually laying out the answers to show the cause-and-effect relationship.

Thousands of flights were canceled because air traffic control was unable to support them.  Air traffic control couldn’t perform their usual function because there was a fire in a building that provided air traffic support for a large portion of the upper Midwest and it wasn’t possible to quickly provide air traffic support from another location. Focusing on the fire itself first, the fire appears to have been intentionally set by a contractor who worked in the building.  He was able to bring in flammable materials and start a fire without anyone stopping him.  Police are still investigating his motives, but he has been charged with a felony. The building was evacuated once the fire was discovered and employees obviously couldn’t perform their usual duties during that time.  Additionally, the fire damaged equipment so air traffic control functionality could not be quickly restored once the initial crisis was addressed and it was safe to return to the building.

The second portion of the issue is that there wasn’t a way to support air traffic once the building was evacuated.  Once the fire occurred, all flights were grounded because there wasn’t air traffic control support and it was not possible to quickly get air traffic moving again.

The final step in the Cause Mapping process is to develop and implement solutions to reduce the risk of a similar problem.  Lawmakers have called for an investigation into this issue to see if there is sufficient redundancy in the air traffic control system.  In an ideal situation, a fire or other crisis at any single location would not cripple US air traffic to the extent that this issue did.  The investigation is also looking into the fire and reviewing the security at the facility to see if stricter restrictions should be put in place, such as ensuring that no employees work alone or searching bags as workers access the site.

This situation is also a strong reminder that organizations need to have a plan in place of what to do in case a failure occurs.  There was a previous fire scare at this same location earlier in 2014 when a smoking ceiling fan resulted in an evacuation and flight delays (see previous blog) that should have prompted some serious consideration of what the contingency plan should be if this facility was ever out of commission.

I was one of those people standing in line for hours at an airport on Friday morning after my flight was canceled.  And I for one would love to see the air traffic control system become more robust and better able to deal with the inevitable hiccups that occur.  It’s impossible to prevent every potential problem and another intentional fire in an FAA facility seems pretty farfetched, but it is possible to have a better plan in place to deal with issues that may arise.  The potential consequences of any single failure can be limited with a good plan and quick implementation of that plan.

Can Airline Seats Get Even Smaller?

By Kim Smiley

Was the experience the last time you flew wonderful?  Did you enjoy all the luxurious amenities like ample elbow room, stretching out your legs, and turning around in the bathroom?  Me neither.  Comfort certainly hasn’t been the top priority as airlines have shrunk seats to cram more passengers onboard, but a new patent application by Airbus really takes things to a whole new level.

They say that a picture is worth a thousand words and I think that is particularly true in this case.  This is a diagram from the patent application for the proposed seat design:

[Diagram of the proposed seat design from the Airbus patent application]

I’m not sure about the rest of you, but my backside is sore just thinking about an airplane seat that bears such a strong resemblance to a bicycle.

I attempted to build a Cause Map, a visual root cause analysis, in order to better understand how such a design could be proposed because I frankly find it mind-boggling.  The basic idea is that airlines would like to maximize profits and that putting more people on each flight allows more tickets to be sold, resulting in more money made.  The average airline seat width has already decreased to about 17 inches from the 18 inches typical for a long-haul airplane seat in the 1970s and 1980s.  Compounding the impact on passengers is the fact that the average passenger’s size has increased during that same time frame.  In general, larger bodies are being put in smaller seats, not a recipe for comfort.

I’m still having a hard time understanding how the correct answer to increasing airline profits is making seats even smaller.  I have to believe that passengers will balk at some point.  At some level of discomfort, a cheap ticket just won’t be cheap enough for me to be willing to endure a truly awful flight.  Even with electronic distractions and snacks, there has to be a point where people would just say no.

There also have to be a number of safety concerns that arise when the size of airplane seats is dramatically decreased.  Survivability in a crash is greatly influenced by seat design because airplane seats are designed to absorb energy and provide head injury protection during an accident.

Just to be clear, there is no plan to actually use this seat design anytime in the near future.  This is just a patent application.  As Airbus spokeswoman Mary Anne Greczyn said, “Many, if not most, of these concepts will never be developed, but in case the future of commercial aviation makes one of our patents relevant, our work is protected. Right now these patent filings are simply conceptual.” But somebody somewhere still thought this was a good enough idea that it should be patented…just in case.

Children Served Bleach from Reused Milk Jug

By ThinkReliability Staff

On September 11, 2014, a substitute teacher’s aide on her first day on the job was getting ready to pour water for morning snack. Unfortunately, what she poured from a reused plastic milk container was actually a bleach solution used for cleaning. The mistake was realized quickly, but not before 28 children and 2 adults ingested some of the bleach. Luckily the concentration was low enough that there were no injuries, although all who ingested the solution were seen at a local hospital.

The substitute teacher’s aide was fired and the school reopened the next day, though the New Jersey Department of Children and Families will be investigating. Clearly serving cleaning solution to children under your care is undesirable. However, firing the person most directly involved without fixing any of the issues that contributed to the mistake may leave an unacceptable risk for the issue to happen again. Although this appeared to be the first time anything like this happened on such a scale in a day care facility, the misuse of cleaning fluid due to confusing containers has happened before. Just this July a woman was given an epidural of cleaning fluid after containers were accidentally switched. (See our blog to learn more.)

Identifying the impacted goals and all the causes that led to those impacted goals allows for more solutions than just firing the person found to be most immediately responsible. The use of a Cause Map, a visual form of root cause analysis, diagrams all the cause-and-effect relationships in order to develop as many solutions as possible so the most effective among them can be implemented.

First the impacts to the goals are identified. The safety goal is impacted because of the potential for injury to the 28 children and 2 adults who drank the bleach solution. The bleach solution was stored in a food container, which can be considered an impact to the environmental goal. The customer service goal is impacted because the children and adults were served bleach solution. The day care worker being fired, and the ongoing investigation by the licensing agency, can both be considered impacts to the regulatory goal. Additionally, the treatment of all 30 who ingested the solution impacts the labor goal.

Beginning with one impacted goal, we ask “why” questions to determine cause-and-effect relationships. In this case, the safety goal impact of potential injury is due to the children and teachers drinking the bleach solution they were served. The bleach solution was served by the fired employee who was apparently unaware that the milk jug actually stored bleach solution. The executive director indicated that the jug was labeled, so this is apparently not an uncommon practice at the site. The question this raises is, why was an old milk jug used to store cleaning solution?

The American Association of Poison Control Centers (AAPCC) says: “DO NOT use food containers such as cups or bottles to store household and chemical products” and “Store food and household chemical products in separate areas. Mistaking one for the other could cause a serious poisoning.” Although the reused container was apparently labeled (though not clearly enough to avoid the mistake), it should never have been reused in the first place. As indicated by the AAPCC, reusing containers between food and cleaning supplies is just too big of a risk. It’s also worth noting that reusing a bottle that contained household chemicals for a different household chemical is another no-no: “Never mix household chemical products together. Mixing chemicals could cause a poisonous gas.” Don’t run the risk at your workplace or home. Don’t reuse food containers for cleaning products or mix cleaning products.   Fortunately the children at this day care center got off without lasting damage in this case.

App Takes Down National Weather Service Website

By Kim Smiley

The National Weather Service (NWS) website was down for hours on August 25, 2014.  Emergency weather alerts such as tornado warnings were still disseminated through other channels, but this issue raises questions about the robustness of a vital website.

This issue can be analyzed by building a Cause Map, a visual format for performing a root cause analysis.  Cause Maps are built by laying out all the causes that contributed to a problem to show the cause-and-effect relationships.  The idea is to identify all the causes (plural), not just THE one root cause.

This example is a good illustration of the potential danger of focusing on a single root cause.  The NWS website outage was caused by an abusive Android app that bogged the site down with excessive traffic.  The app was designed to provide current weather information and it pulled data directly from the forecast.weather.gov website.  The app inadvertently queried the website thousands of times a second because of a programming error and the website was essentially overwhelmed.  It was similar to the denial of service attacks that have been directed at websites such as Bank of America and Citigroup, but the spike in traffic in this case wasn’t deliberate.

It may be tempting to say that the app was the root cause. Or you could be more specific and say the programming error was the root cause.  But labeling either of these “the root cause” would imply that you solved the problem once you fix the software error. The root cause is gone, no more problem…right?  In order to address the issue, NWS installed a filter to block the excessive queries and worked with the app developer to ensure the error was fixed, but there are other factors that must be considered to effectively reduce the risk of a similar problem recurring.

One of the things that must be considered in this example is why a filter that blocked denial of service attacks wasn’t already in place.  Flooding a website with excessive traffic is a well-known strategy of hackers.  If an app could accidentally take the site down for hours, it is worrisome to consider what somebody with malicious intent could do.  The NWS is responsible for disseminating important safety information to the public and needs a reasonably robust website.  In order to reduce the impact of a similar issue in the future, the NWS needs to evaluate the protections it has in place for its website and see if any other safeguards should be implemented beyond the filter that addressed this specific issue.
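The details of the filter NWS installed have not been described here, so the following is only a hypothetical sketch of the general idea: a per-client sliding-window rate limiter that blocks a client once it exceeds a set number of requests in a time window. The class name, the limits and the client identifiers are all assumptions made for illustration.

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `max_requests` per client within `window_seconds`.

    Hypothetical illustration only; not the filter NWS actually deployed.
    """

    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        # client id -> timestamps of that client's recently allowed requests
        self._hits = defaultdict(deque)

    def allow(self, client_id, now):
        """Return True if a request arriving at time `now` should be served."""
        hits = self._hits[client_id]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # over the limit: block this request
        hits.append(now)
        return True


def simulate(limiter, client_id, request_count, start=0.0, spacing=0.001):
    """Send `request_count` requests `spacing` seconds apart; count how many pass."""
    return sum(
        limiter.allow(client_id, start + i * spacing) for i in range(request_count)
    )
```

With a limit of 10 requests per second, a client firing 1,000 requests in one second would have only its first 10 served, while a second client making an occasional request would be unaffected. A filter like this addresses one misbehaving client; it would not by itself protect against traffic spread across many sources, which is part of why the broader review of safeguards matters.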

If the investigation was focused too narrowly on a single root cause, the entire discussion of cyber security could be missed.  Building a Cause Map of many causes ensures that a wider variety of solutions are considered and that can lead to more effective risk prevention.

To view a high level Cause Map of this issue, click on “Download PDF” above.

Ice Bucket Challenge Ends in Serious Injuries

By Kim Smiley

In a terrible reminder that awful things can happen at any time, two firefighters were seriously injured helping Campbellsville University’s marching band raise money for amyotrophic lateral sclerosis (ALS) research by participating in the trendy ice bucket challenge.  If you ever log onto Facebook, you are probably already familiar with the concept, but in case you are not a social media fan, the idea is that friends tag each other to either donate $100 to an ALS-related charity or dump a bucket of ice water over their heads.  If you choose the ice bucket, you are supposed to take a video or photo as evidence and post it online.

Trying to create an entertaining video of the ice bucket dumping is part of the fun for many of the participants.  In order to make a memorable video to post on social media, the firefighters that were injured used a fire truck ladder to dump ice water on the band from above.  While on the ladder, the firefighters were near high voltage power lines (although they never actually touched the lines) and electricity arced out, injuring four firefighters.  Two firefighters were treated and released, but two were still hospitalized days later.  One was listed as stable, but the other was in critical condition.

This accident clearly illustrates that high voltage can be extremely dangerous even if you don’t touch the equipment. An arc flash can occur when a flashover of electric current leaves its intended path and travels through the air from one conductor to another or to the ground.  The closer a person is when an arc happens, the more dangerous it is.  Arcs are exceptionally hot and can cause very serious injuries and even death from several feet away when high voltage is in use.

The Public Service Commission stated that they will investigate the location to ensure that the power line had the correct clearance from the ground, trees and structures, but initial reports do not indicate any problems with the power poles.  Possible solutions that could be used to reduce the risk of a similar problem in the future are increased education on the risks of high voltage and ensuring that adequate warning signs are in place.

These have been the most dramatic injuries associated with the ice bucket challenge, but there are a slew of videos featuring buckets dropped on heads, slips and a variety of other unintended outcomes that look painful.  If you are considering doing the ice bucket challenge, please remember that a gallon of water weighs over 8 pounds.  A five gallon bucket filled with water is pretty heavy.  Think the plan through carefully before you ask somebody to dump water on you off a balcony because it may end badly.
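To put a number on "pretty heavy," here is a quick back-of-the-envelope calculation. It assumes a US gallon of water weighs roughly 8.34 pounds and ignores the weight of the bucket and the ice itself.

```python
# Approximate weight of one US gallon of water at room temperature (assumed value).
POUNDS_PER_US_GALLON = 8.34


def water_weight_lb(gallons):
    """Return the approximate weight in pounds of `gallons` of water."""
    return gallons * POUNDS_PER_US_GALLON


print(f"{water_weight_lb(5):.1f} lb")  # prints "41.7 lb"
```

That is more than 40 pounds of water before counting the bucket, which is a lot to drop on somebody from any height.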

Freight Trains Collide Head-On in Arkansas

By Kim Smiley

On August 17, 2014, two freight trains collided head-on in Arkansas, killing two and injuring two more.  The accident resulted in a fire after alcohol spilled from a damaged rail car ignited, prompting evacuation of about 500 people from nearby homes.  The trains were carrying toxic chemicals, but none of the cars carrying the toxic chemicals are believed to have been breached during the accident.

The National Transportation Safety Board (NTSB) is currently investigating this accident, but an initial Cause Map, or visual root cause analysis, can still be built to help document and illustrate the information that is known.  One of the benefits of a Cause Map is that it can easily be expanded to incorporate information as it becomes available.  The first step of the Cause Mapping process is to fill in an Outline with the basic information for an incident.  In addition, anything that was different at the time of accident is listed.  How the incident impacts the overall goals is also documented on the bottom of the Outline.

Like many incidents, there are a number of goals that were impacted by this train collision.  The safety goal is obviously impacted by two fatalities and injuries.  The property goal is impacted because of the significant damage to the trains and freight.  The labor/time goal is impacted because of the response effort and investigation that are required as a result of the accident. Potential impacts or near misses should also be documented so the potential release of toxic chemicals is considered an impact to the environmental goal.

The second step is to perform the analysis by building the Cause Map.  To build the Cause Map, start with one impacted goal and ask “why” questions.  Each answer is added to the Cause Map.  Each impacted goal should be considered and the cause boxes should all connect at some location on the Cause Map.  Starting with the safety goal in this example, the first question would be: why were two people killed?  This occurred because there was a train collision.  The trains collided because they were traveling toward each other on the same track.  No details have been released about how the trains ended up on the same track.  The trains’ data recorders (which provide information about the trains’ speed, braking and throttle) have been found and will be analyzed by investigators. The NTSB has stated that it will be looking into a number of factors, such as the train signals and crew fatigue, since the accident occurred late at night.

The final step in the Cause Mapping process is to develop solutions that can be implemented to reduce the risk of a similar problem recurring in the future.  Since the investigation is ongoing, talk of solutions is premature at this point.  Once more is known about the causes that contributed to this issue, the lessons that are learned can be used to develop solutions.

Software Glitch Delays U.S. Travel Documents

By Kim Smiley

The Consular Consolidated Database (CCD) is the global database used by the U.S. State Department to process visas and other travel documents.  On July 20, 2014, the CCD experienced software issues and had to be taken offline.  The outage lasted several days, with the CCD being returned to service with limited capacity on July 23.  The CCD is huge, one of the largest Oracle-based data warehouses in the world, and is used to process a hefty number of visas each year, so the effects of the software glitch have been felt worldwide.  The State Department processed over 9 million immigrant and non-immigrant visas overseas in 2013, so a delay of even a few days means a significant backlog.

This issue can be analyzed by building a Cause Map, a visual root cause analysis.  A Cause Map visually lays out the different causes that contribute to an issue so that the problem is better understood and a wider range of solutions can be considered.  The first step in the Cause Mapping process is to define the problem, which includes documenting the overall impacts to the goals.  Most problems impact more than one goal and this example is no exception.

The customer service goal is clearly impacted because thousands – and potentially even millions – have had their travel document processing delayed.  The negative publicity can also be considered an impact to the customer service goal because this software glitch isn’t doing the international image of the U.S. any favors.  The delay in travel document services is an impact to the production/schedule goal and the recovery effort and investigation into the problems impact the labor/time goal.  Additionally, there are potential economic impacts to both individuals who may have had to change travel plans and to the U.S. economy because these issues may discourage international tourism.

The next step in the Cause Mapping method is to build the Cause Map.  This is done by asking “why” questions and using the answer to visually lay out the cause-and-effect relationships.  The delay in processing travel documents occurred because the CCD is needed to process them and the CCD had to be taken offline as a result of software issues.  Why were there issues with the database? Maintenance was done on the CCD on July 20 and the performance issues began shortly thereafter.  The maintenance was done to improve system performance and to fix previous intermittent performance issues. The State Department has stated that this was not a terrorist act or anything more malicious than a software glitch.  An investigation is currently underway to determine exactly what caused the software glitch, but the details have not been released at this time.  It can be assumed that the test program for the software was inadequate since the glitch wasn’t identified prior to implementation.

The final step in the Cause Mapping process is to identify solutions that can be implemented to reduce the risk of a problem recurring.  Details of exactly what was done to deal with the issue in the short term and bring the CCD back online aren’t available, but the State Department has stated that additional servers were added to increase capacity and improve response time.  There is also a plan to improve the CCD in the longer term by upgrading to a newer version of the Oracle database software by the end of the year which will hopefully prove more stable.

To view an Outline and high level Cause Map of this issue, click on “Download PDF” above.

Extensive Fire on USS George Washington Placed Crew at Risk

By ThinkReliability Staff

When fire broke out in 2008 on aircraft carrier USS George Washington in an unmanned space that was being used to improperly store flammable materials, it took more than 8 hours to find the source of, and extinguish, the fire. In the Navy’s investigation report, Admiral Robert F. Willard, commander of the US Pacific Fleet, stated “It is apparent from this extensive study that there were numerous processes and procedures related to fire prevention and readiness and training that were not properly functioning. The extent of damage could have been reduced had numerous longstanding firefighting and firefighting management deficiencies been corrected.”

The processes and procedures that were implicated in the investigation of the fire can be examined in a Cause Map, or a visual root cause analysis. This process begins by identifying the goals impacted. In this case, the primary goal impacted was the safety goal. Thirty-seven sailors were injured; one was seriously burned. There were no fatalities. In addition, the damage to the ship was estimated at $70 million and left the ship unusable for 3 months.

Beginning with the impacted safety goal, asking ‘Why’ questions allows us to develop the cause-and-effect relationships that led to those impacted goals. In this case, the injuries to sailors resulted from the extensive fire aboard ship. In addition, some of the affected sailors (including the sailor who was seriously burned) did not have adequate protective clothing. Specifically, liners worn underneath firefighting gear were not available in one repair locker because they were being laundered. Both the fire and the inadequate protective gear were causally related to the injuries so they are both included on the Cause Map and joined with ‘and’.

Asking additional ‘why’ questions adds more detail to the Cause Map. When investigating a fire, it’s important to include the factors that resulted in the initiation of the fire (heat, fuel and oxygen) as well as those that allowed the fire to spread. In this case, the ignition (or heat) source was believed to be a cigarette butt. On-scene evidence showed that smoking was occurring in the area, against regulation. The ship was found to have inadequate training regarding the smoking policy and inadequate control over the locations where smoking was occurring, because regular zone inspections were not being held.

The initial fuel source was determined to be refrigerant oil and other flammable materials improperly stored in the unmanned space where the fire began. The oil was not turned in as required by procedure because of a concern about the difficulty of retrieving it. Because the oil was never entered into the inventory control system, the storage discrepancy was not noted. The unmanned space in which it was stored was not inspected. Unmanned spaces were not included in zone inspections and the area had not been designated as a tank or void to be identified in the void and tank inspection.

Once a fire breaks out, the speed with which the source is found and extinguished has the most impact on the safety of personnel. In this case, the source of the fire was not found for eight hours. Not only did the fire begin in an unmanned area, but the drawings showing the layout of the ship were also inaccurate, because the ship was in the midst of alterations.

Developing the causes that resulted in the impacted goals allows for identification of all the processes and procedures that need to be re-examined to reduce the risk of recurrence. In this case, the report identified multiple processes and procedures that were re-evaluated in the wake of the disaster, including those for hazardous material storage, training, inspection and firefighting.

To learn more, click here to read the Navy investigation report. To view a one-page overview of the Outline and Cause Map, please click on “Download PDF” above.

Can a “Super Banana” Reduce Vitamin A Deficiency?

By Kim Smiley

Vitamin A deficiency is rare in developed countries, but it remains a major public health issue in more than half of all countries, especially in Africa and South-East Asia. Researchers at Queensland University of Technology have created a “super banana” genetically engineered to contain alpha- and beta-carotene that they hope will reduce vitamin A deficiency in parts of the world where bananas are a staple crop.

The problem of vitamin A deficiency can be analyzed using a Cause Map, a visual format for performing a root cause analysis. A Cause Map is built by determining how an issue impacts the overall goals and then asking “why” questions and laying out the answers visually to show the cause-and-effect relationships. In this example, the overall goal of public safety is impacted because vitamin A deficiency causes 650,000 – 700,000 deaths and results in blindness in 250,000-500,000 children annually. This occurs because the body, especially growing bodies, needs vitamin A to function properly and the diet does not contain adequate vitamin A.

Bodies use vitamin A in a number of ways. For example, vitamin A is important for healthy vision and a lack of it will result in blindness.  It has been shown to play an important role in the immune system. Diets in some regions of the world lack enough vitamin A because they are poor subsistence-farming communities that predominantly consume locally grown crops and the local crops don’t contain sufficient vitamin A.

A number of different approaches have been tried to reduce the occurrence of vitamin A deficiency, such as distribution of vitamins and introduction of new crops, but the deficiency is still a widespread problem, which led to the idea of genetically modifying local crops to be more nutritious. The idea behind the “super banana” is that it would look the same as other East African Highland bananas and grow in the same conditions, but would be enriched with additional nutrients. The inside of the “super bananas” is more orange than regular East African Highland bananas, but the outside looks the same.

Lab tests with gerbils have been successful and the first human trials of the modified bananas are scheduled to start this summer. If the human trials are successful, the next necessary step is for Uganda’s legislature to approve a bill allowing the crops to be grown. Researchers are hoping to have the modified bananas growing in Uganda by 2020 if the government approves the project.

To view a high level Cause Map, click on “Download PDF” above.

Fingertips Amputated After Slip on Ice

By ThinkReliability Staff

Information on a slip that severely injured an electrical contractor in Newcastle in August 2013 was recently released by Great Britain’s Health and Safety Executive (HSE). Though this incident didn’t make the front pages of the newspaper, it is representative of many of the injury investigations which we facilitate using the Cause Mapping method.

The first step in the Cause Mapping method of root cause analysis is to capture the what, when and where of the incident and the impacts to the organizational goals. In this case, the what (contractor slip and hand injury), when (August 30, 2013) and where (a moving conveyor at a baguette manufacturer in Leeds) are captured, as well as any differences and the task being performed at the time of the incident. There were two notable differences during the incident as compared to an “average” day that should also be noted: the safety guard had been removed from the conveyor and ice had accumulated on the floor. These differences may or may not be causally related to the incident. Additionally, the task being performed (cleaning up after contract electrical work) is captured as it, too, may be causally related to the incident.

The impacts to the goals are analogous to what stood in the way of a perfect day. A serious injury involving the partial amputation of two fingers and the injury of a third is an impact to the safety goal in this example. The £8,500 fine levied by the HSE is an impact to the regulatory goal. The worker had four weeks off work due to the injury, which is an impact to the labor goal. It is unclear if any other goals were impacted by this incident.

Once at least one impact to the goals has been determined, asking “why” questions helps us complete the second step, or analysis. In the analysis, we capture cause-and-effect relationships that map out the issues that led to the incident. In this case, the injury was caused by the contractor’s hand striking an unprotected drive chain on a moving conveyor. This occurred because the hand struck the area, the drive chain was unprotected, and the conveyor was moving. All three of these causes had to occur for the resulting injury.

The contractor’s hand struck the area because of a slip on an icy floor. Ice from an open freezer door (which appeared to be malfunctioning) had built up and had not been removed.   The drive chain was unprotected because the safety guard had been removed from the conveyor, which was moving likely due to normal operations.
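The "all three causes had to occur" relationship can be sketched in code. The following is only an illustrative model, not the Cause Mapping method's own tooling: each box is a node whose contributing causes are joined with AND, and the class name, method name and labels are assumptions paraphrased from the analysis above.

```python
from dataclasses import dataclass, field


@dataclass
class Cause:
    """One box on a cause map; the entries in `causes` are joined with AND."""

    label: str
    causes: list = field(default_factory=list)

    def occurs(self, prevented):
        """Return True if this effect still occurs when the labels in `prevented` are removed."""
        if self.label in prevented:
            return False
        # A box with no listed causes occurs unless prevented;
        # an effect occurs only if ALL of its causes occur.
        return all(cause.occurs(prevented) for cause in self.causes)


def build_conveyor_map():
    """Build a small map of the conveyor injury described above."""
    return Cause("finger injury", [
        Cause("hand struck drive chain area", [Cause("slip on icy floor")]),
        Cause("drive chain unprotected", [Cause("safety guard removed")]),
        Cause("conveyor moving"),
    ])
```

Calling `build_conveyor_map().occurs({"safety guard removed"})` returns False, as does preventing the slip or stopping the conveyor. Because the causes are joined with AND, breaking any one branch prevents the injury in this model, which is why laying out every cause opens up more than one possible solution.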

According to Shuna Rank, the HSE inspector, “This worker’s injuries should not and need not have happened. This incident was easily preventable had Country Style Foods Ltd ensured safety guards were in place on the machinery. The company should also have taken steps to prevent the accumulation of ice on the freezer floor. Guards and safety systems are there for a reason, and companies have a legal duty of care to ensure they are properly fitted and working effectively at all times. Slips and trips are the biggest cause of major injuries in the food and drink industry with 37% of all major accidents in the industry being as a result of slips.”

The inspector’s quote clearly identifies the areas for improvement that could reduce the risk of similar incidents occurring. Namely, the manufacturer must ensure that damage resulting in ice buildup is fixed as soon as possible and that in the meantime, ice is regularly cleared away and the area is marked as a slip hazard. If a safety guard is removed for any reason, the conveyor should not be operated until the guard has been replaced properly. Ensuring that equipment is in proper working order is essential to reducing the risk of worker injuries like those in this case.

To view the Outline and Cause Map, please click “Download PDF” above. Or click here to read more.