Tag Archives: Cause Mapping

More Info about Deadly Mine Explosion

By Kim Smiley

Around 3 pm on April 5, 2010 in Montcoal, West Virginia, a huge explosion rocked the Upper Big Branch South mine, killing 29 (Click here to read previous blog on the topic).  The toxic gas concentration in the mine remained so high after the accident that Mine Safety and Health Administration (MSHA) investigators were not able to enter the mine for more than two months.  The final report is still two to three months away, but the MSHA has developed a working theory on what caused the mine explosion.

According to a recent NPR article, investigators believe they have found the source of the spark that started the chain of events that led to the massive mine explosion.  A longwall mining machine was in operation inside the mine, creating sparks as it ate through both coal and sandstone.  Sparking may have been worse than usual because investigators found that the carbide-tipped teeth on the machine were worn down so that bare metal was contacting the stone and coal.

Sparks are expected during these types of operations, so a water sprayer system is typically used to prevent explosions from occurring, but investigators found the water system in Upper Big Branch was not functioning properly.  Additionally, a properly functioning water spray system would help control the amount of coal dust in the air.  Coal dust is an accelerant, which means it will contribute to an explosion if ignited.

Another cause of this accident is the level of methane gas in the environment.  The Upper Big Branch South mine is a particularly gassy mine that naturally emitted high levels of methane gas.  There are still some open questions about the role ventilation may have played in the accident.

Small ignitions of methane gas are not uncommon in coal mines, but large explosions are rare.  According to data collected by Mine Safety and Health News, about 600 ignitions have occurred in the past 10 years without any major mine explosions occurring.

Coal mining involves managing a tricky combination of coal dust, methane and sparks.  Usually, no one gets hurt, but in this case the mixture resulted in a massive explosion that traveled more than two miles inside the mine and claimed the lives of 29.  Performing a thorough root cause analysis can help investigators understand what was different in this case and hopefully help the lessons learned be applied to other mines.

As more information becomes available, the Cause Map can be expanded to include all relevant details.  Click “Download PDF” above to view the intermediate level Cause Map for this example.

Why Don’t All School Buses Have Seat Belts?

By Kim Smiley

Nearly every state in the US has a law requiring seat belts to be worn in cars. The lone state that doesn’t require adults to wear seat belts, New Hampshire, still has a law requiring children under 18 to wear seat belts.

Currently, only 6 states require seat belts in school buses.  The federal government does not require seat belts to be installed in buses weighing over 10,000 lbs.  The regular school buses that make up 80 percent of the buses in this country exceed this weight limit and most do not have seat belts.

So if seat belts are required by law in cars, why don’t all school buses have seat belts?

Like most engineering problems, this isn’t as simple a question as it first appears.  The main reason that seat belts aren’t required on all buses is that buses are fundamentally different from cars.

School buses are heavier and taller than cars.  During an accident, a passenger on a bus experiences less severe crash forces than an occupant of a passenger car.  The interior of a modern school bus is designed to protect passengers passively through something called compartmentalization.  The seats are strong, closely spaced, high-backed, and covered in 4-inch-thick foam to absorb energy.  The passenger is protected by the cushioned compartment created by the seats.

Buses are considered to be the safest form of ground transportation.  According to the National Highway Traffic Safety Administration, buses are approximately seven times safer than passenger cars or light trucks.

But would seat belts make them even safer?

This is subject to debate.  There are groups pushing for the federal government to require seat belts on all buses.  Others believe that the potential for misuse and incorrectly worn seat belts would actually result in a higher risk to safety if seat belts were installed.  There are also practical considerations like finding funding in cash strapped budgets to install seat belts and to buy the extra buses that would be necessary since fewer students can be accommodated on a bus with seat belts than one without.

There are few topics touchier than the safety of children and no clear-cut answers to the question of what constitutes a design that is safe enough.  When dealing with a problem like this, where emotions might run high, it could be useful to document all information in a Cause Map.  A Cause Map is a visual root cause analysis that incorporates the information associated with an issue in an easy-to-read format.  All pertinent evidence and facts associated with the topic can be recorded.  Having the same facts available to all invested parties can help keep the discussion productive and uncover the best solutions.

To learn more about school bus safety, please visit the National Transportation Safety Board website and National Highway Traffic Safety Administration website.

Metrodome Collapsed

By Kim Smiley

At about 5 am on Sunday, December 12, 2010, the roof of the Metrodome collapsed under the weight of snow accumulated during the heaviest snowstorm in almost two decades.  According to the National Weather Service, Minneapolis received a whopping 17.1 inches of snow between Friday and Saturday night.

The Metrodome is home to the Minnesota Vikings and its collapse set off a multicity scramble as the NFL worked to reschedule the Monday night game between the Vikings and the Giants that was planned to take place in the Metrodome on December 13.  After considering all the options, the game was moved to Detroit.  (Ironically, this was the first Monday night game played in Detroit in a decade because of the Detroit Lions’ abysmal record.)

Despite some early optimism, the latest update is that repairs will not be completed until March. The damage to the Metrodome moved the last two games of the Vikings’ season and will impact the schedule of about 300 college baseball games along with many other events planned in the venue.  In addition to the massive schedule impact, the cost associated with the repairs will be significant.

Why did this happen?

A Cause Map can be started using the information that is known.  To build a Cause Map, begin with the impacted goals and add Causes by asking why questions.  In this case, the impacted goals considered are the Production-Schedule goal and the Safety goal.  Fortunately, there were no injuries during the collapse, but the impact to this goal is included because of the potential for injuries if the Metrodome collapsed while occupied.  Click on the “Download PDF” button above to see the initial Cause Map built for this example.

The Metrodome design includes an inflatable dome to protect the venue from the harsh Minnesota winters.  The massive amount of snow accumulation on the dome after the severe storm exceeded the capacity of the dome to stay inflated.  The dome is made of two layers of materials (the outside layer is Teflon coated fiberglass and the inner layer is made from a proprietary acoustical fabric) and air is constantly pumped into the space between the layers to keep it inflated.  The massive weight of the snow tore the roof in several places and it collapsed.

The high winds that accompanied the snowfall were also one of the causes contributing to this accident.  When there are heavy snowfalls, workers typically climb on the roof of the Metrodome and use steam and high-powered hot water hoses to melt snow and limit accumulation.  In this case, the strong winds made it unsafe for workers to access the roof.  Additionally, the other measures used to prevent accumulation were inadequate.  These measures include pumping hot air into the dome and heating the stadium to about 80 degrees to help melt snow.

To view a video of the Metrodome collapsing from inside the dome, click here.

Printing Issues with New $100 Bill

By ThinkReliability Staff

In October, the U.S. government discovered that some of the newly redesigned $100 bills were coming off the printing presses with blank spots caused by creases in the paper.  The problem appeared at both Bureau of Engraving and Printing sites: Washington, D.C. and Fort Worth, Texas.  The government has recently announced that this will delay the introduction of these bills, which had been planned for the spring of 2011.

Additionally, the bills that have blank spots will have to be shredded and reprinted.  Because of complex new security features aimed at deterring counterfeiters (such as a 3-D security strip woven into the paper), the bills cost $0.12 each to print.  Hundreds of millions of bills have been printed, so the cost of this issue could run into the millions of dollars.
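To give a rough sense of scale, the short Python sketch below multiplies the $0.12 per-bill printing cost by a few hypothetical counts of bills needing a reprint.  The counts are assumptions chosen only for illustration; the actual number of flawed bills is not stated here.

```python
from decimal import Decimal

# Printing cost per bill, as stated in the text.
COST_PER_BILL = Decimal("0.12")


def reprint_cost(bills_to_reprint: int) -> Decimal:
    """Return the cost in dollars of printing the given number of bills again."""
    return COST_PER_BILL * bills_to_reprint


# Hypothetical counts of flawed bills (assumptions, not reported figures).
for count in (10_000_000, 100_000_000, 500_000_000):
    print(f"{count:>12,} bills -> ${reprint_cost(count):,.2f}")
```

Even if only a fraction of the bills printed so far are affected, the reprinting cost alone lands in the millions of dollars, before counting shredding, inspection, or the delay itself.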

Although issues with currency are expensive, they’re also rare.  The last time that a printing issue caused a delay in the introduction of a new bill was 1987.  It’s unclear at this point when the bills will finally be released.

It’s also unclear what happened to cause the paper to crease, creating blank spots from printing.  The additional complexity of this bill with the additional security features is being looked at, as are issues with the paper and the printing machines.  However, because similar errors occurred at both printing sites, it’s unlikely that there is a specific issue with just one site’s machines.  Although the investigation into what caused the blank spots is ongoing, we can begin a root cause analysis with what is currently known.  Once more information is discovered, the Cause Map can be updated.

Because of the high potential financial losses from this issue, the eventual investigation will likely go into great detail, and fully determining what happened will take some time.  The Cause Map and outline for the information known now can be viewed by clicking “Download PDF” above.

Space Shuttle Launch Delayed

By ThinkReliability Staff

Launching a space shuttle is a complicated process (as we discussed in last week’s blog).  Not only is the launching process complex, finding an acceptable date for launch is also complex.  This was demonstrated this week as the shuttle launch was delayed four times, for four separate issues and now will not be able to happen until the end of the month, at the earliest.

There are discrete windows during which a launch to the International Space Station (which is the destination of this mission) can occur.  At some times, the solar angles at the International Space Station would cause the shuttle to overheat while docked there.  The launch windows are open only when the angles are such that overheating will not occur.

The previous launch window was open until November 5th.  The launch was delayed November 1st for helium and nitrogen leaks, November 2nd for a circuit glitch, November 4th for weather, and November 5th for a gaseous hydrogen leak.  After the November 5th delay, crews discovered a crack in the insulating foam, necessitating repairs before the launch.  These delays pushed the shuttle launch out of the available November launch window.  The next launch window is from December 1st through 5th, which gives the shuttle experts slightly less than a month to prepare for launch.  If that window is missed, the mission may be delayed until next year.
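The timeline above can be captured in a few lines of Python.  This is only a sketch: the year 2010 is an assumption (the dates above give month and day only), and the names `DELAYS`, `DECEMBER_WINDOW`, and `in_window` are made up for this example.

```python
from datetime import date

YEAR = 2010  # assumption: the text gives month and day only

# The next launch window as an inclusive (open, close) pair, from the text.
DECEMBER_WINDOW = (date(YEAR, 12, 1), date(YEAR, 12, 5))

# Delays listed in the text: (date, reason).
DELAYS = [
    (date(YEAR, 11, 1), "helium and nitrogen leaks"),
    (date(YEAR, 11, 2), "circuit glitch"),
    (date(YEAR, 11, 4), "weather"),
    (date(YEAR, 11, 5), "gaseous hydrogen leak"),
]


def in_window(day: date, window: tuple[date, date]) -> bool:
    """Return True if the day falls inside the inclusive launch window."""
    start, end = window
    return start <= day <= end


# Time available between the last delay and the opening of the next window.
last_delay = max(day for day, _ in DELAYS)
days_to_prepare = (DECEMBER_WINDOW[0] - last_delay).days

print(f"Delays recorded: {len(DELAYS)}")
print(f"Days to prepare before the next window opens: {days_to_prepare}")
```

Keeping the delays as dated entries makes it easy to append the eventual launch date once it is known.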

Although not a lot of information has been released about the specific issues that have delayed the launches, we can put what we do know into a Cause Map.  A thorough root cause analysis built as a Cause Map can capture all of the causes in a simple, intuitive format that fits on one page.  Once more information is released about the specifics of the issues that delayed the launch, more detail can easily be added to the Cause Map to capture all the causes for the delay.  Additionally, the timeline can be updated to reflect the date of the eventual launch.

To view the problem outline, Cause Map, and launch timeline, please click on “Download PDF” above.

Mine Deaths in China

By ThinkReliability Staff

Following the successful rescue of all 33 miners trapped in a Chilean mine is some unhappy mine news from China.  A gas blast on October 16, 2010 in the early morning is known to have killed 26 miners, and the 11 miners unaccounted for are believed dead.   In addition to these impacts to the safety goals, the environmental goal is impacted by the extremely high levels of methane gas, the customer service and production goals are impacted by the closure of the mine, and the property and labor goals are impacted by the rescue efforts that have been required.  Unfortunately this is not an uncommon occurrence.  It is estimated that 2,600 people were killed in Chinese mine accidents last year.

It is believed that most of the miners were killed by suffocation.  In addition to the lack of oxygen from the extremely high levels of methane (40% compared to the normal level of 1%), the miners were buried by coal dust released by the gas blast.  The miners were trapped in the mine by the gas blast, the cause of which is as yet unknown.  This is a question that additional investigation will try to answer.  Additionally, more information is needed about the high levels of methane.  The rescuers had difficulty reducing the levels of methane because coal dust was blocking an access shaft, but levels were high prior to the blast, for reasons that are unclear.

More detail can be added to this Cause Map as the analysis continues.  As with any investigation, the level of detail in the analysis is based on the impact of the incident on the organization’s overall goals.  Because of the high number of deaths (and the high frequency of this type of incident), the Cause Map should end up very detailed so that it offers as many candidate solutions as possible and the best of them can be implemented to reduce these types of incidents.

Toxic Red Sludge Spill

By Kim Smiley

On Monday, October 4, 2010, a massive wave of red sludge flooded into four villages near Kolontár, Hungary when a storage reservoir burst.  Four were killed and at least 150 have needed medical treatment for their injuries.  The most common injuries reported are burns and eye ailments.

Red sludge is a highly caustic material that is produced during the aluminum manufacturing process.  Reports indicate that the sludge had a pH of 13 while stored in the reservoir.  All life has been killed in a 25 mile stretch of river and 16 square miles of land have been covered by the pollution.  Best estimates are that 158 million to 184 million gallons of sludge were released.  This is the first large-scale release of red sludge in history.

Hungary’s top investigative agency is looking into the accident, but the cause for the reservoir barrier failure is not known at this time.

Even with the unknowns, a root cause analysis can be started by creating a Cause Map and documenting all available information.  Any new information can easily be incorporated into the existing Cause Map.

To build a Cause Map, we start with the impacted goals and ask “why” questions.  In this example, the two goals we will consider are the Safety goal and the Environmental goal.  Starting with the Safety goal we begin by asking – Why were people injured?  They were injured because they were exposed to caustic material because red sludge flooded into their villages.  Why?  Because red sludge was stored in a nearby reservoir and the barrier on the reservoir was breached.
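The chain of “why” questions above can be sketched as a small cause-and-effect graph.  The Python below is a minimal illustration only: the `CauseMap` class and its `add` and `why` methods are hypothetical names invented for this example and are not part of any Cause Mapping tool.

```python
from collections import defaultdict


class CauseMap:
    """Minimal cause-effect graph: each effect maps to the causes behind it."""

    def __init__(self):
        self.causes = defaultdict(list)

    def add(self, effect, cause):
        """Record that `cause` is one answer to 'why did `effect` happen?'."""
        self.causes[effect].append(cause)

    def why(self, effect, depth=0):
        """Return indented lines walking from an effect back through its causes.

        Assumes the map has no cycles, which holds for this simple example.
        """
        lines = ["  " * depth + effect]
        for cause in self.causes.get(effect, []):
            lines.extend(self.why(cause, depth + 1))
        return lines


# Causes taken from the Safety goal walk-through in the text.
cause_map = CauseMap()
cause_map.add("Safety goal impacted", "People injured")
cause_map.add("People injured", "Exposed to caustic material")
cause_map.add("Exposed to caustic material", "Red sludge flooded villages")
cause_map.add("Red sludge flooded villages", "Red sludge stored in nearby reservoir")
cause_map.add("Red sludge flooded villages", "Reservoir barrier breached")
cause_map.add("Reservoir barrier breached", "Cause unknown (under investigation)")

print("\n".join(cause_map.why("Safety goal impacted")))
```

Because an effect can have several causes, the flooding step branches into two: the sludge was stored nearby and the barrier was breached.  Marking the unknown as its own box leaves an obvious place to attach findings when the investigation reports them.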

Why the barrier failed isn’t known, but we can still add additional information that might be useful.  We know that the red sludge reservoir was near the villages and a little research reveals that this is common practice in the region and that there are a number of similar pools nearby.  This information may become relevant if the investigation determines that the other reservoirs are at risk for a similar failure so it’s worth recording on our Cause Map at this point. There is also information available about the environmental impact that can be added.

The investigation is still incomplete, but the Cause Map can grow as more information becomes available.  Once the relevant information is added, the Cause Map can be used to develop solutions to help prevent similar accidents from occurring in the future.

Dig Deeper to get to the Causes of the Oil Spill

By ThinkReliability Staff

On Sunday (September 26th, 2010) the lead investigator for the Deepwater Horizon oil spill was questioned by a National Academy of Engineering committee.  The committee brought up concerns that the investigation that had been performed was not adequate to address all the causes of the spill.  Said the lead oil spill investigator: “It is clear that you could go further into the analysis . . . this does not represent a complete penetration into potentially deeper issues.”

Specifically, the committee was concerned that the study focused on decisions made on the rig (generally by personnel who worked for other companies) but did not adequately consider input from these companies.  The study also avoided organizational issues that may have contributed to the spill.

In circumstances such as this one, where an extremely complicated event requires an organization to spend most of its resources fixing the immediate problem, an interim report (which may not delve deeply into underlying organizational issues or obtain a full spectrum of interviews) may be appropriate.  However, it’s just an interim report and should not be treated as the final analysis of the causes relating to an issue.  The organizations involved need to ensure that after the immediate actions – stopping the spill, completing the cleanup, and compensating victims – are complete, an in-depth report commensurate with the impact of the issue is produced.

In instances such as these, causes relating to an incident need to be unearthed ruthlessly and distributed freely.  This is generally why a governmental organization will perform these in-depth reviews.  The personnel involved in the investigation must not be limited to only one organization, but rather all organizations that are involved in the incident.  Once action items that will improve safety and processes have been determined, they must be freely distributed to all other organizations participating in similar endeavors.  The alternative – to wait until similar disasters happen at other sites – is unacceptable.

Largest Egg Recall In US History

By Kim Smiley

Two Iowa farms have recently been at the center of the largest egg recall in US history.  Over half a billion eggs were recalled in August after more than 1,500 people were sickened by eggs tainted with salmonella.

How did this happen?  Where did the contamination come from?  How did tainted eggs make it onto supermarket shelves?

The investigation is still ongoing, but we can begin a root cause analysis of this problem by building a Cause Map.  A Cause Map provides a simple visual explanation of all the causes that were required to produce the incident.  A good place to start building a Cause Map is to identify the impacts to the organizational goals.  Causes are then added to the map by asking “why” questions.  (Click on the “Download PDF” button to view a Cause Map of this issue.)

In this example, we’ll consider the safety goal first.  The safety goal was impacted because roughly 1,500 people got sick after consuming eggs that were contaminated with salmonella.  Why did they eat contaminated eggs?  Contaminated eggs were eaten because they were sold.  Why?  Because the eggs were contaminated at some point and there was inadequate regulation to prevent them from being sold.

Investigators are still determining the exact source of the contamination, but there is significant information available that can be added to the Cause Map.  The eggs were contaminated with salmonella because the hens laying the eggs were contaminated. (This strain of bacteria can be found inside a chicken’s ovaries and is passed on to eggs.)  The exact source that contaminated the hens is still being determined, but testing by the FDA has determined that the hens were likely contaminated after arriving at the farms.  FDA investigators have found a number of sanitation violations, including rodents, which are known carriers of salmonella.  Salmonella is not passed from hen to hen, but is typically passed from rodent droppings to chickens.

As more information becomes available we can add to the Cause Map.  Hopefully, the investigation will result in solutions that can be applied and prevent this situation from occurring again.

A Serendipitous Solution

By Kim Smiley

Investigating the recent massive oil spill in the Gulf of Mexico is a tall order.  There are many contributing causes and a multitude of creative solutions are going to be needed to restore the environment.

During any investigation of this magnitude, there are guaranteed to be a few surprises.  And the Deepwater Horizon oil spill is no exception.

Scientists have discovered a previously unknown type of oil-eating bacteria feasting on oil from the spill.

This microbe differs from previously studied varieties because it doesn’t consume large quantities of oxygen along with the oil.  Oxygen consumption is a concern because oxygen is needed in the sea to support life.

This microbe also thrives in cold water temperatures associated with the deep ocean, which might explain why it hasn’t been seen before.  Some scientists are theorizing that the microbe adapted in the deep ocean to consume the oil that naturally seeped from the ocean floor.  Since the huge influx of oil to the water, the bacteria populations have exploded.

Scientists disagree over how much oil remains in the Gulf, but there is no doubt that less is better.

This serendipitous solution is a welcome addition to the cleanup efforts.  Obviously, there are many other solutions that will be needed, but anything that safely reduces the overall amount of oil is a positive development.  Hopefully, with some additional research this microbe could be a potential solution to future incidents.

When performing an investigation, the unexpected sometimes happens.  The better understood the problem is, the easier it is to adapt to any new information. The Cause Mapping method of root cause analysis is an effective way to organize all information needed during an investigation.  Clearly understanding the causes that contribute to an incident will allow an organization to adapt as new information becomes available and make sure that resources are used in the most efficient ways when implementing solutions.