Two DC Metro Workers Killed

By Kim Smiley

On January 26, 2010, just before 2 a.m., two Metro workers were killed near the Rockville Metro station. They were crushed by a Metro utility vehicle while working on the track to install safety equipment.

The utility vehicle is a gas-powered truck designed to operate on the track when electricity is shut off. Such trucks are called high-rail vehicles and are typically used to carry equipment. At the time of the accident, the vehicle was placing devices that alert approaching trains to a work crew in the area.

Many details of this accident are not available yet, but a preliminary root cause analysis can be started.  The basic information can be documented in an Outline and an initial cause map can be started.  Click on the “Download PDF” button above to see what this would look like.

The men killed and the workers in the vehicle were not part of the same crew, and it’s not clear why the driver of the truck wasn’t aware that workers were in the area. At the time of the accident, the vehicle was traveling in reverse, which is a routine mode of operation.

Safety regulations require all vehicle operators to be informed about work crew locations, but it isn’t clear whether that is being done effectively.

The National Transportation Safety Board (NTSB) has begun investigating this incident, and more details should become available as the investigation progresses. The NTSB is currently reviewing employee work history and training and gathering all relevant data, such as radio recordings and work procedures.

The DC Metro system has the worst safety record of any metro system in the country. Five workers have now been killed while on the tracks in the last seven months. There was also a Metro train accident that killed nine people on June 22, 2009. To see a cause map of the June accident, click here.

Tragedy in Bhopal

By ThinkReliability Staff

While researching the tragedy in Bhopal, India, I discovered that there are two theories about what occurred on December 3, 1984, that resulted in a tremendous loss of life. The first theory comes from a report by an engineering consulting firm hired by Union Carbide (the company that owned the plant in question), which concluded that the release was caused by sabotage. The second theory is that a combination of inexperienced, ineffective workers and a badly maintained plant with inadequate safety standards, which was being readied for dismantling, produced a catastrophic chain of events in which anything that could go wrong, did. For completeness, I have included both in my final Cause Map (which you can see by clicking “Download PDF” above). But for now, I’d just like to focus on the second.

In the early morning hours of December 3, 1984, over 40 tons of methyl isocyanate (MIC) were released over the community of Bhopal, India, which had a population of 900,000. (The amount is also debated, but 40 tons is the figure cited in the largest number of references.) Partly because of the transient nature of the population, and partly because of the general obfuscation of data by all sources involved, estimates of the number killed range from 2,000 to 15,000. The 2003 annual report of the Madhya Pradesh Gas Relief and Rehabilitation Department stated that a total of 15,248 people had died as a result of the gas leak. Based on claims accepted by the Indian government, there were at least 500,000 injured. This led to what has been called “The World’s Largest Lawsuit”, which I assume refers to the number of people represented, and certainly not the monetary amount of the settlement, which was a paltry $470 million. After a series of legal maneuvers following the accident, the plant was abandoned. Extensive cleanup was required, and still has not been completed. The impacts to the goals are shown in the outline on the downloadable PDF.

The deaths and environmental impact were caused by the release of over 40 tons of MIC. The release occurred when a large volume of MIC was put through an ineffective protection system. The release lasted several hours because workers were unable to stop it and because the warning system was ineffective. The MIC reached the protection system when a disk and valve leading to it burst due to an increase in pressure. The increase in pressure was caused by an increase in temperature, which resulted from a reaction between MIC and water while the refrigeration system was shut down. There were 41 metric tons of MIC in the tank, stored for use in the plant. How the water was introduced is the point of debate between the two theories I mentioned above. Either way, water got into the tank, whether by sabotage or by leaking through a vent line. We will probably never know exactly what happened. But we do know that ineffective safety systems can result in a massive loss of life, as happened here.

Today in History: Fire on the USS Enterprise

By ThinkReliability Staff

On January 13, 1969, 41 years ago, fires and explosions broke out on the USS Enterprise (CVN-65). The crewmembers spent three hours fighting the fire. When the smoke cleared, 27 crewmembers had been killed and 314 injured. Additionally, 15 aircraft were destroyed and the carrier was severely damaged.

We can address the impacts to the U.S. Navy’s goals in a problem outline as the first step of the Cause Mapping process. There was an impact to the safety goal because crewmembers were killed and injured. There was an impact to the property goal because of the 15 planes that were destroyed and the repairs that were required to the ship. (The repairs are also an impact to the labor goal, because of the labor they required.) Additionally, the ship’s deployment was delayed, which is an impact to both the customer service and production/schedule goals.

After we’ve completed the outline, we build our Cause Map beginning with the goals that were impacted. The goals were impacted by a series of explosions and fires across the ship. These explosions and fires were fueled by jet fuel and bombs on the planes on the flight deck of the carrier. The initiating event was the explosion of a Mk-32 Zuni rocket, which overheated after being placed in the exhaust path of an aircraft starting unit.
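For readers who like to see the structure spelled out, the cause-and-effect chain described above can be sketched as a small directed graph. This is only an illustrative sketch, not the format of the downloadable Cause Map: the `CauseMap` class, its method names, and the wording of each box are hypothetical, and only the causes named in this post are included.

```python
class CauseMap:
    """Hypothetical sketch: each effect maps to the causes that produced it."""

    def __init__(self):
        self._causes = {}

    def add(self, effect, *causes):
        # Record one or more causes for an effect.
        self._causes.setdefault(effect, []).extend(causes)

    def causes_of(self, effect):
        # Direct causes only (one step back on the map).
        return list(self._causes.get(effect, []))

    def leaf_causes(self, effect):
        # Walk back from an effect to the causes that have no
        # recorded cause of their own (the current edge of the map).
        leaves, stack, seen = [], [effect], set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            causes = self._causes.get(node, [])
            if not causes and node != effect:
                leaves.append(node)
            stack.extend(causes)
        return leaves


# Start with the impacted goals and work backward, as in the text.
cause_map = CauseMap()
cause_map.add("Safety goal impacted", "Explosions and fires across the ship")
cause_map.add("Property goal impacted", "Explosions and fires across the ship")
cause_map.add(
    "Explosions and fires across the ship",
    "Jet fuel and bombs on planes on the flight deck",
    "Zuni rocket exploded",
)
cause_map.add("Zuni rocket exploded", "Rocket overheated")
cause_map.add("Rocket overheated", "Rocket placed in exhaust path of starting unit")

for cause in cause_map.leaf_causes("Safety goal impacted"):
    print(cause)
```

Asking "why?" of any leaf printed here is how the map would be extended as the investigation adds detail.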

After the incident, the Navy performed an investigation to review the causes of the incident, and made changes to improve safety. Repairs to the Enterprise were completed, and the ship is now the oldest active serving ship in the U.S. Navy.

A thorough root cause analysis built as a Cause Map can capture all of the causes in a simple, intuitive format that fits on one page. To view the downloadable PDF, click “Download PDF” above.

More on the Disappearance of Flight 188

By ThinkReliability Staff

In our previous blog about Flight 188 of Northwest Airlines, we discussed the first step of a root cause analysis investigation – defining the problem – and mentioned that a detailed Cause Map could be developed when more information regarding the incident was released.

The National Transportation Safety Board (NTSB) has recently released a report on what exactly happened to the flight. We can build off of the outline we already developed to put together the Cause Map, or visual root cause analysis.

First we begin with the impacts to the goals. Most importantly, the safety and property goals were impacted due to the potential danger to the flight. This was caused by the plane overshooting the destination. The pilots flew over the destination because they were distracted, warnings were not effectively delivered to them, and they couldn’t see their destination (Minneapolis-St. Paul), since it was after dark and cloudy.

The pilots were distracted by a non-operational activity. The two pilots were using the scheduling software on their laptops, both of which were open in the cockpit (possibly blocking some of the flight display). Both using personal laptops and participating in non-operational activities are prohibited by the airline.

Some may ask how it’s possible that two pilots who were flying a plane (with over a hundred passengers) could be spending all their energy on another activity. Well, the pilots did not actually have any active tasks to perform to fly the plane. The plane was on auto-pilot, and the one task that pilots ordinarily did on a regular basis (which would certainly have alerted the pilots to their position) was sending a position report. However, a dispatcher for the airline had asked the pilots NOT to send a report, as the reports were burdensome and unnecessary.

Warnings did not effectively get through to the pilots by sight or by sound. By sight, either the flight display was physically blocked by the laptops or the pilots weren’t looking at it because they were distracted. By sound, the plane was not equipped to send audible messages (such as chimes or buzzers) to the pilots, text messages sent to them were not acknowledged, and the pilots did not hear calls for them on the radio. The air traffic controllers (who were different from the air traffic controllers who had first had contact with the plane) did not know which frequency the plane was on, so only some messages got through. Because the pilots were using the speaker instead of headsets and were, again, distracted, they missed the messages.

Both of the pilots involved had their licenses revoked. Several procedures were not followed in this instance, and the FAA and individual airlines are working on highlighting the importance of these procedures. Reading about this incident (and seeing that the pilots’ licenses were revoked) will probably do much to highlight the importance of the procedures. Luckily, nobody was hurt for this lesson to be learned.

View the root cause analysis investigation by clicking “Download PDF” above.