Missed by 150 Miles?

By Kim Smiley

On October 21, 2009, Northwest Airlines Flight 188 left San Diego and overshot its intended destination, Minneapolis-St. Paul, by about 150 miles. Luckily, the incident resulted in a safe landing at the intended destination, but the circumstances surrounding the flight remain vague and unsettling.

One of the strangest facts to come out is that the plane lost contact with air-traffic controllers for one hour and 18 minutes. In the post-9/11 aviation environment, controllers are very sensitive to planes that stop responding to communications. The Federal Aviation Administration contacted military authorities about the possibility of terrorism. Fighter jets were ready to respond and prepared to intercept the plane if necessary.

So what happened?  How did the pilots overshoot the airport by such a significant amount without realizing their mistake?

Initial reports were that the pilots stated that they were in a heated discussion and simply lost situational awareness, but many aviation experts have stated it is unlikely that pilots would miss repeated hails for over an hour because of an argument.  Other reports have speculated that maybe both pilots fell asleep.

The most recent information to come out is that the pilots were using their laptops during the time they failed to respond to hails.  The pilots stated that they both were working on laptops and that they were discussing monthly flight crew scheduling.

Details concerning the overshoot are still being investigated, but an initial root cause analysis can be started to help document the investigation as it progresses.  This is what an Outline could look like at this stage:

A preliminary Cause Map can be started at this stage of an investigation. As more information becomes known, a detailed Cause Map can be built to document all the relevant information.

More data should be available soon. The Cockpit Voice Recorder and the Flight Data Recorder have both been sent to the National Transportation Safety Board for analysis, and interviews of all involved parties continue.

On October 28, it was announced that the FAA had revoked the licenses of the two pilots involved because they violated several federal regulations, including failing to comply with air traffic control instructions and operating carelessly and recklessly. There are currently no specific federal rules banning the use of laptops after a flight reaches 10,000 feet.

Genesis Spacecraft Crash

By Kim Smiley

The mission of the Genesis spacecraft was to collect the first samples of the solar wind and return the samples to earth to be analyzed. The goal was to provide fundamental data to help scientists determine the composition of the sun and learn more about the formation of our solar system.

Unfortunately, during descent on September 8, 2004, the Genesis crashed into the earth at high velocity. Its descent was slowed only by air resistance, and the collection capsule was damaged on impact.

What happened? What went wrong with the re-entry?

A root cause analysis can be performed to evaluate this incident. The investigation can be documented by building a Cause Map that collects all the information associated with the incident in a visual format that is easy to follow.

In this case, the main goal we’ll consider is the production goal. The production goal was impacted because the collection capsule was damaged, which had the potential to destroy all the physical data collected during the three-year mission.

The investigation can proceed by asking “why” questions and adding the causes to the Cause Map. In this scenario, the collection capsule was damaged because it impacted the earth at high velocity. This occurred because the parachute that was intended to slow the descent to allow for a midair recovery by helicopter failed to deploy.

Post-accident investigation determined that the parachute was never triggered to deploy because gravity switches were installed backwards. The backward installation occurred for several reasons: the design was flawed, the design review process didn’t detect the error and the testing performed didn’t detect the error.
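The chain of "why" answers above can also be written down as a small data structure. The sketch below is only an illustration, not part of the Cause Mapping method itself or of NASA's report: the dictionary layout and the helper names (`CAUSE_MAP`, `render`, `leaf_causes`) are assumptions made for this example, and the cause labels are paraphrased from the text.

```python
# Hypothetical representation: each effect maps to the list of causes named in the text.
CAUSE_MAP = {
    "production goal impacted": ["collection capsule damaged"],
    "collection capsule damaged": ["impact with the earth at high velocity"],
    "impact with the earth at high velocity": ["parachute failed to deploy"],
    "parachute failed to deploy": ["gravity switches installed backwards"],
    "gravity switches installed backwards": [
        "design was flawed",
        "design review did not detect the error",
        "testing did not detect the error",
    ],
}


def render(effect, cause_map, depth=0):
    """Return the map as indented lines, one level deeper per 'why'."""
    lines = ["  " * depth + effect]
    for cause in cause_map.get(effect, []):
        lines.extend(render(cause, cause_map, depth + 1))
    return lines


def leaf_causes(effect, cause_map):
    """Collect the causes that have no further cause recorded yet."""
    causes = cause_map.get(effect, [])
    if not causes:
        return [effect]
    found = []
    for cause in causes:
        found.extend(leaf_causes(cause, cause_map))
    return found


if __name__ == "__main__":
    print("\n".join(render("production goal impacted", CAUSE_MAP)))
```

The leaf entries are simply where the questioning stops in this write-up; a more detailed investigation would keep asking "why" beneath each of them.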

Luckily, the impact to the production goal has been less significant than it might have been in this case. The collection capsule was cushioned somewhat by the soft ground, and while desert dirt entered the capsule, liquid water did not. The solar wind particles were embedded in the collection materials, and most of the contaminating dirt could be removed. NASA has been able to retrieve significant amounts of data from the mission.

NASA’s Mishap Report can be downloaded for free for additional information on the incident.

A one-page PDF showing a high-level Cause Map of the incident can be downloaded by clicking on the button above.

Sugar Dust Explosion

By ThinkReliability Staff

On February 7, 2008, an explosion at the sugar refinery in Port Wentworth, Georgia resulted in the deaths of 14 workers. It also injured 36 and caused significant damage to the refinery. Immediately following the incident, we began a very simple root cause analysis, leaving the more detailed analysis for when the Chemical Safety Board (CSB) report was released and more detailed information could be found. The CSB final draft report was recently issued, and with the information it contains, we can add more detail to our Cause Map.

We can begin our analysis by starting with a goal that was impacted and using the “5-whys” approach. The 14 deaths and 36 injuries were caused by the propagation of secondary explosions and fire. The secondary explosions and fires were caused by a primary explosion, which was caused by an explosive concentration of sugar dust, which was caused by inadequate housekeeping.

From here we can add more detail to our map.  For example, difficulty evacuating the plant was also a cause of the deaths and injuries.  The difficulty was caused by having no evacuation drills, and using cell phones and radios to communicate instead of an intercom or emergency alert system.

In order for the explosions to propagate, they needed additional fuel.  This was found in the accumulated sugar dust in open areas of the plant, due to inadequate housekeeping, and a dust removal system that was not functioning properly and had ducts filled with sugar dust.

Since “inadequate housekeeping” has now come up twice on our map, let’s expand on that a little.  There was a lack of awareness of the hazards of sugar dust.  The facility risk assessment did not address these hazards, there was very little training on dust hazards, and there was little regulatory oversight which might have created more awareness or cleanliness requirements.  OSHA’s hazardous dust safety standards were limited to grain, and the State of Georgia had no regulations addressing dust.  (Both of these issues are in the process of being fixed.)

Although the sugar dust accumulated due to a lack of housekeeping, it also had to be contained to reach explosive concentrations. That containment was provided by steel panels installed around the conveyor, which were designed to protect the sugar from contamination. The dust also required an ignition source. Due to the extensive damage, the CSB was not able to pinpoint the ignition source.

The CSB identified several solutions that would mitigate the risk of future incidents. Some of these solutions are for Imperial Sugar to implement at this site, such as holding evacuation drills, increasing training on dust hazards, improving the housekeeping program, and installing (and using) an intercom system. As discussed above, OSHA and the State of Georgia are implementing standards and regulations to decrease the chances of a dust explosion in their jurisdictions. Also, the CSB has recommended that the company that performed the risk assessment at Imperial Sugar consider dust hazards as a risk.

Click on “Download PDF” above to see all the information discussed above in a visual form.

Learn more about dust explosions.

The Space Junk Problem

By Kim Smiley

The Defense Advanced Research Projects Agency (known as DARPA) issued a request for ideas on how to clean up orbital debris, commonly known as space junk, last week. The term space junk refers to all the objects currently in orbit around earth that no longer serve a useful purpose.

Why would DARPA want to put effort into removing space junk?  Why is it a problem?

A root cause analysis of this issue can be performed.  The first step is to identify the problem.  Then the investigation can be documented as a Cause Map and the causes contributing to the space junk problem should be investigated. In this case, the problem is that space junk poses a threat to unmanned and manned spacecraft, including satellites.

Space junk comes from a variety of sources (which will be discussed later) and in a wide variety of sizes. Impacts with large debris (greater than 1 kilogram) can destroy spacecraft at orbital velocities. The only protection currently available is to move the spacecraft out of the path of space junk. Impacts with tiny debris cause erosion damage and can substantially shorten the life span of spacecraft. Solar panels and windows are especially vulnerable to this type of damage.

Destroyed spacecraft then become part of the problem as long as they remain in orbit as defunct space junk themselves.

In addition to nonfunctioning, dead spacecraft, some of the causes of space junk are boosters from past spacecraft launches, lost equipment, and debris from weapons testing. These causes should all be added to the Cause Map.

The problems associated with space junk continue to increase as more and more debris is created in earth’s orbit.

The largest space debris incident in history occurred in 2007 after China performed an anti-satellite missile test and intentionally blew up a defunct satellite. This test also targeted a satellite in the most heavily populated area of earth’s orbit.

Currently, the Space Surveillance Network tracks more than 20,000 objects in orbit.  And this number only includes those large enough to track.  There are estimated to be thousands of objects too small to track currently in orbit.

Hopefully DARPA is able to find an effective solution to mitigate the problem and reduce the risk posed by space junk.

Click on the “Download PDF” button above to view an intermediate-level Cause Map of this issue.