Tag Archives: Investigation

Plane Dive Caused by Personal Camera Results in Court-Martial

By ThinkReliability Staff

On February 9, 2014, a Royal Air Force Voyager was transporting 189 passengers and a crew of 9 towards Afghanistan when the plane suddenly entered a steep dive. Many passengers were unrestrained and were injured by striking the ceiling or other objects. Other passengers were injured by flying objects or spills of hot liquid. More than 30 passengers and crew members reported injuries, all considered minor. The Military Aviation Authority’s final report contains details of the impacts from the dive, the causes of the dive, and recommendations that would reduce the possibility of a similar issue in the future.

These impacts, the cause-and-effect relationships that led to them, and the recommended solutions can be captured within a Cause Map. The Cause Map process begins with filling in a Problem Outline, which captures the what, when and where of an incident, followed by the impacts to the goals. The problem covered by the report is the aircraft dive and resulting injuries which occurred on February 9, 2014 at about 1549 (3:49 PM) on an Airbus A330-243 Voyager tanker air transport flight. Things that were different, unusual or unique at the time of the incident are also captured. In this case, the plane had experienced prior turbulence, and the co-pilot was not in his seat at the time of the dive.

The next step is to capture the impacts to the goals on the Outline. In this case, the safety goal is impacted because of a significant potential for fatalities, as well as the more than 30 actual injuries. Customer service is impacted due to the steep dive of the plane, and the regulatory goal is impacted due to the court-martial of the pilot, as well as 10 lawsuits against the Ministry of Defence. The production goal is impacted because the plane was grounded for 12 days, the property goal is impacted because of the potential for the loss of the whole plane, and the labor goal is impacted by the investigation.

Beginning with an impact to the goal, all the cause-and-effect relationships that led to that impact are captured on the Cause Map. In this case, the potential for fatalities resulted from the potential loss of the plane. According to Air Marshal Richard Garwood, previous director general of the UK’s Military Aviation Authority (MAA), “On this occasion, the A330 automatic self-protection systems likely prevented a disaster of significant scale. The loss of the aircraft was not an unrealistic possibility.” The potential for the loss of the plane resulted from the steep dive. The reason the plane was NOT lost (which makes this a significant near miss) is that the plane was recovered to level flight by the flight envelope protection system, which functioned as designed. (Although this is a positive, not a negative, it’s a cause all the same and should be included in the Cause Map.)

The steep dive resulted from the controller being forced forward without being counteracted. These are two separate causes that resulted in the effect, and are listed vertically and joined with an “AND” on the Cause Map. More detail should be provided about both causes. The command could not be counteracted because the co-pilot was not on the flight deck. He had been taking a break for several minutes before the incident. The investigation found that the controller was forced forward by a camera that was pushed against it. The camera had been placed between the seat and the controller, and then the seat was moved forward, towards the controller (as normally occurs during flight).
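The “AND” relationship described above can be sketched as a small data structure. This is only an illustration in Python, not part of the Cause Mapping method or the MAA report; the `Cause` class, its field names, and the `outline` helper are hypothetical, and the cause wording is paraphrased from the paragraph above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Cause:
    """One box on a Cause Map (hypothetical structure, for illustration only)."""
    text: str
    # Causes that were ALL required to produce this effect (joined with AND).
    and_causes: List["Cause"] = field(default_factory=list)

    def outline(self, depth: int = 0) -> List[str]:
        """Return the map as indented lines: the effect first, its causes beneath."""
        lines = ["  " * depth + self.text]
        for i, cause in enumerate(self.and_causes):
            if i > 0:
                # Sibling causes of the same effect are joined with AND.
                lines.append("  " * (depth + 1) + "AND")
            lines.extend(cause.outline(depth + 1))
        return lines


steep_dive = Cause(
    "Steep dive",
    and_causes=[
        Cause("Controller forced forward",
              and_causes=[Cause("Camera pushed against controller")]),
        Cause("Command not counteracted",
              and_causes=[Cause("Co-pilot not on flight deck")]),
    ],
)

cause_map_lines = steep_dive.outline()
print("\n".join(cause_map_lines))
```

Each further “why” question would add another entry under the relevant cause, which is how the map grows in detail.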

The investigation found that, despite concerns for about a year prior to this incident, loose personal articles were not prohibited on the flight deck. While there was a requirement to stow loose articles, it was not referenced in the operational manual and instead became one of thousands of paragraphs provided as background, resulting in a lack of awareness of controller interference from loose articles. The pilot was found to be using the camera while on the flight deck, likely due to boredom on the highly automated plane. (Analysis of the camera and flight recordings provided evidence.) The pilot was court-martialed for “negligently performing a duty, perjury and making a false record”, presumably at least partially due to the use of a personal camera while solo on the flight deck.

The report provided many recommendations as a result of the investigation, including increasing seat belt use by passengers and crew during rest periods, which would have reduced some of the injuries caused by unrestrained personnel striking the ceiling of the aircraft. Recommendations also included ensuring manufacturer’s safety advice is included in operational documents, promoting awareness of the danger of loose articles, and maximizing use of storage for loose articles, all of which aim to reduce the risk of loose articles contacting control equipment. An additional recommendation is to manage low in-flight pilot workload in an attempt to combat the boredom that can be experienced on long flights.

To view the Problem Outline, Cause Map, and recommendations, please click “Download PDF” above. Or click here to read the Military Aviation Authority’s report.

Train Derails on Track Just Inspected

By ThinkReliability Staff

A train derailment in the Columbia River Gorge near Mosier, Oregon resulted in a fire that burned for 14 hours. The Federal Railroad Administration (FRA) preliminary investigation says the June 3rd derailment was caused by a broken lag bolt which allowed the track to spread, resulting in the 16-car derailment. Although there is only one other known instance of a broken lag bolt causing a train derailment, the FRA determined that the bolt had been damaged for some time, and had been inspected within days of the incident, raising questions about the effectiveness of these inspections.

Determining all the causes of a complex issue such as a train derailment can be difficult, but doing so will provide the widest selection of possible solutions. A Cause Map, or visual root cause analysis, addresses all aspects of the issue by developing cause-and-effect relationships for all the causes based on the impacts to an organization’s goals. We can create a Cause Map based on the preliminary investigation. Additional causes and evidence can be added to the map as more detail is known.

The first step in the Cause Mapping process is to determine the impacts to the organization’s goals. While there were no injuries in this case, the massive fire resulting from the derailment posed a significant risk to responders and nearby citizens, an impact to the safety goal. The release of 42,000 gallons of oil (although much of it was burned off in the fire) is an impact to the environmental goal. The customer service goal is impacted by the evacuation of at least 50 homes and the regulatory goal is impacted by the potential for penalties, although the National Transportation Safety Board (NTSB) has said it will not investigate the incident. The state of Oregon has requested a halt on oil traffic, which would be an impact to the schedule goal. The property goal is impacted by the damage to the train cars, and the labor/time goal is impacted by the response and investigation.

The analysis, which is the second step in the Cause Mapping process, begins with one of the impacted goals and develops cause-and-effect relationships by asking “Why” questions. In this case, the safety goal is impacted by the high potential for injuries. This is caused by the massive fire, which burned for 14 hours. There may be more than one cause resulting in an effect, such as a fire, which is caused by heat, fuel, and oxygen. The oxygen in this case is from the atmosphere. The heat source is unknown but could have been a spark caused by the train derailment. The fire was fueled by the 42,000 gallons of crude released when the derailment damaged train cars, which were transporting crude from the Bakken oil fields.

The derailment of 16 cars of the train was caused by the broken lag bolt. Any mechanical failure, such as a break, results from the stress on that object exceeding the strength of the object. In this case, the stress was caused by the weight of the 94-car train. The length of a train carrying crude oil is not limited by federal regulations. The strength of the bolts was reduced due to previous damage, which was not identified prior to the failure. While the track strength is evaluated every 18 months by the Gauge Restraint Measurement System (GRMS), it did not identify the damage. It is unclear when the evaluation was last performed.

Additionally, although the track is visually inspected twice a week by the railroad, it is done by vehicle, which would have made the damage harder to spot. The FRA does not require walking inspections. Nor does the FRA inspect or review the railroad’s inspections very often – there are fewer than 100 inspectors for the 140,000 miles of track across the country. There are only 3 in Oregon.

As a result of the derailment, the railroad has committed to replacing the existing bolts with heavy-duty ones, performing GRMS four times a year, conducting enhanced hyrail inspections and visual track inspections three times a week, and performing walking inspections on lag curves monthly.

The FRA is still evaluating actions against the railroad and is again calling for the installation of advanced electronic brakes, or positive train control (PTC). It has also recommended PTC after other incidents, such as the deaths of two railroad workers on April 3 (see our previous blog) and the derailment in Philadelphia last year that killed 8 (see our previous blog).

To view a one-page PDF of the Cause Mapping investigation, click on “Download PDF” above. Or, click here to read the FRA’s preliminary investigation.

Don’t Just Google It . . . Maps Error Leads to Wrong House Being Demolished

By ThinkReliability Staff

Imagine coming “home” and finding an empty lot. That’s what happened in Rowlett, Texas on March 22, 2016. A tornado had previously damaged many of the homes in the area; some were slated for repairs, and some for demolition. The demolition company had plans to level the duplex at 7601 Cousteau Drive, but instead demolished the duplex at 7601 Calypso Drive.

An error on Google Maps has been blamed for the mistake but, as is typical with these types of incidents, there’s more to it than that. To ensure that all the causes leading to an incident are identified and addressed, it’s important to methodically analyze the issue. Creating a Cause Map, a form of root cause analysis that creates a map of cause-and-effect relationships, is one way a problem can be analyzed.

The first step in the Cause Mapping process is to capture the what, when and where of an incident. Along with the geographic location (where the incident occurred) and process location (what was being done at the time), it can be helpful to capture any differences about the situation surrounding the incident. In this case, “differences” would be anything out of the ordinary during the demolition of the house at 7601 Cousteau/Calypso. The error on Google Maps (which pointed to the house which was mistakenly demolished) is one difference. Another difference is that the name of the street was not checked during the location confirmation. Other potential differences between this demolition job and other demolition jobs were that the same house number was present on both streets, in close proximity, and both houses experienced tornado damage. These differences may or may not be causally related – at this point, potential differences are just captured.

The next step is to capture the impacts to the organization’s goals as a result of the incident. These impacts to the goals become the first effects in the cause-and-effect relationships. In this case, there’s a potential for injuries (an impact to the safety goal) as a result of an unexpected demolition. The demolition of a house planned to be repaired is an impact to the environmental, customer service, and property goals. The demolition of the wrong house is an impact to the production/schedule and labor/time goals.

The analysis begins with one of the impacted goals. Asking “why” questions develops cause-and-effect relationships. For example, the demolition of the wrong house was caused by the duplex at 7601 Calypso Drive being demolished while the duplex at 7601 Cousteau was planned for demolition. Because both of these facts (which can be verified with evidence) resulted in the wrong house being demolished, they are both connected to the effect of “demolition of wrong house” and joined with an “AND”.

Each cause on the map is also an effect. More detail can be added to the Cause Map by continuing to ask “why” questions. However, one cause may not be sufficient to result in an effect, so questions such as “what else was required?” are also necessary to ensure all causes are present on the map. In this case, the crew went to the wrong house because of an error on Google Maps, which was used to find the house. Per a Google spokeswoman, 7601 Cousteau was shown at the location of 7601 Calypso. This error has been identified as “the cause” of the incident. However, there were other opportunities to catch the error. Opportunities that were missed are also causes in the cause-and-effect relationship. While there was a site confirmation prior to demolition, only the street number (7601), lot location (corner lot), and tornado damage were confirmed. All three of these data points used to confirm the location were the same for 7601 Cousteau and 7601 Calypso.
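The weakness of the site confirmation can be shown with a short sketch. The attribute values below come from the description above; the dictionary layout, the `confirm_site` function, and the variable names are hypothetical and are not the demolition company’s actual procedure.

```python
# The address planned for demolition and the address the crew actually went to.
# Values are taken from the account above; the structure is an assumption.
planned = {"number": "7601", "street": "Cousteau", "lot": "corner", "tornado_damage": True}
at_site = {"number": "7601", "street": "Calypso", "lot": "corner", "tornado_damage": True}


def confirm_site(planned, at_site, fields):
    """Return True when every checked field matches between the two records."""
    return all(planned[name] == at_site[name] for name in fields)


# The three data points that were reportedly confirmed before demolition.
checked = ["number", "lot", "tornado_damage"]

# The check as performed passes even though the crew is at the wrong house.
original_check = confirm_site(planned, at_site, checked)

# Adding the street name is enough to tell the two houses apart.
stronger_check = confirm_site(planned, at_site, checked + ["street"])

print(original_check, stronger_check)
```

A confirmation step only helps if at least one of the checked fields differs between the right location and the plausible wrong ones.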

What hasn’t been mentioned in the news but is apparent from looking at a (corrected) Google Map is that the house-numbering scheme of the neighborhood was set up for failure. 7601 Calypso is on the corner of Calypso Drive and Cousteau Drive, meaning a person could easily believe it was 7601 Cousteau. 7601 Cousteau is just a block away, on the corner of Cousteau Drive and an apparently unnamed alley. I can’t imagine it is the first time that someone has confused the two.

While it’s too late for 7601 Calypso Drive, Google Maps has fixed the error. In the future, this demolition company will likely use another identifier (or will mark the house while talking to the homeowners prior to the demolition) to ensure that the wrong house is not destroyed.

To view the Cause Map, as well as the updated Google Map, click on “download PDF” above.

DC Metro shut down for entire day after fire for inspections

By Kim Smiley 

A fire in a DC Metro tunnel early on March 14, 2016 caused delays on three subway lines and significant disruption to both the morning and evening commutes.  There were no injuries, but the similarities between this incident and the deadly smoke incident on January 12, 2015 (see our previous blog on this incident) led officials to order a 24-hour shutdown of the entire Metro system for inspections and repairs.

The investigation into the Metro fire is still ongoing, but the information that is known can be used to build an initial Cause Map.  A Cause Map is built by asking “why” questions and visually laying out all the causes that contributed to an incident.  Cause Mapping an issue can identify areas where it may be useful to dig into more detail to fully understand a problem and can help develop effective solutions.

So why was there a fire in the Metro tunnel? Investigators have not released details about the exact cause, but have stated that the fire was caused by issues with a jumper cable. Jumper cables are used in the Metro system to bridge gaps in the third rail, essentially functioning as extension cords. The Metro system uses gaps in the third rail to create safer entry and exit spaces for both workers and passengers because of the potential danger of contact with the electrified third rail. The third rail carries 750 volts of electricity used to power Metro trains and could cause serious injury or even death if accidentally touched.

The jumper cables also carry high voltage, and fires and/or smoke can occur if one malfunctions. Investigators have not confirmed the exact issue that led to this fire, but insulation failures have been identified in other locations and are a possible cause of the fire. (Possible causes can be added to the Cause Map with a “?” to indicate that more evidence is needed.)
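The “?” convention for possible causes can be sketched in a few lines. This is only an illustration; the `MappedCause` class, its fields, and the evidence wording are hypothetical and simply paraphrase what has been reported above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MappedCause:
    """A cause box plus the evidence supporting it (hypothetical structure)."""
    text: str
    evidence: Optional[str] = None  # None means no supporting evidence yet

    def label(self) -> str:
        # Causes without evidence are shown with a trailing "?" on the map.
        return self.text if self.evidence else self.text + " ?"


causes = [
    MappedCause("Fire in Metro tunnel", evidence="Reported incident, March 14, 2016"),
    MappedCause("Issue with jumper cable", evidence="Stated by investigators"),
    MappedCause("Insulation failure"),  # possible cause, not yet confirmed
]

labels = [cause.label() for cause in causes]
print(labels)
```

Once investigators confirm or rule out the insulation failure, the evidence field would be filled in or the cause removed from the map.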

One of the things that is always important to consider when investigating an incident is the frequency of occurrence of similar issues.  The scope of the investigation and possible solutions considered will likely be different if it was the 20th time an incident has occurred rather than the first. In this case, the fire was similar to another incident in January 2015 that caused a passenger death.  Having a second incident occur so soon after the first naturally raised questions about whether there were more unidentified issues with jumper cables.  The Metro system uses approximately 600 jumper cables and all were inspected during the day-long shutdown. Twenty-six issues were identified and repaired. Three locations had damage severe enough that Metro would have immediately stopped running trains through them if the extent of the damage had been known.

The General Manager of the DC Metro system, Paul J. Wiedefeld, is relatively new to his position and has been both praised and criticized for the shutdown. Trying to implement solutions and reduce risk is always a balancing act between costs and benefits. Was the cost of a full-day shutdown and inspections of all jumper cables worth the benefit of knowing that the jumper cables have all been inspected and repaired? At the end of the day, it’s a judgement call, but I personally would be more comfortable riding the Metro with my children now.

Heavy metal detected in moss in Portland

By Kim Smiley

Residents and officials are struggling to find a path forward after toxic heavy metals were unexpectedly found in samples of moss in Portland, Oregon. According to the U.S. Forest Service, the moss was sampled as part of an exploratory study to measure air pollution in Portland.  The objective of the study was to determine if moss could be used as a “bio-indicator” of hydrocarbons and heavy metals in air in an urban environment.  Researchers were caught off guard when the samples showed hot spots of relatively high heavy metal levels, including chromium, arsenic, and cadmium (which can cause cancer and kidney malfunction).  Portland officials and residents are working to determine the full extent of the problem and how it should be addressed.

So where did the heavy metals come from?  And how is it that officials weren’t already aware of the potential issue of heavy metals in the environment? The investigation into this issue is still ongoing, but an initial Cause Map can be built to document what is known at this time.  A Cause Map is built by asking “why” questions and visually laying out all the causes that contributed to the problem.  (Click on “Download the PDF” to view the initial Cause Map.)

Officials are still working to verify where the heavy metals are coming from, but early speculation is that nearby stained-glass manufacturers are the likely source. Heavy metals are used during the glass manufacturing process to create colors. For example, cadmium is used to make red, yellow and orange glass and chromium is used to make green and blue glass. The hot spots where heavy metals were detected surround two stained-glass manufacturers, but there are other industrial facilities nearby that may have played a role as well. There are still a lot of unknowns about the actual emissions from the glass factories because no testing has been done up to this point. Testing was not required by federal regulations because of the relatively small size of the factories. If the heavy metals did in fact originate from the glass factories, many hard questions about the adequacy of current emissions regulations and testing requirements will need to be answered.

Part of the difficulty of this issue is understanding exactly what the impacts from the potential exposure to heavy metals might be.  Since the levels of heavy metals detected so far are considered below the threshold of “acute”,  investigators are still working to determine what the potential long-term health impacts might be.

A long-term benefit of this mess is the validation that moss can be used as an indicator of urban air quality. Moss has been used as a “bio-indicator” for air quality since the 1960s in rural environments, but this is the first attempt to sample moss to learn about air quality in an urban setting. As moss is plentiful and testing it is relatively inexpensive, this technique may dramatically improve testing methods used in urban environments.

Both glass companies have voluntarily suspended working with chromium, cadmium and arsenic in response to a request by the Oregon Department of Environmental Quality.  The DEQ has also begun additional air monitoring and begun sampling soil in the impacted areas to determine the scope of the contamination. As officials gain a better understanding of what is causing the issue and what the long-term impacts are, they will be able to develop solutions to reduce the risk of similar problems occurring in the future.

Failure of the Nipigon River Bridge

By Kim Smiley

On the afternoon of January 10, 2016, the deck of the Nipigon River Bridge in Ontario unexpectedly shifted up about 2 feet, closing the bridge to all vehicle traffic for about a day.  After an inspection by government officials and the addition of 100 large cement blocks to lower the bridge deck, one lane was reopened to traffic, with the exception of oversized trucks. Heavier trucks are required to detour around the bridge with the main alternative route requiring crossing into the United States.  This failure is still being investigated and it isn’t known yet when it will be safe to open all lanes on the bridge.

More information is needed to understand all the details that led to this failure, but an initial Cause Map, a visual root cause analysis, can be built to illustrate what is currently known. The first step in the Cause Mapping process is to fill in the Outline to document the basic background information (the what, when and where) and the impacts to the organization’s goals resulting from the issue.  For this example, the bridge was damaged and significant resources will be needed to investigate the failure and repair the bridge.  The closure of the bridge, and subsequently having only a single open lane, is also having a sizable impact on transportation of both people and goods in the area.  It is estimated that about $100 million worth of goods are moved over the bridge daily and there are limited alternative routes.

Once the Outline is completed, the Cause Map is built by asking “why” questions and visually laying out the cause-and-effect relationships. Why did the deck of the bridge shift up? Investigators still don’t have the whole answer. The Nipigon River Bridge is a cable-stayed bridge, and bolts holding the bridge cables failed, resulting in the deck of the bridge being pulled up at an expansion joint. Two independent testing facilities, National Research Council of Canada in Ottawa and Surface Science Western at Western University, are conducting tests to determine the cause of the bolt failures, but no information has been released at this time.

The Nipigon River Bridge is a new bridge that has only been open since November 29, 2015. Some hard questions about the adequacy of the bridge design have been asked because the failure occurred so soon after construction.  Officials have stated that the bridge design meets all applicable standards, but investigators will review the design and structure during the investigation to ensure it is safe.  Ontario winters can be harsh and investigators are going to look into whether cold temperatures and/or wind played a role in the failure.  Eyewitnesses have reported a large gust of wind just prior to the bolt failure.  Investigators will determine what role the wind played.

The Cause Map can easily be expanded to incorporate new information as it becomes available. Once the Cause Map is completed, the final step in the Cause Mapping process is to develop solutions to prevent a similar problem from recurring. In this example, adding the concrete blocks as counterweights allowed one lane of the bridge to be opened in the short term, but clearly a longer-term solution will be needed to repair the bridge and ensure a similar failure does not occur again.

Landslide of construction debris buries town, kills dozens

By ThinkReliability Staff

Shenzhen, China has been growing fast. After a dump site closed in 2013, construction debris from the rapid expansion was being dumped everywhere. In an effort to contain the waste, a former rock quarry was converted to a dump site. Waste at the site reached 100 meters high, despite environmental assessments warning about the potential for erosion. On December 20, 2015, the worries of residents, construction workers and truckers came true when the debris slipped from the quarry, covering 380,000 square meters (or about 60 football fields) with thick soil as much as 4 stories high.

A Cause Map can be built to analyze this issue. One of the steps in the Cause Mapping process is to determine how the issue impacted the overall goals. In this case, the landslide severely impacted multiple goals. Primarily, the safety goal was impacted due to a significant number of deaths. 58 have been confirmed dead, and at least 25 are missing. The environmental goal and customer service goal were impacted due to the significant area covered by construction waste. The regulatory goal is impacted because 11 have been detained as part of an ongoing criminal investigation. The property goal is impacted by the 33 buildings that were destroyed. The labor goal is also impacted, as more than 10,600 people are participating in the rescue effort.

The Cause Map is built by visually laying out the cause-and-effect relationships that contributed to the landslide. Beginning with the impacted goals and asking “Why” questions develops the cause-and-effect relationships. The deaths and missing persons resulted from being buried in construction waste. Additionally, the confusion over the number of missing results from the many unregistered migrants in the rapidly growing area. The area was buried in construction waste when waste spread over a significant area, due to the landslide.

The landslide resulted from soil and debris that was piled 100 meters high, and unstable ground in a quarry. The quarry was repurposed as a waste dump in order to corral waste, which had previously been dumped anywhere after the closure of another dump. Waste and debris were piled so high because of the significant construction debris in the area. There was heavy construction in the area because of the rapid growth, resulting in a lot of debris. Incentives (dumpsite operators make money on each load dumped) encourage a high amount of waste dumping. Illegal dumping also adds to the total.

While an environmental impact report warned of potential erosion, and the workers and truck drivers at the dump registered concerns about the volume of waste, these warnings weren’t heeded. Experts point to multiple recent industrial accidents in China (such as the warehouse fire/ explosion in Tianjin in August, the subject of a previous blog) as evidence of the generally lax enforcement of regulations. Heavy rains contributed to ground instability, as did the height of the debris, and the use of the site as a quarry prior to being a waste dump.

Actions taken in other cities in similar circumstances include charging more for dumping debris in an effort to encourage the reuse of materials and monitoring dump trucks with GPS to minimize illegal dumping. These actions weren’t implemented in Shenzhen prior to the landslide, but this accident may prompt their implementation in the future. Before any of that can happen, Shenzhen has a long way to go cleaning up the construction debris covering the city.

Component Failure & Crew Response, Not Weather, Brought Down AirAsia Flight QZ8501

By Staff

Immediately following the December 28, 2014 crash of AirAsia flight QZ8501, severe weather in the area was believed to have been the cause of the loss of control of the plane. (See our previous blog on the crash.) However, recovery of the “black box” and a subsequent investigation determined that it was a component failure and the crew’s response to the upset condition that resulted in the crash and that weather was not responsible. This is an example of the importance of gathering evidence to support conclusions within an investigation.

Says Richard Quest, CNN’s aviation correspondent, “It’s a series of technical failures, but it’s the pilot response that leads to the plane crashing.” Because, as is common in these investigations, there is a combination of causes that resulted in the crash, it can help to lay out the cause-and-effect relationships. We will do this in a Cause Map, a visual form of root cause analysis. The Cause Map is built by beginning with an impact to the goals, such as the safety goal, and asking “why” questions.

The 162 deaths (all on board) resulted from the plane’s rapid (20,000 feet per minute) plunge into the sea. According to the investigation, the crash resulted from an upset/ stall condition AND the crew’s inability to recover from that condition. Because both of these causes contributed to the crash, they are both connected to the effect (crash) and separated with an “AND”.

More detail can be added to each “leg” of the Cause Map by continuing to ask “why” questions. The prolonged stall/ upset condition resulted from the aircraft being pushed beyond its limits. (It climbed 5,400 feet in about 30 seconds.) This occurred because of manual handling and because of the failure of the rudder travel limiter system, which is designed to restrict rudder movement to a safe range. The system failed due to a loss of electrical continuity from a cracked solder joint on a circuit board. Although maintenance records showed 23 complaints with the system in the year prior to the crash, it was not repaired. A former pilot and member of the investigation team stated it was considered “minor damage” and was “not a concern”.

The plane was being manually controlled because the autopilot and autothrust were disengaged. These systems were disengaged when a circuit breaker was reset (removed and replaced) to attempt to reset the system after a computer system failure (indicated by four alarms that sounded in the cockpit). While this is sometimes done on the ground, it shouldn’t be done in the air because it disengages the autopilot and autothrust systems. However, the crew had inadequate upset recovery training. According to the manufacturer’s manual, the aircraft is designed to prevent it from becoming upset, and therefore such training is not necessary. The decision to manually place the plane in a steep climb is believed to have been an attempt to get out of the poor weather. Just prior to the crash, the less experienced co-pilot was at the controls.

The lack of crew training on upset conditions is also believed to have contributed to the crash. In addition, for at least some time prior to the crash, the pilot and co-pilot were working against each other by pushing their control sticks in opposite directions. The pilot was heard on the voice recorder calling for them to “pull down”, although “pulling” is used to bring the plane up.

The only recommendation that has so far been released is for commercial pilots to undergo flight simulator training for this type of emergency situation. AirAsia has already done so. The company, as well as the aviation industry as a whole, will hopefully look at the conclusions of the investigation report with a very critical eye towards improving safety.

NTSB recommends increased oversight of DC Metro

By Kim Smiley

On September 30, 2015, the National Transportation Safety Board (NTSB) issued urgent safety recommendations calling for the Federal Railroad Administration to take over the task of overseeing the Washington, DC Metro system. The NTSB has determined that the body presently charged with overseeing it (the Tri-State Oversight Committee) doesn’t provide adequate independent safety oversight.  Specifically, the Tri-State Oversight Committee doesn’t have the regulatory power to issue orders or levy fines and lacks enforcement authority.

The recommendations resulted from findings from the ongoing investigation into a smoke and electrical arcing accident in a Metro tunnel that killed one passenger and sent 86 others to the hospital. (To learn more, read our previous blog “Passengers trapped in smoke-filled metro train”.) The severity of the damage done to the components involved in the arcing incident has made it difficult to identify exactly what caused the arcing to occur, but the investigation uncovered problems with other electrical connections in the system that could potentially lead to similar issues if not fixed.

Investigators found that some electrical connections were improperly constructed and/or installed, which puts them at risk of short circuiting because moisture and contaminants may get into them. The issues with the electrical components were not identified prior to this investigation, which raises more questions about the Metro’s inspection and maintenance programs. Although the final report on the incident has not been completed, the NTSB issued recommendations in June to address these electrical short circuit hazards because they required “immediate action” to ensure safety.

Investigators have found other issues with the aging DC Metro system, including leaks that allow significant water into the tunnels, inadequate ventilation and questions about the adequacy of staff training. The final report into the deadly arcing incident will include recommendations that go far beyond fixing one electrical issue on one run of track.

This example is a great illustration of how digging into the details of one specific problem will often reveal information about how to improve reliability across an organization. It may seem overwhelming to tackle organization-wide improvements, but often the best way to start is to investigate one issue and dig down into the details.

Waste Released from Gold King Mine

By Renata Martinez

On August 5, 2015, over 3 million gallons of waste were released from Gold King Mine into Cement Creek, which then flowed into the Animas River. The orange-colored plume moved over 100 miles downstream from Silverton, Colorado, through Durango, reaching the San Juan River in New Mexico and eventually making its way to Lake Powell in Utah (although the EPA stated that the leading edge of the plume was no longer visible by the time it reached Lake Powell a week after the release occurred).

Some of the impacts were immediate.  No workers at the mine site were hurt in the incident but the collapse of the mine opening and release of water can be considered a near miss because there was potential for injuries. After the release, there were also potential health risks associated with the waste itself since it contained heavy metals.

Water sources along the river were impacted, and local wells could potentially be contaminated with the waste. To mitigate the impacts, irrigation ditches that fed crops and livestock were shut down. Additionally, the Animas River was closed to recreation from August 5-14, impacting tourism in Southwest Colorado.

The long-term environmental impacts will be evaluated over time, but it appears that the waste may damage ecosystems in and along the plume’s path. There are ongoing investigations to assess the impact to wildlife and aquatic organisms, but so far the health effects from skin contact or incidental ingestion of contaminated river water are not considered significant.

“Based on the data we have seen so far, EPA and the Agency for Toxic Substances and Disease Registry (ATSDR) do not anticipate adverse health effects from exposure to the metals detected in the river water samples from skin contact or incidental (unintentional) ingestion. Similarly, the risk of adverse effects to livestock that may have been exposed to metals detected in river water samples from ingestion or skin contact is low. We continue to evaluate water quality at locations impacted by the release.”

The release occurred while the EPA was working to stabilize the existing adit (a horizontal shaft into a mine which is used for access or drainage). The weight of a pool of waste in the mine overcame the strength of the adit, releasing the water into the environment. The EPA’s scope of work at Gold King Mine also included assessing the ongoing leaks from the mine to determine if the discharge could be diverted to retention ponds at the Red and Bonita sites.

The wastewater had been building up since the adit collapsed in 1995. There are networks and tunnels that allow water to easily flow between the estimated 22,000 mine sites in Colorado. As water flows through the sites, it reacts with pyrite and oxygen to form sulfuric acid. When the water is not treated, it contacts naturally occurring minerals such as zinc, lead, cadmium, copper and aluminum and breaks down the heavy metals, leaving tailings. The mines involved in this incident were known to have been leaking waste for years. In the 1990s, the EPA agreed to postpone adding the site to the Superfund National Priorities List (NPL), so long as progress was made to improve the water quality of the Animas River. Water quality improved until about 2005, at which point it was re-assessed. Again in 2008, the EPA postponed efforts to include this area on the NPL. From the available information, it’s unclear if this area and the waste pool would have been treated if the site had been on the NPL.

In response, the “EPA is working closely with first responders and local and state officials to ensure the safety of citizens to water contaminated by the spill.” Additionally, retention ponds have been built below the mine site to treat the water, and continued sampling is taking place to monitor the water.

So how do we prevent this from happening again? Mitigation efforts to prevent the release were unsuccessful. This may have been because the amount of water contained in the mine was underestimated. Alternatively, if the amount of water in the mine had been anticipated (and the risk more obvious), perhaps the excavation work could have been planned differently to prevent the collapse of the tunnel. As a local resident, I’m especially curious to learn more facts about the specific incident (how and why it occurred) and how we are going to prevent this from recurring.

The EPA has additional information available (photos, sampling data, historic mine information) for reference: http://www2.epa.gov/goldkingmine