
App Takes Down National Weather Service Website

By Kim Smiley

The National Weather Service (NWS) website was down for hours on August 25, 2014.  Emergency weather alerts such as tornado warnings were still disseminated through other channels, but this issue raises questions about the robustness of a vital website.

This issue can be analyzed by building a Cause Map, a visual format for performing a root cause analysis.  Cause Maps are built by laying out all the causes that contributed to a problem to show the cause-and-effect relationships.  The idea is to identify all the causes (plural), not just THE one root cause.

This example is a good illustration of the potential danger of focusing on a single root cause.  The NWS website outage was caused by an abusive Android app that bogged the site down with excessive traffic.  The app was designed to provide current weather information and it pulled data directly from the forecast.weather.gov website.  The app inadvertently queried the website thousands of times a second because of a programming error and the website was essentially overwhelmed.  It was similar to the denial of service attacks that have been directed at websites such as Bank of America and Citigroup, but the spike in traffic in this case wasn’t deliberate.

It may be tempting to say that the app was the root cause. Or you could be more specific and say the programming error was the root cause.  But labeling either of these “the root cause” would imply that the problem is solved once the software error is fixed. The root cause is gone, no more problem…right?  To address the immediate issue, NWS installed a filter to block the excessive queries and worked with the app developer to ensure the error was fixed, but there are other factors that must be considered to effectively reduce the risk of a similar problem recurring.

One of the things that must be considered in this example is why a filter that blocked denial of service attacks wasn’t already in place.  Flooding a website with excessive traffic is a well-known strategy of hackers.  If an app could accidentally take the site down for hours, it is worrisome to consider what somebody with malicious intent could do.  The NWS is responsible for disseminating important safety information to the public and needs a reasonably robust website.  To reduce the impact of a similar issue in the future, the NWS needs to evaluate the protections it has in place for its website and determine whether any other safeguards should be implemented beyond the filter that addressed this specific issue.
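To make the idea of a filter for excessive queries concrete, here is a minimal sketch of one common approach: a per-client sliding-window rate limiter. This is an illustration only. The article does not describe how the NWS filter actually works, and the class name, client identifiers and limits below are all hypothetical.

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `max_requests` per client within any `window_seconds` span.

    Hypothetical illustration; not the filter the NWS installed.
    """

    def __init__(self, max_requests: int, window_seconds: float):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        # Maps a client id to the timestamps of its recently allowed requests.
        self._hits = defaultdict(deque)

    def allow(self, client_id: str, now: float) -> bool:
        hits = self._hits[client_id]
        # Discard timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # Over the limit: reject this request.
        hits.append(now)
        return True


# Assumed limit for the example: 5 requests per second per client.
limiter = SlidingWindowLimiter(max_requests=5, window_seconds=1.0)

# A misbehaving client sends 1,000 requests within a single second.
burst = [limiter.allow("buggy-app", now=0.001 * i) for i in range(1000)]
allowed_in_burst = sum(burst)

# Other clients are unaffected, and the noisy client recovers once it slows down.
normal = limiter.allow("other-client", now=0.5)
after = limiter.allow("buggy-app", now=2.0)

print(allowed_in_burst, normal, after)
```

Counting per client means one runaway app is throttled without cutting off everyone else, which is the behavior a site serving public safety information would want. A production filter would typically sit in front of the web servers rather than in application code.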

If the investigation were focused too narrowly on a single root cause, the entire discussion of cyber security could be missed.  Building a Cause Map of many causes ensures that a wider variety of solutions is considered, which can lead to more effective risk prevention.

To view a high level Cause Map of this issue, click on “Download PDF” above.

Department of Energy Cyber Breach Affects Thousands, Costs Millions

By ThinkReliability Staff

Personally identifiable information (PII), including social security numbers (SSNs) and banking information, for more than 104,000 individuals currently or formerly employed by the Department of Energy (DOE) was accessed by hackers from the Department’s Employee Data Repository database (DOEInfo) through the Department’s Management Information System (MIS).  A recently released special report by the DOE’s Inspector General analyzes the causes of the breach and provides recommendations for preventing or mitigating future breaches.

The report notes that, “While we did not identify a single point of failure that led to the MIS/DOEInfo breach, the combination of the technical and managerial problems we observed set the stage for individuals with malicious intent to access the system with what appeared to be relative ease.”  Because of the complex interactions between the systems, personnel and safety precautions (or lack thereof) that led to system access by hackers, a diagram showing the cause-and-effect relationships can be helpful.  Here those relationships – and the impacts the breach had on the DOE and DOE personnel – are captured within a Cause Map, a form of visual root cause analysis.

In this case, the report uncovered concerns that other systems were at risk for compromise – and that a breach of those systems could impact public health and safety.  The loss of PII for more than 104,000 personnel can be considered an impact to the customer service goal.  The event (combined with two other cyber breaches since May 2011) has resulted in a loss of confidence in cyber security at the Department, an impact to the mission goal.  Affected employees were given 4 hours of authorized leave to deal with potential impacts from the breach, impacting both the production and labor goals.  (Labor costs for recovery and lost productivity are estimated at $2.1 million.)  The Department has paid for credit monitoring and established a call center for the affected individuals, at an additional cost of $1.6 million, bringing the total cost of this event to $3.7 million.  With an average of one cyber breach a year for the past 3 years, the Department could be looking at multi-million dollar annual costs related to cyber breaches.

These impacts to the goals resulted from hackers gaining access to unencrypted PII.  Hackers were able to gain access to the system, which was unencrypted and contained significant amounts of PII, as this database was the central repository for current and former employees.  The PII within the database included SSNs, which were used as identifiers, though this is contrary to Federal guidance.  For reasons that are unknown, there appeared to have been no effort to remove SSNs as identifiers, despite a 5-year-old requirement to do so.  The system appears to have remained unencrypted because of performance concerns, though these were not well documented or understood.

Hackers were able to “access the system with what appeared to be relative ease” because the system had inadequate security controls (only a user name and password were required for access), and could be directly accessed from the internet, presumably in order to accomplish necessary tasks.  In the report, the ability to access the system was directly related to “continued operation with known vulnerabilities.”  This concept may be familiar to many at a time when most organizations are trying to do more with less.  Several factors were raised as reasons for the delay in addressing the vulnerabilities that contributed to the breach: a perceived lack of authority to restrict operation; unclear responsibility for applying patches; vulnerabilities that remained unknown because development, testing, troubleshooting and ongoing scanning of the system were limited; and cost.

According to the report, “The Department should have considered costs associated with mitigating a system breach … We noted the Department procured the updated version in March 2013 for approximately $4,200. That amount coupled with labor costs associated with testing and installing the upgrade were significantly less than the cost to mitigate the affected system, notify affected individuals of the compromise of PII and rebuild the Department’s reputation.”

The updated system referred to was purchased in March 2013, though the system had not been updated since early 2011 and core support for the application upon which the system was built ended in July 2012.  Additionally, “the vulnerability exploited by the attacker was specifically identified by the vendor in January 2013.”  The update, though purchased in March, was not installed until after the breach occurred.  Officials stated that a decision to upgrade the system had not been made until December 2012 because it had not reached the end of its useful life.  The Inspector General’s note about considering the costs of mitigating a system breach is pointed, comparing the several-thousand-dollar cost of an on-time upgrade to the several-million-dollar cost of mitigating a breach.  However, like the DOE, many companies find themselves in the same situation, cutting costs on prevention and paying far higher costs to deal with the problems that eventually arise.
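The scale of that comparison can be worked out from the figures cited in this post. The short calculation below is a rough sketch, not a full cost model: it uses only the $4,200 purchase price for the upgrade, since the labor cost of testing and installing it is not quantified here, and the variable names are chosen for illustration.

```python
# Figures as cited in this post from the Inspector General's report (US dollars).
labor_and_lost_productivity = 2_100_000
credit_monitoring_and_call_center = 1_600_000

# Purchase price only; labor to test and install the upgrade is NOT included
# because this post does not give a figure for it.
upgrade_purchase_price = 4_200

breach_cost = labor_and_lost_productivity + credit_monitoring_and_call_center
ratio = breach_cost / upgrade_purchase_price

print(f"Estimated breach cost: ${breach_cost:,}")
print(f"Breach cost is roughly {ratio:.0f}x the upgrade purchase price")
```

Even if testing and installation labor multiplied the upgrade cost many times over, the gap between prevention and mitigation would remain large.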

To view the Outline, Cause Map and recommended solutions based on the DOE Inspector General’s report, please click “Download PDF” above.