The most critical sectors of the American economy were affected by 245 "cyberincidents" last year, according to the Department of Homeland Security. As high as that number seems, however, security experts caution that the real number may be far higher.
Turns out, there's a huge gulf between the Internet-related attacks the department's Industrial Control System Cyber Emergency Response Team recorded for the country's critical infrastructure – important areas such as energy, manufacturing, agriculture, and healthcare – and the true number of malfunctions, technological failures, and other adverse events within those sectors.
The discrepancy comes down to widespread uncertainty about when something should be classified as a "cyberincident" in the first place.
This lack of consensus, security experts warn, may cause many cyberattacks on critical infrastructure to go undetected or unrecognized altogether, especially since a malicious attack could at first look like a technical glitch or human error.
Joe Weiss, a veteran of the nuclear power sector, is at the vanguard of those arguing that the critical infrastructure sector is failing to adequately account for cyberincidents. Mr. Weiss has been tracking cyberincidents involving infrastructure using the definition laid out by the National Institute of Standards and Technology (NIST). To date, he claims to have identified and detailed almost 400 control system cyberincidents in a database, many of which are not recognized as such by industry.
Generally, NIST considers a cyberincident to be any situation in which a failure in electronic communications leads to a loss of confidentiality, integrity, or availability. Malicious incidents such as a distributed denial of service attack or a hack of control system software certainly qualify, but NIST's definition of “incident” is far broader than just cyberattacks or malicious actions.
What's 'cyber' and what's not
In its annual report for 2014, Homeland Security acknowledged that many malicious cyberincidents go unreported – possibly because critical infrastructure owners are wary of bad publicity, or because they determine that they have the incident under control and do not need outside assistance in managing it.
Those unreported malicious incidents deprive defenders of an important source of data about threats to control systems. But Weiss argues that the dangers posed by nonmalicious cyberincidents are just as substantial. Such events often end up as part of the public record, but their details are buried in reports by government agencies or regulators that do not view them as cyberincidents.
“Incidents don’t have to be malicious to cause bad things to happen,” says Weiss, managing partner at the security firm Applied Control Solutions. “In fact, nonmalicious incidents are the most probable and frequent incidents that occur.”
His list includes some of the deadliest and most destructive public-sector accidents of the past two decades – events that are not generally considered "cyberincidents" by NIST or within critical infrastructure circles. Among them: a 2006 emergency shutdown of Unit 3 at the Browns Ferry nuclear plant in Alabama; the 1999 Olympic pipeline rupture and explosion in Bellingham, Wash., that killed three people; and the 2010 Pacific Gas & Electric gas pipeline explosion in San Bruno, Calif., that killed eight people and destroyed a suburban neighborhood.
None of those incidents are believed to be the result of a malicious attack. However, in each of them, there was a failure in Supervisory Control and Data Acquisition (SCADA) software that managed critical infrastructure. That failure contributed – directly or indirectly – to the subsequent accident. In at least one case, the SCADA failure was the primary cause of the accident.
In the case of the Bellingham pipeline rupture, for example, a report by the National Transportation Safety Board noted that SCADA software used to control gas distribution pipelines became unresponsive and failed to execute a command to activate a pump needed to relieve a gas pressure buildup in a 16-inch distribution pipeline. The resulting extreme pressure buildup ruptured the pipeline, causing a massive gas leak and explosion that killed three people.
Similarly, the devastating San Bruno pipeline explosion in 2010 was the result of a loss of power to a SCADA system that stemmed from a mishap during a scheduled replacement of an uninterruptible power supply. A sudden drop in power caused regulating valves within the network of pipelines to switch from a fully closed position to a fully open position. That allowed gas pressure in a number of supply lines to increase dramatically, leading to the pipeline rupture. The resulting explosion vaporized part of a residential neighborhood and left a crater 72 feet long by 26 feet wide. A 28-foot section of pipe weighing 3,000 pounds was ejected from the crater and thrown 100 feet from the site of the explosion.
By any reading of the NIST guidelines for identifying cyberincidents, all three qualify. But none are explicitly characterized as such, nor are they understood within critical infrastructure circles to be evidence of deadly "cyberincidents."
Should 'cyber' amount to malicious intent?
Weiss says he has found many other similar omissions, a pattern that continues today. One obstacle to properly identifying such incidents is that the popular understanding of a cyberincident borrows too much from the information technology industry, which focuses on malicious actors and software-based threats operating in traditional IT environments.
“In the IT world, ‘cyber’ is equated with malicious attacks,” Weiss said. “You’re worried about a data breach and stolen data, or denial of service attacks.”
That’s too limited for the industrial control sector, where the cause or intent of an adverse incident matters less than its effect – such as a loss of availability. Security – including cybersecurity – is an issue only insofar as it affects the operation of critical systems, Weiss says.
Weiss argues that applying an IT mindset to critical infrastructure leads operators to overlook weaknesses in their systems. “San Bruno wasn’t malicious, but it easily could have been,” Weiss notes. “It’s a nonmalicious event that killed eight people and destroyed a neighborhood.”
However, other experts who advise critical infrastructure firms say there are many explanations for such lapses. For one, critical infrastructure operators face intense pressure from their regulators to report adverse incidents quickly, which may itself be an obstacle to properly identifying cyberincidents and classifying them as such.
“Even in a retail cyberevent, it’s hard to know what happened without really digging in and doing the forensics – and that can take months,” said Mark Weatherford, a principal at The Chertoff Group. However, government regulations such as the Department of Energy’s OE-417 require grid operators to report any qualifying “electric emergency incident and disturbance” within one to six hours of the event.
“There’s such a priority on reporting that I think some organizations are fearful that they’ll report too much too soon,” Mr. Weatherford said.
Weiss declined to allow Passcode to view his list of 400 cyberincidents and says he hasn’t publicized the list. However, he has used his blog to make the case for a broad reading of “cyberincident.”
In a recent post, for example, he suggested a “smoke event” on the Washington Metro that resulted in one death might be properly classified as a cyberincident because a subsequent National Transportation Safety Board report cited aging control system software as a contributor to the event. The software prevented Metro operators from pinpointing the exact location of the smoking wires that were the source of the incident.
Support for a limited cyber label
However, his broad interpretation of cyberincident isn’t widely shared within the industry.
“I believe that lumping everything into a single ‘cyberincident’ category is a step in the wrong direction,” said Dale Peterson of the security firm Digital Bond. “I don't object to the term, but I'm not sure what we have accomplished.”
Instead, Mr. Peterson argues that the critical infrastructure sector needs to start building what he refers to as a “taxonomy of incidents and impacts” that will help critical infrastructure owners learn from others in planning the defense of their own IT and industrial control system networks.
Ralph Langner, an internationally recognized expert on the Stuxnet attack and industrial control systems security, is working on just such a taxonomy to help inject engineering methodologies into the practice of identifying and fixing vulnerabilities in critical infrastructure.
At a recent talk at a Miami security conference, Mr. Langner said that he tries not to speculate on potential attacks or cyber adversaries. “What interests me most are the contributing characteristics of a potential target … . Is there any direct path to disaster? Is there a critical risk? This is what drives our analysis.”
But Peterson doesn’t agree that the "who" or "what" behind a failure in a control system is irrelevant. “Security controls to prevent or detect intentional malicious action would be different than security controls to prevent or detect a failure or operator/engineer error,” he said. “Conflating the two would be similar to saying you should have the same security measures to protect your valuables from a burglar and a hurricane.”
The original version of this story was updated to clarify the type of information recorded by the Industrial Control System Cyber Emergency Response Team.