August 14, 2003 was a hot day in the Northeastern United States, but not extremely so. Power lines carrying electricity to New York, Washington, D.C., Toronto, and other major cities in North America were heavily taxed, but the loads were not unusual for an August afternoon.
On this particular Thursday, however, in rural Ohio, a power line happened to come into contact with a tree limb. This caused a power plant to go offline, then another. The shifting loads caused a Toronto-bound power surge, setting off a chain reaction that eventually brought some 100 power plants down, affecting more than 55 million people in eight US states and Ontario and bringing the Northeast’s commuter trains to a standstill.
In New York City, people walked down dozens of flights of stairs and then many miles more to their homes during the hottest part of a summer day. Air conditioning systems failed, putting the elderly and the ill at risk. Phone service was interrupted as increased demand overloaded cell towers. Some municipal water systems lost pressure, prompting cities to advise their residents to boil their water before using it. Radio stations whose backup generators failed went off the air temporarily, frustrating the efforts of authorities to transmit emergency response instructions. Air transport and financial markets were disrupted. In some remote areas, power was out for a week.
An investigation led authorities to a bug in the software of an Ohio power company’s control room that helped turn what should have been a manageable local blackout into a cascading regional failure. But the outage convinced even skeptics that the electrical grid was operating well beyond its design capabilities. Washington was swift to act, handing new responsibilities to the agencies that regulate operations and investment in the power grid.
Despite our best efforts, widespread power outages persist. Every few years now, a large portion of the US electrical grid collapses in similarly spectacular fashion, disrupting millions of lives. And each time, utility executives are called on the carpet. Billions of dollars are spent to harden the power grid. Laws are enacted to provide assurances that a disruptive power outage “never happens again.”
And still, blackouts do happen, over and over. At the time, the Northeast Blackout of 2003 was the second-largest blackout in world history, but since then, seven outages of even greater severity have occurred, with a 2012 blackout in India affecting one out of ten people on the planet.
Can disruptive blackouts be prevented? Does the ever-increasing complexity of our electrical supply system all but ensure more frequent and more catastrophic failures?
Unicycles and spinning plates
Electric power grids are marvelously complicated and intricate systems, comprising many millions of interconnected turbines, conductors, transmission lines, insulators, switches, and people. They tend to be enormous. The whole of the North American continent is served by just four or five regional grids.
The reasons for this complexity are perfectly sensible. For more than a century it has been cheaper to produce and distribute electricity on a large scale, and cities and states have linked up their own utility grids with the growing network to increase redundancy (which added to their own systems’ reliability) and to make trading in electrons possible. Bit by bit, the most intricate supply system ever created by humans took its form.
As a result, the behavior of our power grid is undeniably and irrevocably complex. The electricity that powers the glowing screen on which you are now reading is the result of millions of interconnected devices working together in a highly synchronized way. Each of these elements behaves individually according to laws of physics that are easy to describe and predict.
But the system as a whole behaves in ways that are impossible to understand just by adding up the behaviors of these predictable parts. In other words, we know how the power grid works in theory. How it manages to work in practice is, even to trained professionals, often a mystery.
This complexity arises from a paradox: power grids are both inherently robust and inherently fragile. Operating a power grid is a bit like that old circus act of balancing spinning plates atop poles while riding a unicycle. Getting it all started is nearly impossible, but once you achieve an equilibrium, with all the plates spinning and the unicycle moving, maintaining that balance is somewhat easier, as long as you keep up the momentum of the various spinning parts.
Of course, this seemingly miraculous balance is delicate; the smallest upset can make the rider wobble. An overcorrection can invite disaster.
The engineers who designed the power grid know this and have built a tremendous amount of redundancy into the grid. On the grid, if a single plate falls and shatters, others are there to take its place. This redundancy makes our electricity supply remarkably reliable.
The behavior of the grid is, in its own way, an emergent one, meaning it arises from the interactions of many parts. But it’s different from the collective behaviors of bee colonies or flocks of birds—in those, order seems to arise without the central control of a single decision-maker. Members of the hive or flock, each following simple, programmed instructions, bring about self-organized, often surprising group-scale behaviors like the undulating beauty of a murmuration of starlings.
The power grid does have some ways of reinforcing its own stability. Because the elements in the power grid are so tightly connected, like gears in a machine, they exert a lot of force on one another. If one element becomes a little unstable, the others can compensate.
But there are limits to the grid’s ability to regulate itself. Like many other complex systems in which self-reinforcing anomalies can amplify themselves into powerful forces, the grid can destabilize itself to the point of destruction. If one element becomes unstable, it can trigger feedbacks that prompt the rest of the grid to over-compensate. The over-compensating equipment then becomes unstable, and the system destroys itself.
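The domino logic described above can be made concrete with a toy model. The sketch below is an illustration only, not a simulation of a real grid: it assumes a handful of lines with made-up loads and capacities, and it assumes that when a line fails its load is split equally among the survivors. Real power flows follow physical laws that this ignores, so it cannot reproduce the way real cascades jump between distant locations. The function name `simulate_cascade` and all of the numbers are hypothetical.

```python
def simulate_cascade(loads, capacities, first_failure):
    """Toy cascade: return the indices of failed lines, in order of failure.

    Assumption (not real grid physics): the load carried by failed lines
    is shared equally among all lines still in service.
    """
    loads = [float(x) for x in loads]
    alive = set(range(len(loads)))
    failed = []
    to_fail = [first_failure]

    while to_fail:
        # Take the newly failed lines out of service and collect their load.
        shed = 0.0
        for i in to_fail:
            alive.discard(i)
            failed.append(i)
            shed += loads[i]
            loads[i] = 0.0

        if not alive:
            break  # nothing left to pick up the load: total blackout

        # Survivors each take an equal share of the shed load.
        share = shed / len(alive)
        for i in alive:
            loads[i] += share

        # Any survivor now above its capacity fails in the next round.
        to_fail = sorted(i for i in alive if loads[i] > capacities[i])

    return failed


if __name__ == "__main__":
    # With generous margins the first failure is absorbed.
    print(simulate_cascade([50, 50, 50, 50], [100, 100, 100, 100], 0))
    # With thin margins the same failure takes down every line.
    print(simulate_cascade([50, 50, 50, 50], [60, 60, 60, 60], 0))
```

Even in this crude picture, the same initial failure is either harmless or catastrophic depending only on how much spare capacity the surviving lines have.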
Thus, as with many complex systems created by humans, such as financial markets, the power grid needs people in the loop to control it and ensure that it operates in a stable manner. But also like financial markets, the modern grid confounds even its most highly skilled operators.
Power grid forensics is hard
In 1997, a major power failure in the Western U.S. affected many millions of people. After months of study, engineers traced the problem to a piece of faulty equipment in southwestern Wyoming. Its failure overloaded two nearby pieces of equipment, so those failed as well, creating a kind of domino effect.
Then something strange happened. The failures of the first three pieces of equipment triggered the failure of a fourth—this one nearly 800 miles away on the Oregon-Idaho border. This failure was followed by a fifth at a substation in southwestern Montana, 400 miles to the east.
The chain of cascading failures played hopscotch throughout the Western states, taking out nearly 30 pieces of critical power equipment until the slide was finally arrested somewhere in Nevada.
There are good reasons each of these pieces of equipment failed. But why the failures jumped around the way they did remains a mystery. Understanding the patterns of failures in cascading blackouts continues to be a challenge for power system operators and for scientists. No existing theory predicts or explains these patterns. The industry is learning more, but we are a long way from having the types of warning systems that are in place for earthquakes, tsunamis, and other disasters.
Are people the problem or the solution?
The Northeast Blackout of 2003 was, in part, instigated by the relatively mundane circumstance of a tree getting in the way of a sagging power line. As the blackout cascaded eastward, my own state of Pennsylvania was largely spared. We escaped the brunt of the blackout while surrounding states did not because the company operating my power grid made an on-the-spot decision to sever ties with the grids of neighboring states.
Which is to say that people and the decisions they make are a big contributor to the power grid’s “robust fragility.” We are the grid’s failsafe and its deepest weakness.
How? The ways people make investment and operations decisions about the power grid reflect two well-known dynamics in complex systems: preventing small localized problems can make future big problems worse, and competition can be more costly than cooperation.
We know from ecology that relentlessly preventing small problems can increase the risk of future big problems. For nearly a century, for example, the policy of the U.S. Forest Service was to prevent all fires and, when that failed, to fight early and aggressively the fires that did start. This was effective at preventing small and medium-sized fires and at protecting property. But it left behind fuel, which accumulated, turning future small fires into much bigger problems.
We have managed the power grid in much the same way. The small number of grids that serve North America are managed by a large number of utility companies, each of which is responsible for a small portion of the overall infrastructure.
Those companies all make localized and quite rational decisions to reduce the frequency of outages within their individual operating footprints, which in practice means making investments to avoid relatively small blackouts in specific areas. This piecemeal approach, however, increases the danger of larger cascading failures with the potential to affect millions of people.
An illustration of the second effect can be seen in the collective behavior of animal herds. Caribou migrating across dangerous ice floes, for example, somehow can make good collective decisions if individuals are all striving towards the same goal—that is, if they cooperate. Similarly, computer simulations of the power grid by my colleagues Paul Hines and Sarosh Talukdar suggest that cooperative decision-making could keep blackouts from spreading if portions of the grid “island” themselves—if individual utilities effectively fall on their swords to save the grid as a whole. The process could be automated, as long as all the equipment on the grid was programmed with the same instructions in a sort of grand cooperative bargain.
But the reality of operating a power grid is very different. Under current policy for much of the U.S., the utility companies that would need to cooperate for such an arrangement to work are in direct economic competition with one another for electricity sales. And even if they weren’t, individual utilities would still need to answer to state regulators for their own actions.
Some analyses have suggested that the 2003 blackout would not have spread so far had one utility company chosen to let the city of Cleveland go dark. In trying to keep the grid up to rescue Cleveland, that utility propagated an unstable situation that spread to several other states and Canada.
And yet, imagine if those same utility executives had made a different decision, one that sacrificed Cleveland in the hopes of a greater good for the Northeastern power grid. I would not want to be in the room when they had to explain to regulators in Ohio why they let some 20 percent of the state’s population sit in the dark without air conditioning in the dead of summer just to save New Yorkers from that fate.
An integrated human-technological system
This collision of imperfect people, policy, and machines is why the power grid has fascinated complexity scientists for some time and has remained one of modern society’s biggest challenges. Is the grid just a complicated system at massive scale whose modeling is made difficult by human interaction, or is it a technological manifestation of society’s energy preferences?
Ultimately, as the 2003 blackout illustrates, it is both. The power grid is a collection of a large number of engineered devices, each of which has been programmed (like ants) to follow certain instructions, leading to highly coordinated collective behaviors. Those instructions, however, are the outcomes of social deliberations that range from the technocratic to the democratic.
So the grid is not just a massively complex infrastructure; it is one of the best examples we have of a truly integrated human-technological system.
This already-complex system is getting more complex by the day, as people decide to put solar panels on their rooftops, buy electric cars, or abandon the grid entirely. Are these preferences and decisions, which spread through communities much as diseases do, good or bad for the grid as a whole? The unfortunate answer at this point is that we don’t know until we know—these are all (new) emergent behaviors whose ultimate outcomes are in play.
A great deal of progress has been made in recent years in using large amounts of data from the power grid to detect problems and, hopefully, solve them in real time. While these are stupendous capabilities, fundamentally they grapple with yesterday’s grid. Our most urgent challenge is not to get better at describing what has happened, but to understand whether it is possible to harness the complexity of the future power grid so it behaves more like an orderly bee colony and a lot less like a circus unicyclist.
Seth Blumsack, a Santa Fe Institute external professor, is an associate professor in the Leone Family Department of Energy and Mineral Engineering and director of the program in Energy Business and Finance at Penn State University. His research addresses policy-relevant problems related to energy and environmental systems and technologies.