In part one of this series, I wrote about the research that led to the Framework for Analyzing Organizational Failure. Since I created the Framework back in 2005, I have seen it validated in a number of organizational failures. So, in 2010, I started work on expanding the paper into a book. During the course of my research, one of the leading thinkers in the field of failure analysis published a book updating many of the theories that I used in creating the Framework. The basic components of the Framework still hold, but his concept of drift has led me to envision a Framework 2.0.
Dr. Sidney Dekker has written many influential books on failure analysis and has held several international teaching positions. His latest book, Drift into Failure (2011), both reflects on his past research and explains how complexity theory has created a need for a new way of analyzing failure. His main argument is simple to understand: our organizations and technology have become increasingly complex, but our understanding of why things fail does not reflect that complexity (p. 7).
We are victims of a worldview in which we assume that people make rational choices, that every cause has a clear and direct effect, and that failures happen because of a “broken component” in the system and/or an irrational decision. In our hunt for the cause of failure, we look for the “bad actor” that broke the component in the system (Dekker, 2011, p. 3).
This worldview is dangerous because it blinds us to the complexity of organizations and technologies while leading us on a chase for someone to blame. Think of Enron, the BP Gulf Disaster, and the 2008 mortgage meltdown. The news was full of experts pointing their fingers at executives, brokers, buyers, and industry practices, all in a quest to find the bad actor who broke the part that led to the collapse of the entire system. Once we THINK we have found the bad actor or broken part, we believe we have fixed the problem. And then the next oil spill happens, another firm defrauds the public, or we face another financial crisis.
This is using hindsight for foresight, and that never works. In examining the BP Gulf Disaster and Enron, Dr. Dekker demonstrates that the decisions made locally by actors, given the knowledge they had at the time, were rational decisions. Yes, corners were cut, but the impacts were so small: how could they affect a system as large as BP, which has thousands of employees and oil wells? As I explained in part one, these decisions can lead to latent conditions that accumulate and erode the system to the point that it takes only one small accident, reverberating throughout the system, to trigger a chain of increasingly large failures.
This gradual acceptance of small deviations is called the normalization of deviance, and it was this very practice that led to the destruction of the Space Shuttle Challenger and the Space Shuttle Columbia. From the very first flight, there had been damage to the O-rings and damage from foam strikes. Even so, NASA would just continually increase its tolerance for the damage so that it could continue to fly the shuttles.
Normalization of deviance is just one symptom of drift. According to Dr. Dekker, drift has five features:
1) Uncertainties in the environment, scarcity of resources, and pressures to produce lead organizations to sacrifice some minor safety concerns. During the BP drilling that led to the disaster, oil rig workers would perform tasks to a “good enough” standard just so they could meet the tight production deadlines. Each of these shortcuts was very minor, but . . .
2) Drift occurs in small steps. A little shortcut here and a little shortcut there add up, making the system more vulnerable to accidents.
3) Despite their size and their large number of interacting components, complex systems are very sensitive to initial conditions. This is due to path dependency. Choosing a particular software platform gives me some advantages, but I am also locked in by the limitations of that platform. Thus, the radio systems chosen by the New York Police Department and the New York Fire Department had a profound effect on rescue operations during 9/11.
4) Unruly technology. Think of it this way: we know how to make aircraft that fly. But only one person has actually figured out how to make a medium-size airline profitable. Our technology is not limited to the mechanical and computational; it also includes the social. We cannot fully comprehend how our technologies interact with each other or the effects their interactions have.
5) Complex systems often capture the protective structure that is supposed to keep them from failing. Again, the BP Gulf Disaster provides a great example: the government agency designed to oversee offshore oil drilling was compromised by the lucrative practice of regulators becoming lobbyists for the very companies they were supposed to oversee.
Dr. Dekker closes his book with two warnings. One, complexity is inevitable, and thus we need to learn how to manage and prevent failure in complex systems. Two, our current worldview of bad actors breaking components is blinding us to the real underlying causes of failure in complex systems. In my final post in this series, I will outline a new theory on dealing with failure in complex systems.
Disclaimer: All opinions are mine and do not reflect the views or opinions of my employers or any organizations I belong to and should not be construed as such.
Dekker, S. (2011). Drift into failure: From hunting broken components to understanding complex systems. Burlington, VT: Ashgate Publishing Company.