June 20, 2018

In my initial post on brittleness I suggested that a typical process is:

Build something brittle.

Strengthen it over time.

In many engineering scenarios, a fuller description could be:

Design something that works in the base cases.

Anticipate edge cases and sources of error, and design for them too.

Implement the design.

Discover which edge cases and error sources you failed to consider.

Improve your product to handle them too.

Repeat as needed.

So it’s necessary to understand what is or isn’t likely to go wrong. Unfortunately, that need isn’t always met.

Murphy’s Law and exaggerated fears

We should always bear in mind Murphy’s Law, which in its simplest form states: Anything that can go wrong, will. But also remember that Murphy’s Law is a joke; and even if it were serious, nothing concise is ever precise.

People who tend to over-believe in Murphy’s Law include but are hardly limited to:

Bureaucrats.

Worried parents, especially of only children. (Later kids tend to have it easier, as their parents have more experience.)

Any buyer or voter you believe has been over-persuaded toward fear, uncertainty and doubt.

Relational bigots who view the Ted Codd guarantee as an absolute requirement for data management.

Adversaries

The strongest scenarios for Murphy’s Law should be adversarial ones, in which somebody is actively trying to cause problems. But even there it doesn’t always apply. For example:

Information security commonly fits the Murphy model. Hackers keep outwitting defenders.

Email spam, however, does not. It’s pretty much a solved problem; the few spam emails that still get through hardly matter.

Web search is somewhere in between. Both sides are partially successful in the combat over adversarial information retrieval, as “good” and “bad” sites alike are well-represented in search results.

Single-impetus failures

Since bad or scary things will happen — Murphy’s Law isn’t entirely wrong — a standard design practice is to avoid single points of failure. Brittleness has a lot to do with which single points of failure have been overlooked; improvement has a lot to do with belatedly cleaning them up. In adversarial scenarios, avoiding single points of failure relates closely to defense in depth.

Some of the nastiest surprises occur when failures have no obvious single point, yet wind up being possible from a single impetus.* This happens when multiple points or moments of failure are somehow correlated, or when they actually cascade. Examples vary widely, including:

The collapse of the World Trade Center buildings.

An authoritarian leader who manages to destroy a whole democratic system of government.

IT examples that are relatively big deals include:

Security breaches in which an attacker becomes able to fully impersonate a well-credentialed user.

Power outages or other whole-building breakdowns that bring down all parts of a (locally) redundant cluster.

Software bugs that bring down all parts of a supposedly redundant system at once.

Analytic failures that stem from misleading data sets. (Garbage in, garbage out.)
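To put rough numbers on why correlated failures are so nasty, here is a minimal sketch in Python. It rests on a deliberately simple model that is my assumption, not a measured one: each of n replicas fails on its own with probability p_node, and a shared event (say, a whole-building power outage) occurs with probability p_shared and takes every replica down at once. The function names and the sample figures are purely illustrative.

```python
def p_all_fail_independent(p_node: float, n: int) -> float:
    """Chance that all n replicas are down, if failures are independent."""
    return p_node ** n


def p_all_fail_common_cause(p_node: float, n: int, p_shared: float) -> float:
    """Chance that all n replicas are down under the assumed model:
    either the shared event happens (everything fails together),
    or it doesn't and the replicas all happen to fail independently."""
    return p_shared + (1.0 - p_shared) * p_node ** n


if __name__ == "__main__":
    # Illustrative (made-up) figures: 1% per-replica failure chance,
    # 3 replicas, 0.1% chance of a shared event.
    p_node, n, p_shared = 0.01, 3, 0.001
    independent = p_all_fail_independent(p_node, n)
    correlated = p_all_fail_common_cause(p_node, n, p_shared)
    print(f"independent only:  {independent:.9f}")
    print(f"with shared event: {correlated:.9f}")
    print(f"ratio:             {correlated / independent:.0f}x")
```

Under these made-up figures, the shared event dominates the total: adding a fourth or fifth replica shrinks the independent term further but leaves the p_shared term untouched. That is the arithmetic behind "supposedly redundant".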

*I chose the phrase “single impetus” rather than “single cause” because NOTHING has a truly single cause; things can only happen when all kinds of conditions are satisfied for them to succeed. But there can indeed be an identifiable force, plan or occurrence that sets a chain of events in motion, and that’s what I’m calling the “impetus”.

Related link

A lot of analytics turns out to be adversarial.
