Outage vs Outrage

There’s only a minor difference between outage and outrage, at least from a spelling perspective. Just a single r.

I was reminded of this distinction while reading about the identified cause of a recent 4-hour outage across the entire Sydney Metropolitan train network:

Transport for NSW believes a failed network switch caused yesterday’s hour-long communications outage, compounded by the system’s failure to automatically switch to a backup network.

Failed switch caused Sydney Trains network outage – Richard Chirgwin, 9 March 2023, IT News.

Years ago, when I lived in NSW, I experienced a few of these major network outages. One of the most memorable was a 7-hour trip home during utter network chaos, a trip that would normally have taken me only 1.5 hours.

If there’s one thing I’ve noticed over the years, it’s that there really is a fine line between something being an outage and being an outrage. And it’s all there in the r difference. It’s robustness.

You see, robust systems are not ones that are immune to failure — they’re not some sort of unicorn, phantasmagorical thing or a wild and wacky fictional rare item like unobtainium or adamantium. (Though sometimes robust systems can seem that rare.)

Instead, robust systems are built from the ground up with a paranoid expectation of failure. The difference, though, is that they’re built to endure failure events, and they endure because their builders have anticipated as many failure situations as possible. These are failure-tolerant systems not because they’re failure-averse, but because every step along the way is built around the question, “what do we do if this bit encounters a failure?”
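To make that question a little more concrete, here is a minimal sketch in Python of what asking it at a single step might look like. It is purely illustrative and has nothing to do with how the Sydney Trains network is actually built: the names `call_with_failover`, `AllPathsFailed`, `primary_switch` and `backup_switch` are all hypothetical, and the "switches" are stand-in functions rather than real network calls.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


class AllPathsFailed(Exception):
    """Raised when every path, primary and backups alike, has failed."""


def call_with_failover(paths: Sequence[Callable[[], T]]) -> T:
    """Try each path in order and return the first successful result."""
    errors = []
    for path in paths:
        try:
            return path()
        except Exception as exc:  # a real system would catch narrower errors
            # Record the failure and move on to the next path.
            errors.append(exc)
    # Every path failed: say so loudly rather than failing silently.
    raise AllPathsFailed(errors)


def primary_switch() -> str:
    # Hypothetical primary path that is currently broken.
    raise ConnectionError("primary switch is down")


def backup_switch() -> str:
    # Hypothetical backup path that still works.
    return "message sent via backup"


result = call_with_failover([primary_switch, backup_switch])
print(result)
```

The point is less the code than the shape of it: the failure of the primary path is expected and handled, and even the "everything failed" case has a defined outcome instead of being left to chance.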

By and large, the more users a system services, the more it needs to be built with extreme robustness so that outages don’t become outrages. There will still be outages, but the goal should be for those outages to last seconds or minutes, not hours or days. Keeping an outage that short, particularly in a complex system, requires extreme robustness not just in the system itself, but also in its recovery processes.

The difference between outage and outrage is a good one to hold in mind while planning systems. You have to anticipate the former, but you need to make sure the former doesn’t become the latter. And the only way to do that is to apply a laser focus to the missing r: the robustness.
