The number of words that will be written about the CrowdStrike incident in the coming weeks and months will undoubtedly run into the millions (and that’s before we get to all the content that will be stolen and shamelessly reposted by the usual content thieves).
I don’t want to talk about the CrowdStrike incident itself, but it has made me reflect a little on disaster preparedness as a concept, and I want to suggest a simple yet important differentiator that is too often missed. (I also want to say: I’m not suggesting this is the problem encountered by the businesses affected by this specific incident.)
If your business continuity/disaster recovery plans assume you have systems access, they’re incomplete.
I’ll say that again:
If your business continuity/disaster recovery plans assume you have systems access, they’re incomplete.
Now, that’s not to say you can’t have a variety of business continuity/disaster recovery (BC/DR) plans that do assume some level of systems access – that’s what dependency mapping is for, after all. But there is a kernel of BC/DR planning that has to assume a need to bootstrap your environment. And that means assuming that everything in your environment is lost or inaccessible. Everything.
- DNS.
- Identity/authentication servers.
- Password management systems.
- Log servers.
- SIEM systems.
- License servers.
- Jump hosts.
- Backup servers.
Not just all the primary production systems, but every IT support function you run. And you must have plans for “how shall I bootstrap the fundamentals of my environment?” if you’re going to have a complete BC/DR plan.
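To make the dependency-mapping idea concrete, here’s a minimal sketch – with entirely invented service names and dependency edges – of how you might record which bootstrap services depend on which, and derive a bring-up order from it:

```python
# Toy dependency map for bootstrap planning. Service names and their
# dependencies are hypothetical; the real list comes from your own
# dependency mapping exercise.
from graphlib import TopologicalSorter

# Each service maps to the set of services that must be up before it can be recovered.
bootstrap_dependencies = {
    "dns": set(),                          # nothing else resolves without it
    "identity": {"dns"},                   # auth servers need name resolution
    "password_vault": {"dns", "identity"},
    "license_server": {"dns"},
    "jump_hosts": {"dns", "identity"},
    "backup_server": {"dns"},              # deliberately NOT dependent on identity
    "log_server": {"dns"},
    "siem": {"log_server"},
}

# A valid bring-up order: every service appears after everything it depends on.
order = list(TopologicalSorter(bootstrap_dependencies).static_order())
print("Bootstrap order:", " -> ".join(order))
```

The point isn’t the tooling; it’s that the ordering question – what has to exist before anything else can be recovered? – gets answered on paper, before the disaster.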
Back in the 90s, we had a fireproof safe near our datacentre. And in that safe we kept printouts – yes, printouts – of all system configuration information, reprinted every month or so to keep the stored copies current. Emergency password details were in there, as were printouts of all DNS entries. Everything you might possibly need to bootstrap an environment from scratch was there, on paper, immutably separate from the IT systems that ran the environment.
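If you wanted to recreate that fireproof-safe ritual today, it needn’t be fancy. As a purely illustrative sketch (all file paths below are made-up placeholders), something as simple as a scheduled script that bundles your configuration exports into one dated, printable file would do the job:

```python
# Illustrative sketch: bundle configuration exports and DNS zone data into a
# single dated text file suitable for printing and storing offline.
# All paths here are hypothetical placeholders.
from datetime import date
from pathlib import Path

SOURCES = [
    Path("/etc/bind/zones/example.com.zone"),      # a DNS zone export
    Path("/srv/recovery/emergency-passwords.txt"),
    Path("/srv/recovery/network-config-dump.txt"),
]

bundle = Path(f"/srv/recovery/print-me-{date.today().isoformat()}.txt")

with bundle.open("w") as out:
    out.write(f"Bootstrap recovery bundle - generated {date.today()}\n")
    for src in SOURCES:
        out.write(f"\n===== {src} =====\n")
        if src.exists():
            out.write(src.read_text())
        else:
            out.write("!! MISSING - chase this up before the next print run !!\n")

print(f"Wrote {bundle} - print it and put it in the fireproof safe.")
```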
Undoubtedly such descriptions would give many modern-day security experts the heebie-jeebies, but I’m going to suggest something perhaps slightly controversial here: there comes a point where best-practice security for operational purposes – even conventional recovery purposes – can become an impediment to BC/DR.
I want to stress here that I’m not advocating for the abandonment of sensible security practices just because you want to make BC/DR easier. I am however suggesting that sensible security practices for bootstrap BC/DR are necessarily different from sensible security practices for regular operations and regular recovery processes.
There’s a certain “look and laugh” in security circles about password books, yet any Gen-X child who has had to remote-walk a technology-averse Boomer parent through password management will know the potential benefit of such options. (“Dad, go grab the password book out from behind the utensils drawer in the kitchen” is a hell of a lot easier instructional conversation to navigate than explaining the ins and outs of Keychain Access, for instance.)
Bootstrap BC/DR requires an adaptation of the regular approach to information systems security – instead of asking, “how do we do these things within the bounds of our security constraints?”, we must instead ask, “how can we best secure these things that we must do?” There’s a subtle yet important difference: the former assumes the security requirements are inviolable; the latter still treats security as important, but makes enablement of recovery the primary process, with security assisting within the bounds of the recovery requirements. (One might say that this shifts security’s primary focus from electronic considerations to physical ones.)
That said, there are ways to properly secure your bootstrap BC/DR. Cyber Vaults, for instance, can enable it – indeed, a key part of the preparation for a Cyber Vault is to focus on the bootstrap BC/DR elements that need to be protected. Beyond that, immutability in your backup and recovery systems is critical, of course – but everything you plan has to start with “what if we don’t have access?” For example, your backups might be secure, but if you have tightly integrated authentication and your authentication systems are down, what’s your “break glass in case of emergency” strategy?
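As one illustrative example of that “break glass” question (the paths, file names, and 90-day refresh window below are assumptions, not recommendations), a periodic check that your out-of-band credential material actually exists – and hasn’t quietly gone stale – is exactly the kind of thing a bootstrap BC/DR plan should include:

```python
# Illustrative "break glass" readiness check: confirm the out-of-band
# credential material exists and has been refreshed recently, without
# touching the (possibly unavailable) central identity provider.
# The paths and the 90-day threshold are assumptions for this sketch.
from datetime import datetime, timedelta, timezone
from pathlib import Path

BREAK_GLASS_ITEMS = [
    Path("/srv/recovery/break-glass-root-credentials.gpg"),  # hypothetical encrypted bundle
    Path("/srv/recovery/backup-admin-ssh-key"),
]
MAX_AGE = timedelta(days=90)

now = datetime.now(timezone.utc)
for item in BREAK_GLASS_ITEMS:
    if not item.exists():
        print(f"FAIL: {item} is missing")
        continue
    age = now - datetime.fromtimestamp(item.stat().st_mtime, tz=timezone.utc)
    status = "OK" if age <= MAX_AGE else "STALE"
    print(f"{status}: {item} last refreshed {age.days} days ago")
```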
The only way you can build a resilient bootstrap BC/DR process is to make sure that the zeroth assumption in the process is, “All conventional systems access methods are unavailable”.
And that’s the core of disaster preparedness.
[Edit: An earlier version of the article repeatedly typo’d CrowdStrike as CloudStrike. Muscle memory and touch typing can be a great combo for repeating errors. Thanks for the quick spot, Boris.]