In the past I’ve talked about the importance of having zero error policies.
In “What is a zero error policy?”, I said:
Having a zero error policy requires the following three rules:
1. All errors shall be known.
2. All errors shall be resolved.
3. No error shall be allowed to continue to occur indefinitely.
If you’ve not read that article, I suggest you do, along with the follow-up article, “Zero error policy management”.
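To make the three rules concrete, here is a minimal sketch of how they might be checked against a list of recorded backup errors. Everything in it is an assumption for illustration: the `BackupError` record, its fields, the `policy_violations` function, and especially the seven-day `max_age` threshold, which stands in for "indefinitely" and is not a figure taken from the earlier articles.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class BackupError:
    """Hypothetical record of one error seen in a backup run."""
    client: str
    message: str
    first_seen: datetime
    acknowledged: bool = False               # rule 1: the error is known
    resolved_at: Optional[datetime] = None   # rule 2: the error is resolved


def policy_violations(errors, now, max_age=timedelta(days=7)):
    """Return (rule_number, error) pairs for each breach of the three rules.

    max_age is an assumed cut-off for how long an error may stay open
    before it counts as "continuing indefinitely" (rule 3).
    """
    violations = []
    for err in errors:
        if not err.acknowledged:
            violations.append((1, err))      # nobody knows about it
        if err.resolved_at is None:
            violations.append((2, err))      # still unresolved
            if now - err.first_seen > max_age:
                violations.append((3, err))  # open for too long
    return violations


# Illustrative data only; the clients and messages are made up.
now = datetime(2024, 1, 15)
errors = [
    BackupError("db01", "tape write failure", datetime(2024, 1, 1)),
    BackupError("web02", "client unreachable", datetime(2024, 1, 14),
                acknowledged=True),
    BackupError("fs03", "skipped open file", datetime(2024, 1, 10),
                acknowledged=True, resolved_at=datetime(2024, 1, 11)),
]

for rule, err in policy_violations(errors, now):
    print(f"rule {rule} breached: {err.client}: {err.message}")
```

In this sample, `db01` breaches all three rules, `web02` breaches only rule 2 (known but not yet resolved), and `fs03` breaches none. The point of the sketch is simply that a zero error policy holds only when this list is empty.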
I’m going to make, and stand by with fervid determination, the following assertion:
If you do not have a zero error policy for your backups, you do not have a valid backup system.
No ifs, no buts, no maybes, no exceptions.
Why? Because across all the sites I’ve seen, regardless of size or complexity, the only ones that actually work properly are those where every error is captured, identified, and dealt with. Those are the only sites I would point at and say, “They have every chance of meeting their SLAs”.
In my book, I introduce the notion that just deploying software and thinking you have a backup system is like making a sacrifice to a volcano. So, without a zero error policy, what does a network diagram of your IT environment look like?
It looks like this: