Martin Glassborow, aka @storagebod, and I had a bit of a discussion via Twitter, which came down to the following:
- Martin feels the default backup policy within an environment should be to backup nothing;
- I feel the default backup policy within an environment should be to backup everything.
Now the interesting thing is, we both actually meet in the middle, but just start from different points.
Martin has discussed his reasoning behind his default policy here, in “Don’t BackUp”, which I encourage you to read before continuing. There is, indeed, as Martin suggested in a tweet to me last night, a nice absolutism in either approach – don’t backup, or backup everything. Yet, neither is really the case.
My approach, which is to start with “backup everything”, rests on the following assumptions:
- Hardware can fail.
- Software can fail.
- Humans can make errors.
- Processes can fail.
By my very nature I think I’m perfectly suited to working in the backup space. I’ve always been into backup. On the Vic-20, when I was learning to program, I’d always save my programs onto two different tapes. On the Commodore 64, I’d always save my programs and documents onto two different disks. When I went to the PC, I’d always have a copy on a hard drive, and a copy on a floppy drive.
Martin’s approach is this:
Making it policy that nothing gets backed-up unless requested takes out all ambiguity. There can be no assumptions about what is being backed-up, it makes it someone’s responsibility as opposed to an assumed default.
There is, undoubtedly, logic in what Martin suggests, but it’s not a logical starting point I can personally reconcile myself with, for the fundamental reason that it (IMHO) assumes that everyone who interacts with the system understands the system and the nature of their interaction.
It in fact runs completely contrary to an axiom in user desktop/laptop backup approaches – if you leave backups up to the users, nothing will get backed up. That holds true for pretty much every business I’ve ever interacted with, from the most, to the least technical.
It’s for that reason, that lack of total systems awareness and data responsibility from all users of any environment, that my approach starts from the other end. Backup everything.
But I don’t really mean it. I abhor wastage. Recently, I’ve learnt that wastage comes in many forms, which is why the decision to move interstate and re-evaluate what I/we own has been cleansing. (See the article “deconstruction of falling stars” over at my personal blog for a bit more on that front.)
As I abhor wastage, I don’t actually believe you should backup everything within your environment. Sure, some vendors might like that notion – infinite tapes, disk, storage, snapshots, you name it. But it’s neither practical nor commercial reality to do this.
No, there is a middle ground. For me, the sweet spot is what I always come back to:
It is always better to backup a little more than you need, and waste some storage media, than it is to not backup quite enough, and be unable to recover.
So if your tape usage is say, 5-10% higher than it should be, or your VTL/B2D environment is 5-10% bigger than it really needs to be, I’m not concerned. (If it’s a crazy amount, like 100% more, then there’s a problem – a serious problem that has arisen from a lack of capacity planning, etc.)
I’ve seen IT sites where NetWorker agents have been deployed on every server within the environment, and when I’ve done a coverage analysis, I’ve seen servers that have this as the saveset:
/etc/hosts
Just that. Nothing more, nothing less. (You couldn’t get much less anyway.) I’ve equally seen sites where not only was a hot backup done of the production Oracle database via a module, but the database files were backed up as part of the filesystem backup, and then export/dumps were generated and backed up as well. Overkill? Yes. Were some backups unrecoverable? Yes.
Both are very clear examples of wastage, but I’ll tell you the difference.
The latter one – backing up too much – is time and money wastage. Neither is pleasant, both can hurt the bottom line of a company, yet that’s where it stops.
The former – backing up only what is explicitly requested, nothing more – is corporate wastage. There’s a little bit of monetary wastage involved (why spend the money on an agent to backup a single file?), but the real wastage is that it could waste the company. Unable to recover legally required files because someone forgot to request them to be backed up? Hello, lawsuit loss. Unable to recover financial data that proves your company has correctly paid its taxes because someone forgot to request it to be backed up? Hello, double tax payments. For me it triggers thoughts of every possible nightmare scenario a company might experience, right through to total dissolution and loss of the company itself.
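To make the idea of a coverage analysis a little more concrete, here is a minimal sketch in Python. It is not how NetWorker reports coverage; the function name, the sample filesystem list and the rule that a filesystem only counts as covered when it is named exactly as a saveset are all assumptions made for illustration.

```python
def coverage_gaps(mounted_filesystems, savesets):
    """Return the mounted filesystems that no configured saveset names.

    Assumption for this sketch: a filesystem is only "covered" when it
    appears exactly in the saveset list. A saveset such as /etc/hosts
    protects one file and therefore covers no filesystem at all.
    """
    configured = set(savesets)
    return sorted(fs for fs in mounted_filesystems if fs not in configured)


# Hypothetical host resembling the /etc/hosts example above.
host_filesystems = ["/", "/home", "/var", "/oradata1"]
host_savesets = ["/etc/hosts"]

gaps = coverage_gaps(host_filesystems, host_savesets)
for fs in gaps:
    print(f"not covered: {fs}")
```

On a host configured like the one described, every filesystem ends up in the gap list, which is exactly the kind of result that should prompt a conversation with whoever owns the server.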
In my book, I make the differentiation between what I call inclusive and exclusive backup products. I define:
- An inclusive backup product is one where you have to explicitly specify what gets backed up. By default, nothing is backed up unless you specify it.
- An exclusive backup product is one where you have to explicitly specify what doesn’t get backed up. By default, everything is selected and you have to winnow that selection down yourself.
The first, I consider to be the hallmark of a workgroup backup product approach. Cost reduction is the primary focus of this approach. The second, I consider to be a fundamental requirement for a product to earn the “enterprise backup product” badge of honour. Without this, there is a distinct lack of trust.
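The difference between the two product types defined above can be sketched in a few lines of Python. This is an illustration only: the function names, the sample paths and the use of shell-style patterns are assumptions, and no real backup product is being modelled.

```python
from fnmatch import fnmatchcase


def inclusive_selection(all_paths, include_patterns):
    """Inclusive product: nothing is backed up unless a pattern names it."""
    return [p for p in all_paths
            if any(fnmatchcase(p, pat) for pat in include_patterns)]


def exclusive_selection(all_paths, exclude_patterns):
    """Exclusive product: everything is backed up unless a pattern removes it."""
    return [p for p in all_paths
            if not any(fnmatchcase(p, pat) for pat in exclude_patterns)]


# Hypothetical contents of a server; nobody "requested" the report.
paths = [
    "/etc/hosts",
    "/home/alice/report.doc",
    "/oradata1/users01.dbf",
    "/tmp/scratch",
]

# Inclusive: only what was explicitly asked for is protected.
inclusive = inclusive_selection(paths, ["/etc/hosts"])

# Exclusive: start from everything, then strip out what is handled
# elsewhere (database files) or not worth keeping (scratch space).
exclusive = exclusive_selection(paths, ["/oradata*", "/tmp/*"])

print(inclusive)
print(exclusive)
```

The file nobody thought to request, /home/alice/report.doc, is protected under the exclusive model and silently missed under the inclusive one. Forgetting a pattern costs some storage in the first case and a recovery in the second.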
While I can understand Martin’s starting point, and that he moves more to the middle of making sure the right things are backed up, I can’t agree that this is the best approach.
I’ve seen, and heard of, too many IT war stories.
Sorry – not with you on this one. What’s backed up on a server should not be in the end-users’ hands, but rather in the sysadmin’s. The system administrator should, as part of the server install, provide his backup requirements. These requirements should arise naturally from recovery planning. Any sysadmin who says “back up everything” is either lazy or doesn’t understand his system.
Well, I actually agree with you on everything you’ve said there simply because that’s what my main argument was.
My take is that if you don’t have a starting point for determining what to backup, you start by talking about backing up “everything”. You should never however implement a “backup everything” approach – that would indeed imply laziness. But it’s a case of stripping out what you don’t need to backup, rather than adding in what you do need to backup.
And that’s why in NetWorker you cannot say ‘backup everything except /oradata*’ – as long as /oradata* are filesystems….