Sandpits in enterprise data protection

“Practice makes perfect” is not just a common saying; in three short words it sums up how humans learn to do things. You could read as many books as you like on how to play the clarinet, but until you pick one up, hold it and blow into the mouthpiece, you won’t have a chance of becoming proficient.

Many businesses have ‘sandpit’ areas for core business functions – SAP, Oracle, SQL, even sometimes SAN and NAS, but it’s still relatively rare to see businesses allocate resources to data protection, and particularly backup/recovery sandpits.

My question on this is – why?

The reason we see database teams in particular having sandpit areas is that the business functions they support are often mission-critical, or generate significant value to the business. If they make a mistake, or don’t get to test something before it goes into production, the results could be catastrophic for the business.

Is that really so different for a backup administrator, though? Let’s think of a few errors I’ve seen backup administrators make over the years:

  • Deleted all backups newer than 3 months, rather than older than 3 months
  • Recovered a Linux root filesystem over a Solaris root filesystem (I’ll confess: I did this. And it was the backup server’s root filesystem I did the recovery over.)
  • Configured notifications to ignore all warnings about open files, generating an unrecoverable backup
  • Deployed backup on new operating systems or applications without any testing whatsoever
  • Performed a large-step upgrade on Data Domain leaving NetWorker with incompatible libraries, forcing a rapid upgrade roll-out (e.g., in one day jumped from DDOS 5.4 to DDOS 6, without understanding the Boost library implications)
  • And more generally, assumed too much.
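To make the first error in that list concrete, here is a minimal sketch of how a single flipped comparison turns "delete everything older than 3 months" into "delete everything newer than 3 months", and how a dry run in a sandpit would show it before anything is removed. The function name, the catalogue layout and the 90-day window are all hypothetical; this is plain Python for illustration, not the interface of any particular backup product.

```python
from datetime import datetime, timedelta

def select_expired(backups, now, retention_days=90):
    """Return the names of backups OLDER than the retention window."""
    cutoff = now - timedelta(days=retention_days)
    # "Older" means the save time falls before the cutoff. Flipping this
    # comparison to ">" would select the newest backups instead.
    return [name for name, saved_at in backups if saved_at < cutoff]

# Hypothetical sandpit catalogue: (name, save time) pairs.
now = datetime(2024, 6, 1)
catalogue = [
    ("full-2024-01-05", datetime(2024, 1, 5)),
    ("full-2024-02-20", datetime(2024, 2, 20)),
    ("full-2024-05-15", datetime(2024, 5, 15)),
]

# Dry run: list what would be deleted, and delete nothing.
to_delete = select_expired(catalogue, now)
print(to_delete)
```

Reviewing that list against a handful of test backups whose ages you already know is exactly the kind of check a sandpit makes cheap.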

There’s a simple point I’m trying to make here: if the business believes that a database administrator, for instance, can cause a significant business impact by making an untested change to production, then what is the risk when someone makes an untested change to the backup environment, which touches so much of the business, and it goes wrong?

Arguments I’ve heard over the years on why we shouldn’t have sandpits include:

  • You should know our environment well enough to advise if changes will have an impact on us:
    • That’s not only presumptuous, it’s also side-stepping the issue.
  • Enterprise systems should not need sandpits:
    • So take the sandpits away from everyone else in the organisation then.
  • It’s only backup:
    • That’s like the Black Knight in Monty Python and the Holy Grail shouting, “It’s only a flesh wound!” Backup isn’t only backup – it’s critical IT insurance.
  • That’s what support services are for:
    • Do you drive without a seatbelt too, because that’s what paramedics are for?
  • We can’t afford the licensing:
    • You don’t have to back up the world in a sandpit. A 5 Socket DPS for Virtual Machines license or a 1TB DPS for Backup license would likely give you more than enough.
  • Hardware is expensive:
    • What’s the cost of a significant failure from not testing adequately before rolling into production? And unless we’re talking tape drives, we can go software-defined on almost everything.

I’m not trying to be flippant in the above, but I’ll be honest – after 22 or so years working in data protection, I’ve heard all the reasons not to provide a sandpit for the backup environment, and I’ve yet to be convinced by a single one. Backup is important, and needs to be treated as such.

The same applies to other data protection options that the business uses. If you make use of snapshots and replication, you should have some way of testing them; if you make use of CDP technology (e.g., RecoverPoint for Virtual Machines), you should have test protocols. And so on. It’s not rocket science. Or maybe it is – if you do it wrong, it can be pretty serious, so why not invest in helping to ensure it’s done right?

(And if someone tells you that you don’t need a sandpit, they’re being cavalier with your data, and encouraging you to take bad risks.)
