Meme Monday

Let’s start the week off by thinking of a few data protection topics through the awesome window of memes, shall we?

You can’t protect data if you don’t know where it is

I’m a big fan of taking the time to do data discovery. Data gravity means a lot of a business’s data will clump together, but there’s always a variety of outlying data as well. In fact, as “edge” continues to garner attention, there’s going to be a lot more outlying data that we have to deal with.

But you can’t deal with data if you don’t know where it is. This is why every enterprise needs to have an IT architect dedicated to data protection: someone whose job is to find those chunks of data that aren’t just sitting all neat and tidy in a datacentre.

Dedupe so good you’d think aliens invented it

I find it pretty funny watching other vendors run around histrionically shrieking things like, “dedupe is dedupe is dedupe!”, or “dedupe isn’t important!”

Of course dedupe is important. It’s the exemplar of the mantra that has been at the heart of the IT department for the last decade: “do more, with less”.

And no: dedupe-isn’t-dedupe-isn’t-dedupe. You’ve got vendors out there that dedupe using such large segment sizes that you genuinely wonder why they even bother. Then you’ve got others who simply compress each incoming data stream and stick a “dedupe” badge on their product. Wow! 3:1 ‘dedupe’! I dunno, maybe they get an award for trying or something?
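To see why segment size matters, here is a toy sketch. It uses fixed-size segments and SHA-256 fingerprints purely for illustration; it is not how Data Domain or any other product actually works, and the "backups", offsets and segment sizes are all made up. Two 256 KiB backups differ by just four bytes, and the same data is deduplicated once with 4 KiB segments and once with 128 KiB segments.

```python
import hashlib
import random


def dedupe_ratio(streams, segment_size):
    """Logical bytes divided by the bytes of unique fixed-size segments."""
    seen = set()
    logical = stored = 0
    for data in streams:
        for start in range(0, len(data), segment_size):
            segment = data[start:start + segment_size]
            logical += len(segment)
            fingerprint = hashlib.sha256(segment).digest()
            if fingerprint not in seen:
                # First time this segment has been seen: it must be stored.
                seen.add(fingerprint)
                stored += len(segment)
    return logical / stored


# Hypothetical data: Monday's backup is 256 KiB of pseudo-random bytes;
# Tuesday's is the same data with four single-byte changes.
rng = random.Random(0)
monday = bytes(rng.getrandbits(8) for _ in range(256 * 1024))
changed = bytearray(monday)
for offset in (10_000, 80_000, 150_000, 220_000):
    changed[offset] ^= 0xFF  # flipping every bit guarantees a real change
tuesday = bytes(changed)

small_segments = dedupe_ratio([monday, tuesday], 4 * 1024)
large_segments = dedupe_ratio([monday, tuesday], 128 * 1024)

print(f"4 KiB segments:   {small_segments:.2f}:1")
print(f"128 KiB segments: {large_segments:.2f}:1")
```

With 4 KiB segments only the four touched segments get stored a second time. With 128 KiB segments the four changes land in both halves of the backup, so every segment looks new and nothing is saved at all.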

I’ve had conversations where the reaction to Data Domain deduplication, on the other hand, is basically, “Witchcraft!” Well it’s not witchcraft, and it’s not aliens – it’s just some very intelligent algorithms that are designed from the ground up to squeeze as much as possible out of your data.

Glacier storage is only cheap if you don’t use it

Here’s my honest tip when it comes to data protection: Glacier (and anything of its ilk) is not designed for data protection. Sure, it’s cheap (to start with) to stuff data into it, but it’s sure as hell not cheap (or quick) to get data out. What’s more, the access process for Glacier-style cloud storage is completely at odds with deduplication, so if you are pushing data into there, you’ll just be compressing it. You know, like a tape with compression turned on…

So you can store 500TB in Glacier at 2:1 compression, or 500TB in S3-IA at 30:1 deduplication. What’s cheaper, month on month: 250TB of Glacier or 17TB of S3-IA? If I look at the AWS storage calculator, 17TB of S3-IA in US-East is $217, while 250TB of Glacier in the same location is $1,126. And that’s before you start considering any data retrieval time or cost. Now start adding another 500TB of front-end data every month, each retained for 7 years.
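If you want to play with those numbers, here is a rough sketch of the arithmetic. The per-TB rates are back-calculated from the two totals quoted above rather than taken from a current AWS price list, and the growth model (another 500TB every month, nothing expiring before month 84, reduction ratios held flat) is a simplifying assumption. Retrieval charges are left out entirely.

```python
# Figures quoted in the post. The per-TB rates are back-calculated from the
# quoted totals (an assumption), not read from a current AWS price list.
FRONT_END_TB = 500
GLACIER_REDUCTION = 2            # 2:1 compression
S3_IA_REDUCTION = 30             # 30:1 deduplication
GLACIER_USD_PER_TB = 1126 / 250  # implied by "$1,126 for 250TB"
S3_IA_USD_PER_TB = 217 / 17      # implied by "$217 for 17TB"
RETENTION_MONTHS = 7 * 12


def monthly_cost(front_end_tb, reduction, usd_per_tb):
    """Monthly storage bill for front_end_tb of data after reduction."""
    return front_end_tb / reduction * usd_per_tb


glacier_month_1 = monthly_cost(FRONT_END_TB, GLACIER_REDUCTION, GLACIER_USD_PER_TB)
s3_ia_month_1 = monthly_cost(FRONT_END_TB, S3_IA_REDUCTION, S3_IA_USD_PER_TB)

# Hypothetical growth model: by month 84 there are 84 x 500TB of front-end
# data under retention. Reduction ratios are held flat for simplicity.
retained_tb = FRONT_END_TB * RETENTION_MONTHS
glacier_month_84 = monthly_cost(retained_tb, GLACIER_REDUCTION, GLACIER_USD_PER_TB)
s3_ia_month_84 = monthly_cost(retained_tb, S3_IA_REDUCTION, S3_IA_USD_PER_TB)

print(f"Month 1:  Glacier ${glacier_month_1:,.0f} vs S3-IA ${s3_ia_month_1:,.0f}")
print(f"Month 84: Glacier ${glacier_month_84:,.0f} vs S3-IA ${s3_ia_month_84:,.0f}")
```

The S3-IA figure for month 1 comes out a few dollars under $217 because the sketch uses 500/30 (about 16.7TB) rather than the rounded 17TB. Either way, the Glacier bill stays a little over five times the S3-IA bill under these assumptions.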

Glacier. Talk about a way to become a data hostage.

Doing dump and sweep and running out of storage?

Dump and sweep. It sounds good in theory until you get to the point where for every 1TB of database storage, you’re consuming another 3-5TB in primary storage to hold the dumps. And then, if the DBAs ‘save space’ by compressing those dumps, you get the added benefit of blowing out your data protection storage, because pre-compressed dumps neither deduplicate well nor compress any further on tape.

Dump and sweep had its place, but there are better ways to let database administrators retain control of their backup processes while still achieving good storage efficiency and the sort of compliance oversight we look for in data protection. If you’re not convinced, try out the Data Domain Boost plugins, with DBAs writing their backups directly to market-leading deduplication storage:

  • Tools integrated into their own products – check
  • High-speed backup – check
  • High-speed recovery – check
  • Freeing up primary storage – check
  • Maximising deduplication efficiency – check
  • Reducing network bandwidth requirements – check
  • Minimising data protection storage usage – check

I mean, what isn’t to love about throwing “dump and sweep” out?
