Why tape and dedupe just don’t mix

In “Tape and dedupe: So not happening • The Register”, Chris Mellor asks:

Why haven’t more vendors followed CommVault in putting deduped data on tape? Is it technically too hard?

There’s a good reason for this – putting deduped data on tape is pretty nutty. Whether we wish it or not, tape is designed for large-scale, high-speed sequential access. Dedupe requires high-speed random access in order to rehydrate, because the chunks that make up a single file can end up scattered across the whole store. Some time ago I wrote a rebuttal to Curtis Preston’s overly generous appraisal of CommVault’s dedupe to tape strategy, and I still stand by every word I wrote there.

To be fair, Chris quotes someone who puts the argument very succinctly:

Steve Mackey, SpectraLogic’s sales veep for Europe and Africa, says: “The issue of dedupe is recovery. You’ve got to recover the whole tape or a set of tapes before you can recover a file. The big users of archive are looking to recover the data. Today I don’t believe dedupe on tape meets the requirements for recovery.”
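To put rough numbers on the recovery problem, here is a back-of-envelope sketch in Python. Every figure in it (throughput, seek time, chunk size, the fraction of chunks stored elsewhere on the tape) is an assumption picked for illustration, not a measurement of any real drive or of CommVault’s product, and the function names are hypothetical. It models the naive case where a restore seeks to each scattered chunk in turn.

```python
# Back-of-envelope model: restoring one file from tape, with and without dedupe.
# All figures below are illustrative assumptions, not measurements.

def sequential_restore_seconds(file_bytes, throughput_bps, initial_seek_s):
    """One seek to the start of the file, then a single streaming read."""
    return initial_seek_s + file_bytes / throughput_bps


def deduped_restore_seconds(file_bytes, chunk_bytes, scattered_fraction,
                            throughput_bps, avg_seek_s):
    """Naive rehydration: every chunk stored elsewhere costs one extra seek."""
    chunks = -(-file_bytes // chunk_bytes)        # ceiling division
    scattered = int(chunks * scattered_fraction)  # chunks not stored in line
    return avg_seek_s * (1 + scattered) + file_bytes / throughput_bps


GIB = 1024 ** 3
FILE_BYTES = 10 * GIB          # assumed size of the file being restored
THROUGHPUT = 150 * 1024 ** 2   # assumed streaming rate: 150 MiB/s
AVG_SEEK = 50.0                # assumed average tape seek, in seconds
CHUNK = 128 * 1024             # assumed dedupe chunk size: 128 KiB
SCATTERED = 0.10               # assumed share of chunks stored elsewhere

plain = sequential_restore_seconds(FILE_BYTES, THROUGHPUT, AVG_SEEK)
deduped = deduped_restore_seconds(FILE_BYTES, CHUNK, SCATTERED,
                                  THROUGHPUT, AVG_SEEK)

print(f"plain restore:   {plain / 60:.1f} minutes")
print(f"deduped restore: {deduped / 3600:.1f} hours")
```

Under these assumptions the streaming read is the same in both cases; the difference is entirely seek time, which grows with the number of scattered chunks. That is why the practical alternative is the one Mackey describes: read the whole tape, or set of tapes, back to disk first and rehydrate there.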

Dedupe to tape is crazy. Unless we can somehow overcome the sequential access nature of tape, it will stay crazy, too. That’s why tape and dedupe generally isn’t happening.

And I’m glad that’s the case.
