Jan 24 2017

In 2013 I undertook the endeavour to revisit some of the topics from my first book, “Enterprise Systems Backup and Recovery: A Corporate Insurance Policy”, and expand it based on the changes that had happened in the industry since the publication of the original in 2008.

A lot had happened since that time. At the point I was writing my first book, deduplication was an emerging trend, but tape was still entrenched in the datacentre. While backup to disk was an increasingly common scenario, it was (for the most part) used as a staging activity (“disk to disk to tape”), and the disk targets were either dumb filesystems or Virtual Tape Libraries (VTLs).

The Cloud, seemingly ubiquitous now, was still emerging. Many (myself included) struggled to see how the Cloud was any different from outsourcing with a bit of someone else’s hardware thrown in. Now, the core tenets of Cloud computing that made it so popular (e.g., agility and scalability) have been well and truly adopted as essential characteristics of the modern datacentre, too. Indeed, to compete against Cloud, on-premises IT has increasingly focused on delivering a private-Cloud or hybrid-Cloud experience to the business.

When I started as a Unix System Administrator in 1996, at least in Australia, SANs were relatively new. In fact, I remember around 1998 or 1999 having a couple of sales executives from this company called EMC come in to talk about their Symmetrix arrays. At the time the datacentre I worked in was mostly DAS with a little JBOD and just the start of very, very basic SANs.

When I was writing my first book the pinnacle of storage performance was the 15,000 RPM drive, and flash storage was something you primarily used in digital cameras, with capacities measured in hundreds of megabytes rather than gigabytes (or now, terabytes).

When the first book was published, x86 virtualisation was well and truly growing into the datacentre, but traditional Unix platforms were still heavily used. Their decline and fall started when Oracle acquired Sun and killed low-cost Unix, with Linux and Windows gaining the ascendancy – virtualisation being a significant driving force, adding an economy of scale that couldn’t be found in the old model. (Ironically, it had been found in an even older model – the mainframe. Guess what, folks: the mainframe won.)

When the first book was published, we were still thinking of silo-like infrastructure within IT: networking, compute, storage, security and data protection all as separate functions – separately administered functions. But business, having spent a decade or two hammering into IT the need for governance and process, became hamstrung by IT governance and process, and needed things done faster, cheaper and more efficiently. Cloud was one approach; hyperconvergence was another: switch to a more commodity, unit-based approach, using software to virtualise and automate everything.

Where are we now?

Cloud. Virtualisation. Big Data. Converged and hyperconverged systems. Automation everywhere (guess what? Unix system administrators won, too). The need to drive costs down – IT is no longer allowed to be a sunk cost for the business, but has to deliver innovation and, for many businesses, profit too. Flash systems now offer significantly more IOPS than a traditional array could – Dell EMC, for instance, can drop a 5RU system into your datacentre capable of delivering 10,000,000+ IOPS. To achieve ten million IOPS on a traditional spinning-disk array you’d need … I don’t even want to think about how many disks, rack units, racks and kilowatts of power you’d need.
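If you do want to put a rough number on it, here’s a back-of-envelope sketch. The ~200 IOPS per 15,000 RPM drive, the 24-drive 2RU shelf and the 42RU rack are my own assumptions for illustration, not vendor figures:

```python
import math

# Back-of-envelope only. Assumptions (mine, not from a datasheet): ~200 IOPS per
# 15,000 RPM drive, 24 drives per 2RU shelf, 42RU racks.
TARGET_IOPS = 10_000_000
IOPS_PER_15K_DRIVE = 200
DRIVES_PER_SHELF = 24
RU_PER_SHELF = 2
RU_PER_RACK = 42

drives = math.ceil(TARGET_IOPS / IOPS_PER_15K_DRIVE)   # 50,000 drives
shelves = math.ceil(drives / DRIVES_PER_SHELF)         # 2,084 shelves
rack_units = shelves * RU_PER_SHELF                    # 4,168 RU
racks = math.ceil(rack_units / RU_PER_RACK)            # ~100 full racks

print(f"{drives:,} drives across {racks:,} racks ({rack_units:,} RU)")
```

Call it fifty thousand spindles and roughly a hundred racks – before you even start counting power and cooling.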

The old model of backup and recovery can’t cut it in the modern environment.

The old model of backup and recovery is dead. Sort of. It’s dead as a standalone topic. When we plan or think about data protection any more, we don’t have the luxury of thinking of backup and recovery alone. We need holistic data protection strategies and a whole-of-infrastructure approach to achieving data continuity.

And that, my friends, is where Data Protection: Ensuring Data Availability is born from. It’s not just backup and recovery any more. It’s not just replication and snapshots, or continuous data protection. It’s all the technology married with business awareness, data lifecycle management and the recognition that Professor Moody in Harry Potter was right, too: “constant vigilance!”

Data Protection: Ensuring Data Availability

This isn’t a book about just backup and recovery, because that alone is not enough any more. You need other data protection functions deployed holistically, with a business focus and an eye on data management, in order to truly have an effective data protection strategy for your business.

To give you an idea of the topics I’m covering in this book, here’s the chapter list:

  1. Introduction
  2. Contextualizing Data Protection
  3. Data Lifecycle
  4. Elements of a Protection System
  5. IT Governance and Data Protection
  6. Monitoring and Reporting
  7. Business Continuity
  8. Data Discovery
  9. Continuous Availability and Replication
  10. Snapshots
  11. Backup and Recovery
  12. The Cloud
  13. Deduplication
  14. Protecting Virtual Infrastructure
  15. Big Data
  16. Data Storage Protection
  17. Tape
  18. Converged Infrastructure
  19. Data Protection Service Catalogues
  20. Holistic Data Protection Strategies
  21. Data Recovery
  22. Choosing Protection Infrastructure
  23. The Impact of Flash on Data Protection
  24. In Closing

There’s a lot there – you’ll see the first eight chapters are not about technology, and for good reason: you must have a grasp on the other bits before you can start considering everything else. Otherwise you’re just deploying point solutions, and eventually point solutions will cost you more in time, money and risk than they give you in return.

I’m pleased to say that Data Protection: Ensuring Data Availability is released next month. You can find out more and order direct from the publisher, CRC Press, or order from Amazon, too. I hope you find it enjoyable.

Sep 23 2015

The LTO consortium has announced:

That the LTO Ultrium format generation 7 specifications are now available for licensing by storage mechanism and media manufacturers.

LTO-7 will feature tape capacities of up to 15TB (compressed) and streaming speeds of up to 750MB/s (compressed). LTO is now working off a 2.5:1 compression ratio – so those numbers are (uncompressed) 6TB and 300MB/s.

Don’t get me wrong – I’m not going to launch into a tape is dead article here. Clearly it’s not dead.

That rocket car is impressive. It hit 1,033 km/h – Mach 9.4* – over a 16 km track. There’s no denying it’s fast. There’s also no denying that you couldn’t just grab it and use it to commute to work. And if you could commute to work using it but there happened to be a small pebble on the track, what would happen?

I do look at LTO increasingly and find myself asking … how relevant is it for average businesses? It’s fast and it has high capacity – and this is increasing with the LTO-7 format. Like the rocket car above though, it’s impressive as long as you only want to go in one direction and you don’t hit any bumps.

Back when Tape Was King, a new tape format meant a general rush on storage refresh towards the new tape technology in order to get optimum speed and capacity for a hungry backup environment. And backup environments are still hungry for capacity and speed, but they’re also hungry for flexibility, something that’s not as well provided by tape. Except in very particular conditions, tape is no longer seen as the optimum first landing zone for backup data – and increasingly, it’s not being seen as the ideal secondary landing zone either. More and more businesses are designing backup strategies around minimising the amount of tape they use in their environment. It’s not in any way unusual now to see backup processes designed to keep at least all of the normal daily/weekly cycles on disk (particularly if it’s deduplication storage) and push only the long-term retention backups out to tape. (Tape is even being edged out there for many businesses, but I’ll leave that as a topic for another time.)

Much of the evolution we’ve seen in backup and recovery functionality has come from developing features around high speed random access of backups. Deduplication, highly granular recoveries, mounting from the backup system and even powering on virtual machines from backup storage all require one thing in common: disk. As we’ve come to expect that functionality in data protection products, the utility of tape for most organisations has likewise decreased significantly. Recoverability and even access-without-recovery has become a far more crucial consideration in a data protection environment than the amount of data you can fit onto a cartridge.

I have no doubt LTO-7 will win high-praise from many. But like the high speed rocket car video above, I don’t think it’s your “daily commute” data protection vehicle. It clearly has purpose, it clearly has backing, and it clearly has utility. As long as you need to go in a very straight line, don’t make any changes in direction and don’t attempt to change your speed too much.

As always, plan your data protection environment around the entire end-to-end data protection process, and the utility of that protected data.


* Oops, Mach 0.94. Thanks, Tony. (That’ll teach me to blindly copy the details from the original video description.)

Jun 11 2015

I’m back in a position where I’m having a lot of conversations with customers who are looking at infrastructure change.

What I’m finding remarkable in every single one of these conversations is the pervasiveness of Cloud considerations in data protection. I’m not just talking Spanning for your SaaS systems (though that gets people sitting up and taking notice every time), I’m talking about businesses that are looking towards Cloud to deal with something that has been a feature of datacentres for decades: tape.

I’ve mentioned CloudBoost before, and I personally think this is an exciting topic.


An absolute ‘classic’ model now with NetWorker is to have it coupled with Data Domain systems, with backups duplicated between the Data Domain systems and tape removed – at least for the daily and weekly backup cycle. Data Domain Extended Retention is getting a lot of interest in companies, but without a doubt there are still some people who look at a transition to deduplication as a phased approach: start with short-term backups going to deduplication, and keep those legacy tape libraries around for handling tape-out for monthly backups.

That certainly has appeal for businesses that want to stretch their tape investment out for the longest possible time, especially if they have long-term backups already sitting on tape.

But every time I talk to a company about deploying Data Domain for their backups, before I get to talk about CloudBoost and other functions, I’m getting asked: hey, can we, say, look towards moving our long-term backups to Cloud instead of tape at a later date?

You certainly can – CloudBoost is here now, and whether you’re ready to start shifting longer-term compliance style backups out to Cloud now, or in a year or two years time, it’ll be there waiting for you.

Over time (as the surveys have shown), backup to disk has increased in NetWorker environments to over 90% use. The basic assumption for years had been disk will kill tape. People say it every year. What I’m now seeing is Cloud could very well be the enabler for that final death of tape. Many businesses are entirely killing tape already thanks to deduplication: I know of many who are literally pulling their tapes back from their offsite storage vendor and ingesting them back into their Data Domains – particularly those with Extended Retention. But some businesses don’t want to keep all those long-term backups on local disk, deduplicated or not – they want a different economic scale, and they see Cloud as delivering that economy of scale.

I find it fascinating that I’m being so regularly asked by people: can we ditch tape and go to Cloud? That to me is a seismic shift on the remaining users of tape.

[Side note: Sure, you’ll find related links from me below about tape. Just because I wrote something 1-3 years ago I can’t change my opinion 🙂 ]

Apr 22 2013

Tape capacity.

It should be straightforward, but many people tend to look at only the biggest, most optimistic numbers, and assume the best. In backup this is a mistake, and one I’ve seen consistently made the entire time I’ve been working in the industry.

These days, when it comes to tape, LTO is the king of the hill, so consider the following table outlining native (i.e., uncompressed) format capacities:

[Table: LTO capacity per generation, uncompressed (native)]

This gives a baseline figure of tape capacity. But, LTO is capable of in-drive compression, which means vendors don’t quote uncompressed capacity, they quote compressed capacity (and for that matter, speed).

So the tables we see look more like the following:

[Table: LTO capacity per generation, compressed (marketed 2:1 ratio)]

See the much bigger capacity sizes? See the tiny footnote? That’s called blind optimism. In fact, it’s getting worse, not better, since LTO-6 saw a change in the compression algorithm used. That means vendors are now quoting a compression ratio of 2.5:1 – so that LTO-6 entry should in fact read 6,250 GB, not 5,000 GB.

Unfortunately, a lot of people don’t look at the footnotes – or if they do, they plan, with wild optimism, to hit that 2:1 compression ratio every time. And when they don’t it’s met with consternation and surprise.

The truth is, compression ratios are a bit of a joke on tape, since they’re based entirely on what sort of data you’re backing up. I have, on multiple occasions, seen 1TB written to LTO-1 and LTO-2 tapes. Not once – not once – have I worked under the impression that I should get that compression ratio all the time. In case you’re wondering, those backups came from very large databases (at the time) that had been preallocated – so we were literally backing up very large, very empty, and thus very compressible files.

Whenever I’m sizing for a customer, I use one compression ratio, and one compression ratio only, regardless of the tape being used, and that’s 1.3:1. Sometimes it’s pessimistic, and sometimes it’s optimistic, but on the whole, it’s usually fairly accurate. So on an LTO-5 tape, I’m assuming that in all likelihood, we should see a compressed used capacity of around 1,950GB. Anything after that? Cream on top. Anything below that? Heavy data.
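If you want to fold that planning ratio into a quick sizing calculation, here’s a minimal sketch. The 1,500 GB LTO-5 native capacity is implied by the ~1,950 GB figure above; the 20 TB example workload is purely hypothetical:

```python
import math

LTO5_NATIVE_GB = 1_500        # native LTO-5 capacity, consistent with ~1,950 GB at 1.3:1
PLANNING_RATIO = 1.3          # conservative planning ratio, not the marketed 2:1 or 2.5:1

def tapes_required(backup_size_gb: float,
                   native_gb: float = LTO5_NATIVE_GB,
                   ratio: float = PLANNING_RATIO) -> int:
    """Estimate the cartridges needed for a backup of the given size."""
    effective_capacity = native_gb * ratio      # ~1,950 GB per LTO-5 cartridge
    return math.ceil(backup_size_gb / effective_capacity)

# A hypothetical 20 TB monthly full:
print(tapes_required(20_000))   # -> 11 cartridges at 1.3:1 (vs. 7 if you trusted 2:1)
```

Plan on the eleven, and treat anything better as a bonus.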

Don’t let yourself be fooled by tape compression ratios: they’re there for marketing purposes, not to be reflective of reality.

FUDtasia

Jun 25 2012

I’m in a training course for a disk-only backup product, and in the introductory material, there’s a distinct message: tape is evil; tape is slow; tape is impossible to use; tape is unreliable. You’d be justified in wondering whether tapes should come with warnings, such as the following:

[Image: tape cartridge labelled with a skull and crossbones]

The horror! Tape = Bad. Disk = Good. Bad tape, Bad Bad Bad.

That’s the message, anyway.

The reality is far from the pseudo-science hype that marketing from disk-only backup products would have you believe. At times, I’ve seen more coherent arguments out of anti-vaccination campaigners than I have about the perils of tape. And the anti-vaccination rabble are raving lunatics.

The FUDtasia about tape could be summarised as follows:

  1. Tape is slow to recover from.
  2. Tape is unreliable.
  3. Tape is costly to duplicate.

To all three I say one thing: pah! Let’s look at each of them.

Tape is slow to recover from

This may be true for non-enterprise backup products, but the rationale espoused was unbelievable when you compare it against NetWorker. “Imagine if you have a 2TB filesystem backup, and you want to recover one single file. You might have to read through the ENTIRE 2TB to find your file.”

What. Utter. Rubbish.

Enterprise backup products – and in particular in this case, NetWorker – maintain comprehensive index information on the content of tapes. This allows them to use high-speed seek operations against tapes to very quickly position to the required area on the tape for recovery. OK, it’s not as instantaneous as initiating a read from a disk-based backup, but it’s hardly akin to reading through a 2TB backup to find a single 16KB file.
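To make that concrete, here’s a deliberately simplified illustration of the idea – this is not NetWorker’s actual index format, just a sketch of why a file-to-position index turns recovery into a locate-and-read rather than a full scan:

```python
# Simplified illustration only (not NetWorker's on-disk index format): each file
# maps to a volume, a tape file mark and a record offset, so the drive can locate
# directly instead of scanning through the whole saveset.
from dataclasses import dataclass

@dataclass
class IndexEntry:
    volume: str        # cartridge barcode holding the backup
    file_mark: int     # tape file mark to space to (a high-speed locate operation)
    record: int        # record offset within that tape file

# Hypothetical index entries for one client's saveset
index = {
    "/data/reports/q3.xlsx": IndexEntry("A00017L5", file_mark=42, record=1187),
    "/data/reports/q4.xlsx": IndexEntry("A00017L5", file_mark=42, record=2291),
}

def recover(path: str) -> None:
    entry = index[path]                       # one lookup - no scanning required
    print(f"load {entry.volume}, locate to file mark {entry.file_mark}, "
          f"read from record {entry.record}")

recover("/data/reports/q3.xlsx")
```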

Tape is unreliable

Yeah yeah yeah, blah blah blah. Tape is so terribly unreliable. That’s why it’s only been used for decades for backup, and all tape manufacturers are constantly fighting off class action lawsuits because their product is unfit for advertised use.

No. Wait.

The reality is that while tape is not guaranteed 100% reliable, neither is disk.

You can’t say that tape is unreliable when your defence is using RAID-protected storage. Take the RAID away, turn it into a stripe of disks instead, so that you can achieve the same performance as full streaming tape, and try to stand by your claim that tape is inherently less reliable than disk. I dare you.

I’ve been working in backup since 1996. I’ve never, in that entire time, had an instance where I’ve been unable to recover data because of a faulty tape, unless that tape was in itself unprotected. I.e., in the one or two times I’ve actually had a tape fail, if I’ve subsequently lost data it was because there wasn’t a duplicate tape. In all instances where data was duplicated, as it always should be, I’ve never lost data due to tape failure.

Tape is costly to duplicate

Look at your original LTO-5 tape.

Look at your duplicate LTO-5 tape.

Look at your read tape-drive.

Look at your write tape-drive.

Now look at the alternative: replicated disk storage, either in MAID format or a full second backup-to-disk environment, with all the in-datacentre costs, software maintenance costs, hardware maintenance costs and WAN/LAN link replication costs, amortised over a 3-year period (usually the minimum investment period for a total backup-to-disk solution with replication) – and seriously tell me whether it’s going to be significantly cheaper than duplicating tape.

To be sure, we’re talking more than just the physical costs; there’s also the time to duplicate to be factored in, but I’ll get to that in a minute.

The short of it is that it’s a pretty brave call to make a blanket statement that, compared to a total disk-backup solution, tape is costly to duplicate. I’d argue that the majority of solutions deployed today would likely see duplicated tape as the cheaper option, even factoring in long-term tape storage – and that’s before we look at the environmental factors.
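If you want to test that claim for your own environment, a simple model is enough to start the comparison. Every number in the sketch below is a made-up placeholder – substitute your real media, drive, maintenance, offsite storage and WAN figures:

```python
# Purely illustrative: every figure below is a made-up placeholder, not pricing
# from this post. Plug in your own media, drive, maintenance and WAN costs.
YEARS = 3

def tape_duplication_cost(cartridges_per_year: int,
                          cartridge_cost: float = 40,
                          extra_drives: int = 2,
                          drive_cost: float = 3_000,
                          offsite_per_year: float = 5_000) -> float:
    media = cartridges_per_year * cartridge_cost * YEARS
    drives = extra_drives * drive_cost
    offsite = offsite_per_year * YEARS
    return media + drives + offsite

def replicated_disk_cost(capacity_tb: float,
                         cost_per_tb: float = 400,
                         maintenance_rate: float = 0.18,
                         wan_per_year: float = 12_000) -> float:
    hardware = capacity_tb * cost_per_tb * 2       # protection storage at both sites
    maintenance = hardware * maintenance_rate * YEARS
    wan = wan_per_year * YEARS
    return hardware + maintenance + wan

print(tape_duplication_cost(cartridges_per_year=200))   # e.g. 45,000 over 3 years
print(replicated_disk_cost(capacity_tb=300))            # e.g. 405,600 over 3 years
```

The point isn’t the specific totals – they’re invented – it’s that the comparison has to be done with the full three-year picture, not a slogan.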

So what’s the problem?

In reality, pure backup-to-disk marketing revolves around comparisons against tape-only backups, and that’s a model that’s well and truly dying. I actually can’t recall when I last recommended a tape-only backup solution. Maybe 2002? 2003? That’s not even being bleeding edge at that point; by that time even in Australia it was, at most, leading edge.

The absolute truth of it these days is that, except in the most specific of situations, the chances of it being appropriate to design a tape-only backup solution – particularly in medium to enterprise businesses – are extremely low.

Disk-only backup solutions vs tape

Don’t get me wrong – I’m all for touting the benefits offered by disk-only backup solutions, even though I still believe there will reach a point where long-term backup data may need to go out to tape.

I am tired, however, of seeing disk-only backup solutions pushed for the wrong reasons. If the reasons for pushing it amount to a journey to FUDtasia about tape, the marketing is wrong. Sure, it may stir up decision makers and get them to sign on a dotted line, but let’s be honest here – a product should stand on the merits of what it can do, not on fear about what an alternative product supposedly can’t do.

Don’t kick tape. It’s been around for decades. It’ll likely be around for decades still to come. Hand on your heart, outline the benefits of your product, without needing to resort to inadequate character assassination attempts against that draught horse of the backup industry.

Nov 14 2011

When I first started in backup and recovery, my primary backup medium was DDS-1 tapes, distributed across probably 15 servers in a computer room. Over time the number of hosts with dedicated tape drives dropped as systems were consolidated into NetWorker, and the NetWorker server got a couple of gravity-fed DDS autoloaders.

Needless to say, since that point I’ve watched lots of changes in tape technology, particularly since LTO burst onto the scene. DLT had been seemingly stagnant for years, a practical monopoly in the server space, and suffering a severe lack of innovation.

Despite years of various vendors trying to push that tape is dead, we’ll see it remain for some time yet, mainly because it still represents an incredibly economic way of storing large amounts of backup data. Sure, you can avoid using tape if you’ve got replicated backup-to-disk storage between two sites, but that requires either a substantial MAID-style footprint or some deduplication unit – and either way it’s going to cost you a lot of money. (My personal belief is that 10TB per week of backup is the minimum cut-off for consideration of deduplication technologies, and there are a lot of businesses still backing up less than 10TB per week.)

So, here’s what I see as the key continuing trends for tape:

  1. Minimised usage for primary copy – This is a no-brainer, really. Backup to disk has taken over as the primary mechanism in a significant percentage of businesses – the “B2D2T” model, so to speak. There’s no doubt that model will continue, regardless of what that initial “to disk” looks like.
  2. Fallback/secondary copy – Tape will continue to reign supreme as the preferred fallback/secondary copy of backups for some time to come. This decade is indeed the one where some form of backup to disk will become the norm for the vast majority of businesses, but when it comes to those monthly backups that need to be kept for 7+ years, etc., tape will continue to shine.
  3. Enterprise tape is squeezed down – It used to be that there were two distinct tiers of tape: enterprise technology such as LTO (unless you believed the IBM hype that said LTO was toy-tape) and commercial/consumer tape, such as AIT, DDS, etc. That enterprise technology remained largely out of reach of the smaller businesses, but as backup to disk continues to press into the nearline/immediate recovery arena, use of enterprise tape as a primary backup and recovery source will be pushed down into smaller businesses.
  4. Commercial/consumer tape is squeezed out – Those non-enterprise tape formats, such as AIT, DDS, etc., are dead. Sony discontinued AIT to work with HP et al on DDS development, and DDS effectively died at v5. Oh, HP blather on about DDS still having a future – DDS-6/160 was released a while ago, and DDS-7/320 is supposedly in development, but these are dead duck technologies. These non-enterprise tapes were at best unreliable formats – they actually gave a lot of fodder to the “tape is dodgy” meme, and the way they’re kept on life-support by vendors unwilling to concede their time is past is frankly embarrassing.
  5. Deduplication will not migrate in any usable form to tape – Various companies blather about having “deduplication out” to tape from their products, be they target or source deduplication, but this writing of deduplicated data to tape format is fundamentally flawed and logically incompatible. Why? Deduplication requires massive amounts of random access to be able to rehydrate efficiently, but tape is sequential-access by design. So what is written out to tape in “deduplicated” format is instead entire deduplication environments, which must be read back and recovered to systems before a regular recovery can be run. The net result is situations where recoveries aren’t done unless they’re hyper-critical, because there’s too much effort involved.
  6. Hardware encryption will become the norm – Initially introduced in LTO-4, we’ll see continued adoption of hardware-encryption at the per-cartridge level as businesses become acutely aware of the potential damage caused by media theft. We’re already seeing various countries legislate requiring encryption of at-rest data in particular industries, and this is driving more businesses to use hardware encryption “just in case”.
  7. We’ll continue to be told tape is dead – As sure as the sun rises each day, we’ll awake almost every day to another story about the imminent death of tape.
  8. Direct iSCSI tape drives are here – Some vendors are already selling them; as the war settles between FC and IP, it’s logical that we’ll see tape drives and tape libraries appearing with 10GbE connections. This should make connectivity simpler and quite possibly more flexible.

Other predictions

OK, the above list are the things I’m certain about. Here are a few things I’m not certain about, but I’ve been idly speculating on for some time…

  1. QR Barcodes – Personally, I think these are a joke. However, I’m betting that someone will start selling combo tape barcodes where for each regular tape barcode you get a QR barcode so that operators and administrators can scan them from their phones, etc. They’ll be sold as allowing a whole new level of integration, automation and control, and a few businesses will get sucked into buying them. They won’t last long though. That’s assuming that QR barcodes themselves stay popular enough for this to happen.
  2. Tape RFID will get bigger – Some tape vendors are already selling tapes with RFID embedded. This’ll be a low-traction market for some time to come, but I suspect it’ll eventually become standard. I.e., this is an evolutionary rather than revolutionary progression in tape.
  3. Hardware twinning with software recognition – RAIT lost its appeal years ago, though some proprietary control systems such as ACSLS still support it. I suspect we’re going to reach a point though where hardware enabled tape twinning will be offered as a feature from those enterprise tape vendors who are being squeezed down. However, the difference will be that there’ll be APIs between the libraries/drives and the backup software to allow the backup software to see the secondary tapes as registered copies. Why? Tracking and accountability. Auditing and data tracking requirements will see to that. I don’t necessarily think that this will gain a lot of traction, but I do think it’ll become an offering again.

Why tape and dedupe just don’t mix

Jul 06 2011

In “Tape and dedupe: So not happening • The Register“, Chris Mellor asks:

Why haven’t more vendors followed CommVault in putting deduped data on tape? Is it technically too hard?

There’s a good reason for this – it’s pretty nutty. Whether we wish it or not, tape is designed for large scale high speed sequential access. Dedupe requires high speed random access in order to rehydrate. Some time ago I wrote a rebuttal to Curtis Preston’s overly generous appraisal of CommVault’s dedupe to tape strategy, and I still stand by every word I wrote there.

To be fair, Chris quotes someone who puts the argument very succinctly:

Steve Mackey, SpectraLogic’s sales veep for Europe and Africa, says: “The issue of dedupe is recovery. You’ve got to recover the whole tape or a set of tapes before you can recover a file. The big users of archive are looking to recover the data. Today I don’t believe dedupe on tape meets the requirements for recovery.”

Dedupe to tape is crazy. Unless we can somehow overcome the sequential access nature of tape, it will stay crazy, too. That’s why tape and dedupe generally isn’t happening.

And I’m glad that’s the case.
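For the curious, here’s a toy illustration of the access-pattern problem – the chunk layout below is invented purely for demonstration. Rehydrating a deduplicated stream means revisiting chunk locations in whatever order the logical data demands, which is cheap on disk and painful on sequential media:

```python
# Toy illustration (assumptions mine, not from the article) of why rehydrating
# deduplicated data from tape hurts: unique chunks are stored once, so reading a
# file back means visiting chunk locations scattered across the store.
backup_stream = ["A", "B", "A", "C", "B", "A", "D"]   # logical chunk sequence of one file

# A deduplicated store keeps one copy of each chunk, at some physical position.
chunk_position = {"A": 10_240, "B": 98_304, "C": 4_096, "D": 61_440}

positions_to_read = [chunk_position[c] for c in backup_stream]
print(positions_to_read)
# => [10240, 98304, 10240, 4096, 98304, 10240, 61440]
# Seven reads, jumping backwards and forwards through the store. Fine for disk;
# on sequential-access tape, every backward jump is a repositioning operation.
```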

Jun 10 2011

How often have you heard these two memes?

“Tape Sucks”

“Tape is dead”

Oh it just goes on and on, and on and on and on. One might think that I’m having a dig at EMC and Data Domain here – particularly in light of my response on another topic’s comment thread here. And while some folk at EMC and Data Domain would technically be in my sights on this post, there’ll equally be folks from NetApp and a plethora of other vendors who think that tape is dead. So I’m not so much picking on any company, just the meme itself.

It’s the same story, over and over again. Some new whiz-bang product comes out, and people jump onto the “tape is dead” bandwagon again. Only, like a really bad villain in a superhero movie, tape just won’t die. It has more lives than every cat in the world combined.

Sure, its use has evolved over time. I’m the first to admit that. When I first started in backup myself, the notion of backing up to disk was a complete anathema. After all, I had to beg, borrow, plead and promise long on-call shifts just to get a couple of extra 2GB spindles for my backup server to handle indices and temp space. Why would I have been so crazy as to backup to such an expensive medium? Tape, on the other hand, was much cheaper.

Over time disk became cheaper and had higher capacities, but it still isn’t as cheap or as high capacity as tape over the long haul. Where it exceeds tape every time is on the economics of access. You need that data back straight away? Then it needs to come back from disk, not tape. There are no load times, etc., when it comes to disk.

And so over time as disk became cheaper, we (the industry) evolved backups to use tape as secondary, long term or high capacity storage. Backup to disk, keep the most frequently recovered backups on that medium (i.e., the most recent), and keep copies on tape. As space fills, we shift those older backups off to tape, and keep using disk for the high frequency recoveries. Disk also smooths out those pesky shoe-shining issues we see in highly varied streaming speeds to tape, too.

So it’s a win-win solution, and it’s going to stay that way for some time to come. Tape may have evolved, but it’s still better, cheaper, and more reliable for longer term storage. Curtis Preston has an excellent summary of this point here, for what it’s worth.

Will the “tape is dead” people come around to reality? Probably not. Adherents to the repeated meme don’t always give up so easily. After all, there’s even people who still believe in a flat earth.

Addendum, April 2016.

It’s funny when you come back to an article you wrote years ago and find yourself in significant disagreement. I’m preserving the content of the article above, but it’s fair to say it’s been several years now since I’ve actually agreed with it. Usage cases for tape have been shrinking regularly, and my biggest beef with tape is that it’s rarely properly managed when it is used, making the “tape is cheap” argument lopsided and inaccurate. (So this is also proof that I adapt and change my mind from time to time.)

Jun 05 2011

Within backup there are typically two tape rotation policies. One policy applies to short term backups, and the other to longer term backups. If we were looking at an ‘average’ company, this would be something like:

  • Short term backups written and rotated over a 4-6 week period of weekly fulls and daily incrementals/differentials;
  • Long term backups written as monthly fulls, and then stored for 7-10 years.
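Expressed another way – and purely as an illustration, not any particular product’s syntax – that schedule might be captured like this:

```python
# One hypothetical way of expressing the 'average company' schedule above as a
# policy definition; the field names are illustrative, not a product's syntax.
rotation_policies = {
    "short_term": {
        "full": "weekly",
        "incremental": "daily",
        "retention_weeks": 6,        # rotated over a 4-6 week period
    },
    "long_term": {
        "full": "monthly",
        "retention_years": 7,        # stored for 7-10 years
    },
}
```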

Everyone seems to easily understand the short term backup tape rotation policy, which resembles the following:

[Diagram: short term tape rotation lifecycle]

However, there’s some confusion over how the long term tape rotation policy works. In particular, most organisations seem to think it looks like the following:

[Diagram: perceived long term tape rotation lifecycle]

To be certain, your long term tape rotation policy could look like the above, but in doing so you may as well state “our long term rotation policy is dependent on luck”.

As we know, backup and data protection is not about trusting to luck, and so a long term tape rotation policy should in fact resemble the following:

[Diagram: actual long term tape rotation lifecycle]

As you can see, there’s a significant difference between the perceived and actual long term tape lifecycle, and this centres around ensuring long-term recoverability of the data that’s been backed up. In particular, this involves:

  • Periodically pulling older tapes out of storage;
  • Testing the tapes;
  • If there is a concern, or the tape has reached a maximum determined age:
    • Migrating the required data from the old tape onto new media;
    • Securely erasing or destroying the old media;
  • Otherwise, returning the tape to storage.

If your organisation’s long-term tape lifecycle looks more like the former, rather than the latter diagram, then you need to look at adjusting it as soon as possible. Failure to do so can readily result in the unpleasant situation of recoveries failing due to significantly aged media that are past their ‘best-by’ date.
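For those who like to see process expressed as pseudo-automation, here’s a minimal sketch of that review loop. The Tape object, the verification stub and the five-year refresh threshold are assumptions of mine, not prescriptions from the diagram – real implementations hang off your backup product and library:

```python
# Minimal sketch of the long-term media review loop; the Tape object, the
# verification stub and the 5-year refresh threshold are assumptions only.
from dataclasses import dataclass
from datetime import date, timedelta
import random

MAX_MEDIA_AGE = timedelta(days=5 * 365)       # assumed maximum age before refresh

@dataclass
class Tape:
    barcode: str
    written: date

def verify_tape(tape: Tape) -> bool:
    """Placeholder for a genuine read-verify pass against the cartridge."""
    return random.random() > 0.02             # pretend ~2% of aged media fail

def review_long_term_media(tapes: list[Tape]) -> None:
    for tape in tapes:                         # periodically pulled from storage
        too_old = (date.today() - tape.written) > MAX_MEDIA_AGE
        if not verify_tape(tape) or too_old:
            print(f"{tape.barcode}: migrate required data to new media, then destroy")
        else:
            print(f"{tape.barcode}: healthy - return to offsite storage")

review_long_term_media([Tape("A00001L5", date(2008, 3, 1)),
                        Tape("A00002L5", date(2011, 1, 10))])
```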

May 05 2011

(No, not micromanuals – official EMC Technical Notes for NetWorker.)

EMC has been working hard at making very useful technical documents available for NetWorker. There are two in particular, released in April/May 2011, that offer brilliant information – these are:

  • NetWorker Probe Overview and Troubleshooting Technical Note – At last. This is worth downloading just for one paragraph in that document, which clearly explains the exact criteria under which clients are probed, new savesets are selected for initiation, etc. You must read this document.
  • Configuring Tape Devices for EMC NetWorker Technical Note – A really useful overview of getting tape devices configured in NetWorker. While normally this is straightforward, when there are problems they can be frustrating to work through. This provides some really comprehensive information regarding the process, including some per-OS details that aren’t to be missed.

Do yourself a favour – make sure to download the above two documents, and keep an eye on the NetWorker Technical Notes – they’re proving to be of excellent value.