May 23 2017
 

I’m going to keep this one short and sweet. In Cloud Boost vs Cloud Tier I go through a few examples of where and when you might consider using Cloud Boost instead of Cloud Tier.

One interesting thing I’m noticing of late is a variety of people talking about “VTL in the Cloud”.


I want to be perfectly blunt here: if your vendor is talking to you about “VTL in the Cloud”, they’re talking to you about transferring your workloads rather than transforming your workloads. When moving to the Cloud, about the worst thing you can do is lift and shift. Even in Infrastructure as a Service (IaaS), you need to closely consider what you’re doing to ensure you minimise the cost of running services in the Cloud.

Is your vendor talking to you about how they can run VTL in the Cloud? That’s old hat. It means they’ve lost the capacity to innovate – or at least, lost interest in it. They’re not talking to you about a modern approach, but just repeating old ways in new locations.

Is that really the best that can be done?

In a coming blog article I’ll talk about the criticality of ensuring your architecture is streamlined for running in the Cloud; in the meantime I just want to make a simple point: talking about VTL in the Cloud isn’t a “modern” discussion – in fact, it’s quite the opposite.

Oct 10 2013
 

If you’re looking at deploying Data Domain with NetWorker, but are coming from a physical tape or even an alternate virtual tape environment, you may look at the systems and think: “Should I use VTL or Boost?”

These days the answer is actually fairly simple: use Boost as much as possible.

If this were a True/False exam, that’d be the end of the article, but I think if you’re not sure what to do, you’ll want to see my working – how I got to that point.

The reasons are multiple, and I’ll break them down as follows:

  • Concurrency;
  • Cloning Integration;
  • Recoverability;
  • Reporting.

Once I’ve gone through them, you’ll see that each of those items on their own would likely represent a sufficient reason to choose Boost over VTL.

Concurrency

As I’ve said in the past, and as is well known, a virtual tape library will emulate all the features of a tape library, good and bad. This is perhaps most obvious when it comes to concurrency – that is, simultaneous read and write operations.

Let’s consider the difference between a VTL and a Boost configuration. We’ll start with the Boost configuration. We won’t worry too much about the actual capacity of the units, but let’s assume they’re the same.

From a Boost perspective, you might present this to your backup server as 4 Data Domain Boost folders. For standard Boost devices, this will see target sessions set to 4 per folder and max sessions set to 32. That’s a total in a default configuration of 128 streams going to the Data Domain. However, NetWorker doesn’t count recovery streams in that session count, so that’s a total of 128 maximum write streams, and (theoretically) as many recovery streams as you wanted to run.

If we consider it from a VTL perspective, it gets a little tricky. The primary reason for that is that the way NetWorker multiplexes for tape is fundamentally incompatible with deduplication, regardless of whether you’re writing to a Data Domain, Quantum, HP or any other deduplication device. That means each virtual tape drive you create has to be limited to a max sessions count of … one.

That means, if you wanted to write 32 save sets simultaneously to the Data Domain VTL you’d have to define 32 virtual tape drives. Or if you wanted to match the max sessions defined by the Boost variant, you’d be defining 128 virtual tape drives.
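To make that arithmetic concrete, here's a rough Python sketch. The figures are just the ones from the example above, and the function names are mine for illustration, not anything NetWorker exposes.

```python
# Figures from the example above; nothing here is read from a real NetWorker server.
BOOST_DEVICES = 4
MAX_SESSIONS_PER_BOOST_DEVICE = 32
MAX_SESSIONS_PER_VTL_DRIVE = 1  # multiplexing has to be off for deduplication


def max_write_streams(devices: int, max_sessions: int) -> int:
    """Upper bound on simultaneous write streams across a set of devices."""
    return devices * max_sessions


def vtl_drives_needed(target_streams: int,
                      sessions_per_drive: int = MAX_SESSIONS_PER_VTL_DRIVE) -> int:
    """Virtual tape drives required to carry the same number of streams."""
    return -(-target_streams // sessions_per_drive)  # ceiling division


boost_streams = max_write_streams(BOOST_DEVICES, MAX_SESSIONS_PER_BOOST_DEVICE)
print(boost_streams)                     # 128
print(vtl_drives_needed(32))             # 32
print(vtl_drives_needed(boost_streams))  # 128
```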

With virtual tape drives of course, emulating the best and the worst of tape, each of those virtual tape drives can either read or write, but not both (at least not concurrently). So if you wanted to ‘guarantee’ being able to run a recovery at the same time as a backup, you’d need to define even more virtual tape drives, and make them read-only.

But … even then, with a VTL you’re not guaranteeing that you can simultaneously back up and recover at any time whatsoever. If you’re using Boost devices, you are – a backup can be running to a Boost device and you can kick off a recovery from the same device and NetWorker will run them both concurrently. In fact, NetWorker these days will run multiple backups, multiple restores, multiple clones and multiple stage operations all concurrently to/from the same Boost device. With a VTL, if you need to recover or clone from a virtual tape that is currently being written to, you’ll need to wait until the tape fills or the backup finishes. Even limiting virtual tapes to 50GB or 100GB still creates a pause period that cannot be avoided.

But But … even if you decide all of those limitations are OK, are you prepared to spend the money required to license all of those virtual tape drives? Ahah, you might think – “with a VTL I get an ‘unlimited’ autochanger license, and that covers everything” … not quite right. You see, the unlimited autochanger license covers you for an unlimited number of slots in your VTL, but NetWorker licenses drive count by the server license plus the number of storage node licenses. For a Power Edition server you’ll be automatically licensed for 32 devices on the server, and for any form of storage node, you get 16 devices per storage node. For a Network Edition server, the server device count is just 16. Now, you can stack on storage node licenses without actually creating storage nodes and increase the device count, but let’s go back to that comparison again for a moment … 128 virtual tape drives.

If you’re using NetWorker, Network Edition, that’s:

  • First 16 tape drives – Server License
  • Next 16 tape drives – Storage Node License 1
  • Next 16 tape drives – Storage Node License 2
  • Next 16 tape drives – Storage Node License 3
  • Next 16 tape drives – Storage Node License 4
  • Next 16 tape drives – Storage Node License 5
  • Next 16 tape drives – Storage Node License 6
  • Next 16 tape drives – Storage Node License 7

That’s a lot of storage node licenses, just to license 128 tape drives. Presumably in a VTL configuration you’re either going to have a second VTL to replicate to, or a physical tape library, and you’ll need storage node device counts to cover those devices too.
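If you want to run the same sum for your own drive count, the calculation can be sketched as below. The device counts per license are the ones quoted above (16 or 32 on the server depending on edition, 16 per storage node); the dictionary keys and function name are my own labels, and you should check the counts against your actual license agreement.

```python
import math

# Device counts as quoted in this article; the labels are illustrative only.
SERVER_DEVICES = {"network": 16, "power": 32}
DEVICES_PER_STORAGE_NODE = 16


def storage_node_licenses_needed(total_drives: int, edition: str) -> int:
    """Storage node licenses required on top of the server license."""
    remaining = max(0, total_drives - SERVER_DEVICES[edition])
    return math.ceil(remaining / DEVICES_PER_STORAGE_NODE)


print(storage_node_licenses_needed(128, "network"))  # 7
print(storage_node_licenses_needed(128, "power"))    # 6
```

That only counts the 128 virtual drives. Any drives in a replica VTL or a physical library would need to be added to `total_drives` as well.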

Phew! If that’s not enough of a reason to go down the path of Boost, then…

Cloning Integration

Let’s say you’ve got two Data Domain systems at different sites, and you want to replicate the data between them to protect against site failure, etc.

If you’re using VTL, you effectively have two options:

  • Use NetWorker to do cloning between the two Data Domains;
  • Use Data Domain replication on the VTL storage folder.

Both of those options, to use the technical term, have considerable “suck” problems.

For the first option – NetWorker cloning – you have to keep in mind that NetWorker doesn’t see the VTL as a deduplication device. Therefore, NetWorker’s going to execute a complete data read from each virtual tape it needs to clone from, and the Data Domain will faithfully rehydrate all the data, allowing NetWorker to send it across the network to be deduplicated and stored on the other host.

However, if you use the second option, then the clone is a perfect volume replica, and if there’s one thing NetWorker doesn’t like, it’s seeing the same volume, with the same volume label and volume ID, in more than one location. Basically you have to keep the media you clone this way out of the virtual library on the remote site. If you want to use it for a recovery, you have to first make the media invisible to NetWorker on the primary site (e.g., via a library export operation), before you can import it within NetWorker. That becomes a tedious task of:

  • Primary site: Export the media to the VTL CAP
  • Secondary site: Use the VTL management software to move the media to the VTL CAP
  • Secondary site: Import the media from the VTL CAP
  • Secondary site: Use media
  • Secondary site: Export the media to the VTL CAP
  • Secondary site: Use the VTL management software to remove the media from the VTL CAP
  • Primary site: Import the media from the VTL CAP

If you compare that to NetWorker clone-controlled Boost replication, you’ll see there’s no comparison at all. NetWorker initiates the clone operation, which triggers, at the back end, a replication job between the two Data Domains. The replica copy on the secondary Data Domain is a fully registered NetWorker copy/clone and accessible to NetWorker while the primary copy is still visible. What’s more, with NetWorker 8.1, you can turn on immediate cloning, whereby as each saveset in a group is finished, NetWorker initiates a Boost clone of that saveset, shrinking your clone windows considerably.

Phew! If that’s not enough of a reason to go down the path of Boost, then…

Recoverability

The first obvious comment about recoverability is a repeat of the point made in Concurrency. Boost will allow you to run recover sessions at the same time as you’re running backup sessions to the same volumes, and therefore you’re (pun intended) Boosting recoverability within your environment.

But it’s more than that. Some of the advanced recoverability features in NetWorker require disk backup. Virtual tape, for all its use in overcoming architectural problems with physical tape, emulates tape closely enough that NetWorker considers it to be sequential access, and a sequential access device just doesn’t give the flexibility required for the advanced recovery features.

What advanced recovery features am I referring to?

  • Granular recovery from NMM (Exchange, SharePoint, etc.) – This functionality allows you to ‘mount’ a copy of the backup within the environment without actually completing a full restore, and then pull back the individual items you want. If you’re backing up to VTL, you don’t have this option available to you.
  • Block Level Backup in Windows 2012 – Recovery from block level backup similarly does some trickery regarding pseudo-mounting the filesystem for recovery from backup.
  • Instant-On VMware Recovery – OK, this isn’t available in NetWorker, but given its introduction in Avamar, you’d have to think it’s a highly likely contender for availability in NetWorker, and you can bet your next cup of coffee that it’s not going to be available from physical or virtual tape.

Phew! If that’s not enough reason to go down the path of Boost, then…

Reporting

What can I say? I’m in love with the reporting integration between Boost and NetWorker. With a Boost device integrated as a NetWorker target, NetWorker will provide you per client, per client filesystem, and per backup per client filesystem statistics on deduplication:

DD Boost drill down report 1

DD Boost drill down report 2

Of course, you also get overall statistics, such as available capacity on the Data Domain – but in a deduplication environment, being able to drill down to that level of detail on the deduplication statistics on clients and filesystems is an absolute boon.

If you’re using VTL, the most you can do is get a report on the Data Domain of the deduplication ratios for each virtual tape. That’s not really useful when it comes to monitoring deduplication performance within the environment.

…but, Fibre vs Ethernet

It used to be that a compelling reason to use VTL over Boost was if you had a site where there’d been heavy investment in the fibre-channel network, but less so in the ethernet network. I.e., you might have gigabit networking for IP, but 8 gigabit for fibre-channel.

However, with NetWorker 8.1, even that reason has been addressed – from 8.1 onwards, NetWorker supports Boost devices over fibre-channel.

Boost your Backups

So there you have it – 4 sets of reasons as to why you’re better off using Boost instead of VTL with Data Domain – and a fifth bonus reason thrown in as well.

So go forth and Boost!

Dec 23 2010
 

The holiday season is upon many of us – whether you celebrate xmas or christmas, or just the new year according to the Gregorian calendar, we’re approaching that point where things start to ease off for a lot of people and we spend more time with our families and friends.

Before I wrap up for the year, I wanted to spend a few minutes reintroducing some of the most popular topics of the year on the blog – the top ten articles based on directly linked accesses. Going in reverse order, they are:

  • Number 10 – “Why I’d choose NetWorker over NetBackup every time“. I was basically called an idiot by someone in the storage community for writing this, but the fact remains for me that any backup product that fails to support backup dependencies is not one that I would personally choose. Given that a top search that leads people to the blog is of the kind, “netbackup vs networker” or “networker vs netbackup”, clearly people are out there comparing the two products, and I stand by my support of the primacy of backup dependency tracking.
  • Number 9 – “A tale of 4 vendors“. A couple of months ago I attended SNIA’s first Australian storage blogger event, touring EMC, IBM, HDS and NetApp. Initially I’d planned to blog a fairly literal dump of the information I jotted down during the event, but I realised instead I was more drawn to the total solution stories being told by the 4 vendors.
  • Number 8 – “NetWorker 7.5.2 – What’s it got?“. NetWorker 7.5 represented a big upgrade mark for a lot of sites, particularly those that wanted to jump the v7.3 and v7.4 release trees. I still get a lot of searches coming to the blog based on NetWorker 7.5 features and upgrades.
  • Number 7 – “Using NetWorker Client with Opensolaris“. This was written by guest blogger Ronny Egner, and has seen more interest over the last few months as Oracle’s acquisition continues to grind down paid Sun customers. If you’re interested in writing guest blog pieces for the NetWorker Blog in 2011, let me know!
  • Number 6 – “Basics – Fixing ‘NSR peer information’ errors“. I’ve said it before, and I’ll say it again: there is no valid reason why the resolution for this hasn’t been built into NMC!
  • Number 5 – “NetWorker and linuxvtl, Redux“. The open source LinuxVTL project continues to grow and develop. While it’s not suited for production environments, LinuxVTL is certainly a handy VTL to plug into a NetWorker/Linux system for testing purposes. I know – I use it almost every single day.
  • Number 4 and Number 3 – “NetWorker 7.6 SP1“. Interest in NetWorker 7.6 SP1 has been huge, and I had two blog postings about it – a preview posting based on publicly shared information from EMC, and the actual post-release article that covered some key features more in-depth.
  • Number 2 – “Carry a Jukebox with you (if you’re using Linux)“. The first article I wrote about the LinuxVTL project.
  • Number 1 – “micromanual: NetWorker Power User Guide to nsradmin“. The Power User guide to nsradmin has been downloaded well over a thousand times. I’ve been a fan of nsradmin ever since I started using NetWorker and had to administer a few NetWorker servers over extremely slow links (think dial-up speeds). It’s been very gratifying to be able to introduce so many people to such a useful and powerful tool.

Personally this year has been a pretty big one for me. Probably the biggest single event was that my partner and I made the decision to move from central coast NSW to Melbourne, Victoria during the year. We haven’t moved yet; it’s due for June 2011, but it’s going to necessitate a lot of action and work on our part to get there. It’ll be well worth the effort though, and I’ve already reached that odd point where I no longer think of the place I’m living as “home”. The reasons that led us to that decision are covered on my personal blog here. Continuing the personal front, I was extremely pleased to be able to say goodbye to the mobile “netwont” that is Vodafone in Australia. I’ve been using my personal blog to talk about a lot of varied topics running from internet censorship to invasive information requests to more mundane things, such as what makes a good consultant.

Technically I think the coming few years are going to be fascinating. Deduplication has only just started to make a splash; I think it’ll be a while before it becomes as pervasive as say, plain old disk backup, but it will have a continued and growing effect in the enterprise backup market. I predict that another bevy of dopey analysts will insist that tape is dead, just like they have every year for the last 2 decades, and at the end of the year I predict the majority of companies they interface with will still be using tape in some form or another. However, the use of tape will continue to evolve in the marketplace; as nearline disk storage becomes more regular and cheaper for backup solutions, we’ll see tape continue to be pushed out to longer term retention systems and safety nets – i.e., tape is certainly sliding away from being the primary source for recoveries in an enterprise backup environment.

One last thing – I want to thank the readers of this blog. To those people who subscribe to the mailing list, and those who subscribe to the RSS feed, to those who have the site bookmarked and to those who just randomly stumble across the site – I hope in each case you’re finding something useful, and I’m grateful for your readership.

Happy holidays to those of you celebrating or relaxing over the coming weeks, and peaceful times to those working through.

appliances vs Appliances

Aug 18 2010
 

We all have appliances, right? Teapots and toasters and microwaves and automatic coffee machines, etc. They’re all appliances. So are clock radios, electric razors, heaters and fans.

They’re appliances.

VTLs, SANs and NASs are not appliances, despite what any vendor would try to tell you. As soon as you’ve got an OS + software layer, you’re moving beyond “appliance” into “black box”. Or maybe we’re talking the difference between an appliance and an Appliance. If a vendor wants to tell you otherwise, they’re not telling you the whole story.

There’s a simple test on whether you’re being sold an appliance, or an Appliance – a simple yes/no question:

Is there a training course for the unit or an instruction manual with more than 1 page of instructions per language?

If the answer is “no”, then congratulations, you’ve got an appliance; if the answer is yes, then despite whatever your vendor wants to tell you, you’ve got an Appliance.

Now, there’s nothing wrong with having an Appliance within your organisation, and in fact I’d suggest that frequently they add a lot of value. VTLs, SANs and NASs, to use the example I previously provided, are all capable of greatly extending the storage and data protection options within your environment and should of course be considered in many architectures.

Knowing that they’re Appliances rather than appliances though means that you can treat them appropriately. I personally don’t care about backing up my toaster, or keeping a close eye on the logs from my microwave. As the appliance complexity increases, I pay more attention – so for instance the most critical appliance in my home is arguably the automatic espresso machine, and since it has blinking lights that can tell me whether I’m able to get a cup of coffee from it or not, I pay attention to it.

Extending this process, when you move from having appliances in your organisation to having Appliances, it’s critical that they are treated as full blown systems that require the same level of support, administration and consideration when it comes to problem resolution. Or another way to consider it, from a support perspective – if there’s an error happening in your environment, don’t ignore the “black boxes” when it comes to problem diagnosis. This means being aware of at least the following:

  • How to view basic status;
  • How to extract logs;
  • Any caveats to reading logs (e.g., are they time/date stamped using a different GMT offset to your environment?);
  • How to review the logs;
  • How to escalate requests to the Appliance vendor.
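The time/date caveat in that list is the one that bites most often, so here’s a minimal sketch of lining up two log entries that were stamped with different GMT offsets. The timestamps, the log format and the offsets are all invented for illustration; your Appliance will have its own format.

```python
from datetime import datetime, timedelta, timezone


def to_utc(stamp: str, utc_offset_hours: float,
           fmt: str = "%Y-%m-%d %H:%M:%S") -> datetime:
    """Parse a log timestamp written at a known GMT offset and convert it to UTC."""
    local_zone = timezone(timedelta(hours=utc_offset_hours))
    return datetime.strptime(stamp, fmt).replace(tzinfo=local_zone).astimezone(timezone.utc)


# Hypothetical entries: the Appliance logs in GMT, the backup server in GMT+10.
appliance_event = to_utc("2010-08-18 09:15:00", 0)
server_event = to_utc("2010-08-18 19:15:30", 10)

print(server_event - appliance_event)  # 0:00:30
```

Read naively, those two entries look ten hours apart; normalised, they’re thirty seconds apart and almost certainly part of the same problem.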

Once you’ve been working with Appliances for a while, all of these start to come naturally. The big trick for beginners in the Appliance realm though is to ignore the “black box” you’ve been sold and instead be aware of the components and how to access the diagnostic information for the unit. If you can’t, you’ve created a “black hole” – and that’s not something you’ll get a lot of satisfaction from.

Apr 26 2010
 

As I mentioned in an earlier post, EMC have announced on their community forum that there are some major changes on the way for ADV_FILE devices. In this post, I want to outline in a little more detail why these changes are important.

Volume selection criteria

One of the easiest changes to describe is the new volume selection criteria that will be applied. Currently regardless of whether it is backing up to tape, virtual tape, or ADV_FILE disk devices, NetWorker uses the same volume selection algorithm – whenever there are multiple volumes that could be chosen, it always picks volumes to write to in order of labeled date, from oldest to most recent. For tapes (and even virtual tapes), this selection criteria makes perfect sense. For disk backup units though, it’s seen administrators constantly “fighting” NetWorker to reclaim space from disk backup volumes in that same labeling order.

If we look at say, four disk backup units, with the used capacity shown in red, this means that NetWorker currently writes to volumes in the following order:

Current volume selection criteria

So it doesn’t matter that the first volume picked also has the highest used capacity – in actual fact, the entire selection criteria is geared around trying to fill volumes in sequence. Again, that works wonderfully for tapes, but it’s terrible when it comes to ADV_FILE devices.

The new selection criteria for ADV_FILE devices, according to EMC, is going to look like the following:

Improved volume selection criteria

So, recognising that it’s sub-optimal to fill disk backup units, NetWorker will instead write to volumes in order of least used capacity. This change alone will remove a lot of the day to day management headaches of ADV_FILE devices from backup administrators.
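The difference between the two orderings can be sketched in a few lines of Python. This is only an illustration of the behaviour described above, not NetWorker’s actual algorithm; the `Volume` class, the volume names and the usage figures are all made up.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Volume:
    name: str
    labelled: date   # date the volume was labelled
    used_pct: int    # percentage of capacity currently used


def order_by_label_date(volumes):
    """Current behaviour: oldest labelled volume first, regardless of usage."""
    return sorted(volumes, key=lambda v: v.labelled)


def order_by_least_used(volumes):
    """Announced ADV_FILE behaviour: least used capacity first."""
    return sorted(volumes, key=lambda v: v.used_pct)


# Four hypothetical disk backup units.
volumes = [
    Volume("disk1", date(2010, 1, 1), 90),
    Volume("disk2", date(2010, 1, 2), 40),
    Volume("disk3", date(2010, 1, 3), 70),
    Volume("disk4", date(2010, 1, 4), 10),
]

print([v.name for v in order_by_label_date(volumes)])  # ['disk1', 'disk2', 'disk3', 'disk4']
print([v.name for v in order_by_least_used(volumes)])  # ['disk4', 'disk2', 'disk3', 'disk1']
```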

Dealing with full volumes

The next major change coming is dealing with full volumes – or alternatively, you may wish to think of it as dealing with savesets whose size exceeds that of the available space on a disk backup unit.

Currently if a disk backup unit fills during the backup process, whatever saveset being written to that unit just stays right there, hung, waiting for NetWorker staging to kick in and free space before it will continue writing. This resembles the following:

Dealing with full volumes

As every NetWorker administrator who has worked with ADV_FILE devices will tell you, the above process is extremely irritating as well as extremely disruptive. Further, this only works in situations where you’re not writing one huge saveset that literally exceeds the entire formatted capacity of your disk backup unit. So in short, if you’ve previously wanted to back up a 6TB saveset, you’ve had to have disk backup units that were more than 6TB in size, even if you would naturally prefer to have a larger number of 2TB disk backup units. (In fact, the general practice when backing up to ADV_FILE devices has been to ensure that every volume can fit at least two of your largest savesets on it, plus another 10%, if you’re using the devices for anything other than just intermediate staging.)

Thankfully the coming change will see what we’ve been wanting in ADV_FILE devices for a long time – the ability for a saveset to just span from one volume it has filled across to another. This means you’ll get backups like:

Disk backup unit spanning

This will avoid situations where the backup process is effectively halted for the duration of staging operations, and it will allow for disk backup units that are smaller than the size of the largest savesets to be backed up. This in turn will allow backup administrators to very easily schedule in disk defragmentation (or reformatting) operations on those filesystems that suffer performance degradation over time from the mass write/read/delete operations seen by ADV_FILE devices.
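As a rough model of what spanning means, the sketch below places one saveset across several disk backup units, moving to the next unit whenever the current one fills. It assumes the volumes are already in the order they’d be selected, and the function and volume names are hypothetical; it says nothing about how NetWorker implements this internally.

```python
def span_saveset(saveset_gb: int, free_gb_by_volume):
    """Return (volume, gb_written) pairs for a saveset spread across volumes in order."""
    placements = []
    remaining = saveset_gb
    for name, free_gb in free_gb_by_volume:
        if remaining == 0:
            break
        if free_gb <= 0:
            continue  # volume already full, move on to the next one
        chunk = min(free_gb, remaining)
        placements.append((name, chunk))
        remaining -= chunk
    if remaining > 0:
        raise RuntimeError(f"{remaining} GB could not be placed on any volume")
    return placements


# The 6TB saveset from earlier, written to three empty 2TB disk backup units.
print(span_saveset(6000, [("disk1", 2000), ("disk2", 2000), ("disk3", 2000)]))
# [('disk1', 2000), ('disk2', 2000), ('disk3', 2000)]
```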

Other changes

The other key changes outlined by EMC on the community forum are:

  • Change of target sessions:
    • Disk backup units currently have a default target parallelism of 4, and a maximum target parallelism setting of 512. These will be reduced to 1 and 32 respectively (and of course can be changed by the administrator as required), so as to better enforce round-robining of capacity usage across all disk backup units. This is something most administrators will end up doing by default, but it’s a welcome change for new installs.
  • Full thresholds:
    • The ability to define a %full threshold at which point NetWorker will cease writing to one disk backup unit and start writing to another. Some question whether this is useful, but I can see the edge of a couple of different usage scenarios. First, as a way of allowing different pools to share the same filesystem, making better use of capacity, and secondly, in situations where a disk backup unit can’t be a dedicated filesystem.

When we add all these changes up, ADV_FILE type devices are going to be back in a position where they’ll give VTLs a run for their money on cost vs features. (With the possible exception being the relative ease of device sharing under VTLs compared to the very manual process of SAN/NAS sharing of ADV_FILE devices.)

Apr 20 2010
 

I had been aware for a while from an NDA conversation that these changes were on the way, but of course have not been able to discuss them.

However, with EMC opening up discussion on the EMC Community Forum – i.e., out in public, I now feel that I can at least discuss how excited I am about the coming ADV_FILE changes.

For some time I’ve railed against architectural failings in ADV_FILE devices, and explained why those failings have led me to advocate the use of VTLs over ADV_FILE devices. As announced on this thread in the forums by Paul Meighan, many of those architectural limitations are soon going to be relegated to the software evolutionary junkpile. In particular, EMC have stated in the forum article that the following changes are on the way:

  1. Volume selection criteria becomes intelligent. NetWorker currently uses the same volume selection criteria for disk backup as it does for tapes. This means that the oldest labelled volume with free space on it always gets picked first, and subsequent volumes get picked following this strategy. This has meant that backup administrators have continually fought a running battle to keep the original disk backup units staged more regularly than others. Instead, NetWorker will now pick ADV_FILE volumes in order of maximum capacity free, which will free a lot of backup administrators from the overall pain of day to day capacity management.
  2. Savesets can span advanced file type devices. Finally, the gloves are off! With the ability to have savesets cease writing to one disk backup unit and move over to another, NetWorker ADV_FILE devices will be able to serve as a scalable and transparent storage pool: backups will flow from one device to another in exactly the way they always should have.
  3. Session changes. To reflect round-robining best practices, the default target sessions for disk backup units will drop from 4 to 1.

When we add together the first two changes, we get powerful enhancements in NetWorker’s disk backup functionality. Do other products already do this? Yes, I’m not suggesting that NetWorker is the first to this, but it’s fantastic to finally see this functionality coming into play.

Until this point, NetWorker has suffered the continual challenge with disk backup of constant administrative overheads and trying to plan in advance the best possible space allocation technique for disk backup filesystems. Once these changes come into play: no more challenge on either of these fronts.

Folks, this is big. Yes, these changes should have come a long time ago, but I’m not going to let the delay get in the way of being damn grateful that they’re finally coming.

Apr 14 2010
 

In the previous article, I covered the first five of ten reasons why tape is still important. Now, let’s consider the other five reasons.

6. Tape is greener for storage

Offline storage of tape is cheap, from an environmental perspective. Depending on your locality, you may not even have to keep the storage area air-conditioned.

Disk arrays and replicated backup server clusters don’t really have the notion of offline options. Even if they’re using MAID, the power consumption for the pseudo-offline part of the storage will be higher than that for unpowered, inactive tape.

7. Replicated tape is cheaper than replicated disk

And by “replicated tape” I mean cloning. Having clones of your tapes is a cheaper option than running a system with full replication. Full replication requires similar hardware configurations on both sides of the replica; cloning a tape requires – another tape. That’s a lot cheaper, before you even look at any link costs.

8. Done right, tapes are the best form of thin provisioning you’ll get

Thin provisioning is big, since it’s an inherent part of the “cloud” meme at the moment. Time your purchases correctly and tape will be the best form of thin provisioning within your enterprise environment.

9. Tape is more fault tolerant than an array

Oh, I know you’ve got the chuckles now, and you think I’ve gone nuts. Arrays are highly fault tolerant – looking at RAID alone, if your disk backup environment is a suite of RAID-6 LUNs, then for each LUN you can withstand two disk failures. But let’s look at longer term backups – those files that you’ve backed up multiple times. Some would argue that these shouldn’t be backed up multiple times, but that’s an argument that doesn’t translate well down into the smaller enterprises and corporates. Sure, big and rich companies can afford deduplicated archiving solutions, but smaller companies that have to make do with the traditional weekly fulls kept for 5 or 6 weeks, and monthly fulls kept for anywhere between 1 and 10 years, will have the luxury of a potentially large number of copies of any individual file. The net result? Perhaps as much as 50% of longer term recoveries will be extremely fault tolerant – if the March tape fails, go back to the February tape, or the January tape, or the December tape, etc. This isn’t something you really want to rely on, but it’s always worth keeping in mind regardless.

10. Tape is ideally suited for lesser RTO/RPOs

Sure if you have RTOs and RPOs that demand near instant recovery with minimum data loss, you’re going to need disk. But when we look at the cheapness of tape, and practically all of the other items we’ve discussed, the cost of deploying a disk backup system to meet non-urgent RPOs and RTOs seems at best a case of severe overkill.

Apr 12 2010
 

Various companies will spin you their “tape is dead” story, and I’m the first to admit that the use pattern for tapes is evolving, but to anyone who claims that tape has lost its relevance, I’ll argue otherwise.

This is part 1 of a 2 part article, and we’ll cover reasons 1 through 5 here.

1. Tape is cheap

Comparatively tape is still significantly cheaper than disk. In AUD, from end-resellers you can buy individual LTO-4 cartridges (800GB native) for $50. Even at a discount price, in Australia you’ll still pay around $90 to $110 for a 1TB drive (the closest comparison).
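Normalised to dollars per terabyte, using only the prices quoted above (AUD, at the time of writing), the comparison looks like this. It’s a quick sketch; the function name is mine.

```python
def cost_per_tb(price_aud: float, capacity_gb: int) -> float:
    """Price per terabyte (1TB = 1000GB) for a single cartridge or drive."""
    return price_aud * 1000 / capacity_gb


print(cost_per_tb(50, 800))    # LTO-4 cartridge: 62.5
print(cost_per_tb(90, 1000))   # discounted 1TB drive, low end: 90.0
print(cost_per_tb(110, 1000))  # discounted 1TB drive, high end: 110.0
```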

2. Tape is offline

If your backup server is using traditional backup to disk and is infected by a destructive virus or trojan, you can lose days, weeks, months or perhaps even years of backups.

No software, no matter how destructive (unless we’re talking Skynet levels of destruction) is going to be able to reach out from your infected computers and destroy media that’s sitting outside of your tape libraries. It’s just not going to happen. There’s a tonne of more likely scenarios that you’d need to worry about first before getting down to this scenario.

3. You can run a tape out of a burning building

Say you’ve bought the “tape is dead” argument, and all your backups are either on a VTL, a standard array for disk backup, or some multi-cluster centralised storage system (e.g., a RAIN as per Avamar). But you’re a comparatively small site, and so you have to buy the replication system in a future budget.

Then your datacentre catches on fire. Good luck with grabbing your array or cluster of backup servers and running out the building with them. On the other hand, that nearby company that also caught fire but stuck with tape had their administrator snatch last night’s backup tape out of the library and run out of their building.

Sure, the follow up response is that you should have replicated VTLs or replicated arrays or replicated dedupe clusters, etc., but it’s not uncommon to see smaller sites buy into the “tape is dead” solution and not do any replication – planning to get budget for it in, say, the second year of deployment, or when that colocation datacentre goes ahead in the “sometime later” timeframe.

4. Tapes have better offline bandwidth

Need to get a considerable amount of data (e.g., hundreds of terabytes, maybe even petabytes) from one location to another? Unless you can stream data across your links at hundreds of megabytes per second (still a while away for any reasonable corporate entity), you’re going to have better luck with sending your data via tapes rather than disks. Tapes are lighter and more compact than disks, let alone disk arrays, so your capacity per cubic metre is going to be considerably higher with tape than it is with disk.

Think I’m joking? Let’s look at the numbers. Say you’ve got a cubic metre of shipping space available, let’s see which option – tape or disk – gives you the most capacity.

An LTO cartridge is 10.2cm x 10.54cm x 2.15cm. That means in 100cm x 100cm x 100cm, you can fit 9 x 9 x 46 cartridges, which comes to a grand total of 3,726 units of media. Using LTO-5 for our calculations, that’s a native capacity of 5,589 TB per cubic metre. Of course, that’s without squeezing additional media in the remaining space, but I’ll leave that up to people with more math skills than I.

A typical 3.5″ internal form-factor drive (using the 1.5TB Seagate Barracuda drive for comparison) is 10.2cm x 14.7cm x 2.6cm. In a cubic metre, you’ll fit 9 x 6 x 38 disk drives, or 2,052 drives. Using 2TB drives (currently the highest capacity), you’ll get 4,104 TB per cubic metre.

So on the TB per cubic metre front, tape wins by almost 1,500 TB.

Looking at weight – we start to see some big differences here too. The average LTO cartridge (using LTO-4 from IBM as our “average”) is 200 grams. A cubic metre of them will be 745.2 kg. That Seagate Barracuda I quoted before, though, weighs in at 920 grams – so for a cubic metre of disk drive capacity, you’re looking at 1,887.8 kg. There’s a tonne of difference there!

Tape wins on that sort of high capacity offline data transfer without a doubt.
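If you’d like to rerun those numbers with different media, the arithmetic can be sketched in a few lines of Python. The dimensions, capacities and weights are the ones quoted above; the function names and the 24 hour courier time are my own assumptions for illustration only, and the packing is the same simple single-orientation stacking used in the text.

```python
def units_per_cubic_metre(dims_cm):
    """Count whole units that stack into a 100cm cube in one orientation."""
    count = 1
    for side in dims_cm:
        count *= int(100 // side)
    return count


def effective_gb_per_sec(capacity_tb, transit_hours):
    """Treat a shipment as a network link: decimal GB delivered per second."""
    return capacity_tb * 1000 / (transit_hours * 3600)


LTO_DIMS_CM = (10.2, 10.54, 2.15)   # LTO cartridge
DISK_DIMS_CM = (10.2, 14.7, 2.6)    # 3.5" internal drive

tapes = units_per_cubic_metre(LTO_DIMS_CM)
disks = units_per_cubic_metre(DISK_DIMS_CM)

tape_tb = tapes * 1.5               # LTO-5, 1.5TB native per cartridge
disk_tb = disks * 2                 # 2TB per drive

tape_kg = tapes * 200 / 1000        # 200 grams per cartridge
disk_kg = disks * 920 / 1000        # 920 grams per drive

print(f"Tape: {tapes} cartridges, {tape_tb:,.0f} TB, {tape_kg:,.2f} kg")
print(f"Disk: {disks} drives, {disk_tb:,.0f} TB, {disk_kg:,.2f} kg")

# Hypothetical: the cubic metre of tape takes 24 hours door to door.
print(f"Tape shipment: {effective_gb_per_sec(tape_tb, 24):.1f} GB/s effective")
```

Even with a full day in transit, the cubic metre of tape delivers tens of gigabytes per second, which is the point of “offline bandwidth”.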

5. Storage capacity of a tape system is not limited by physical datacentre footprint

If you’ve got a disk array, there’s an absolute limit to how much data you can store in it that (as much as anything) is determined by its physical footprint. If you fill it and need to add more storage, you need to expand its footprint.

Tape? Remove some cartridges, put some more in. Your offline physical footprint will grow of course – but if we’re talking datacentres, we’re talking a real, tangible cost per cubic metre of space. Your tape library of course will occupy a certain amount of space, but its storage capabilities are practically limitless regardless of its size, since all you have to do is pull full media out and replace it with empty media. Offline storage space will usually cost up to an order of magnitude less than datacentre space, so disk arrays just can’t keep up on this front.

Reasons 6 through 10 will be published soon.

Aside – Top Stories for February

Mar 01 2010
 

Close enough together that I have to declare them a tie, the top stories for February were:

It’s fair to say that Carry a jukebox with you is remaining a big hit all the time – a bit like the “NSR peer information” story, and so February will be the last month that it gets included in consideration for top articles.

Towards the end of the month, with the release of NetWorker 7.5 SP2, there was quite a lot of interest in the articles “NetWorker 7.5.2 released” and “NetWorker 7.5.2 – What’s it got?“. Obviously if you’ve got Windows 2008 or Windows 7 clients that you need to back up, 7.5 SP2 is almost a no-brainer – you’ll really need to be using it. So far, based on my testing on Linux, 7.5 SP2 is looking fairly good for that platform too. As always, everyone should read the release notes before deciding whether to upgrade their environments.

Basics – VTLs and default media sizes

Feb 27 2010
 

While this is pertinent to all versions of NetWorker, it seems particularly worth mentioning now since, as of 7.5.2, we’re seeing revised messaging from NetWorker when a tape becomes prematurely full. These new messages state:

nsrd media notice: LTO Ultrium-4 tape 800814L4 used 2039 MB of 800 GB capacity
nsrd media notice: NetWorker media: (Warning) 800814L4 marked full prematurely.
  Verify possible error on the device /dev/nst4, advertised capacity is 800 GB
  marked full at 2039 MB

Now, it’s worth noting here that normally, if a tape fills up so soon, that probably means there is an issue, and this version of the message, while only subtly different, is certainly more informative, and that is a good thing. When we consider VTLs, however, it’s a different story. In a virtual tape library, we normally want to use much smaller media sizes than those of the drive type we’re configured for. That way you’re writing virtual volumes that are 50GB or 100GB rather than 800GB. In my case, referring to the above, my lab VTL uses virtual media sizes of 1GB (with compression).

So, how do you go about this? Well, it’s easiest to accomplish when you first set up the environment. You need to change the “Volume default capacity” of each virtual device to suit the allocated media sizes. To do this, in NMC turn on View->Diagnostic Mode, then when viewing device properties, enter the appropriate size in gigabytes (followed by “G” or “GB”) in the “Volume default capacity” field of the Configuration tab, shown below:

Changing default volume size

Now, if you can do that on your VTL devices before you start labelling volumes, you’re done and dusted. However, if you’ve previously labelled your media, you either have to relabel the currently blank virtual media or wait until NetWorker gets around to recycling the currently used media.

You can query mminfo to see what the default capacity is registered at – e.g.,

[root@tara ~]# mminfo -m
state volume                  written  (%)  expires     read mounts capacity
800801L4                2254 MB full 02/26/2011   0 KB     5    800 GB
800802L4                   0 KB   0%     undef    0 KB     5   1000 MB
800804L4                   0 KB   0%     undef    0 KB     5   1000 MB
800805L4                   0 KB   0%     undef    0 KB     3    800 GB
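If you have a lot of virtual media, you may not want to eyeball that listing. Here is a rough sketch that picks out the volumes still registered at the drive’s 800 GB default and splits them into blank volumes (which can be relabelled now) and used volumes (which have to wait for recycling). It assumes the report has exactly the column layout shown above, with fields counted from the right so an optional state flag doesn’t matter; the function name and the action labels are hypothetical.

```python
SAMPLE = """\
 state volume                  written  (%)  expires     read mounts capacity
       800801L4                2254 MB full 02/26/2011   0 KB     5    800 GB
       800802L4                   0 KB   0%     undef    0 KB     5   1000 MB
       800804L4                   0 KB   0%     undef    0 KB     5   1000 MB
       800805L4                   0 KB   0%     undef    0 KB     3    800 GB
"""


def classify_volumes(report, default_capacity="800 GB"):
    """Return (volume, action) pairs for volumes still at the drive default.

    Assumes each data line ends with these ten whitespace-separated tokens:
    volume, written value + unit, (%), expires, read value + unit,
    mounts, capacity value + unit.
    """
    results = []
    for line in report.splitlines():
        tokens = line.split()
        if len(tokens) < 10:
            continue  # header or blank line
        volume = tokens[-10]
        written_value = tokens[-9]
        capacity = f"{tokens[-2]} {tokens[-1]}"
        if capacity != default_capacity:
            continue  # already labelled with the smaller capacity
        action = "relabel now" if written_value == "0" else "wait for recycle"
        results.append((volume, action))
    return results


for volume, action in classify_volumes(SAMPLE):
    print(f"{volume}: {action}")
```

Against the sample above, that flags 800801L4 as one to wait for and 800805L4 as one to relabel straight away.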

Now, what effect does this have on how much you can write to the volumes? The short answer is none. All you’re doing is adjusting the default capacity assigned to new volumes that are labelled in these (virtual) tape drives – and we can see what happens when NetWorker breaches the default volume capacity all the time in relation to physical tape – it just keeps writing until it hits end of physical tape. Nothing more, nothing less. So this means when you fill up your virtual media, NetWorker doesn’t complain at all:

nsrd media notice: LTO Ultrium-4 tape 800802L4 on /dev/nst3 is full
nsrd media notice: LTO Ultrium-4 tape 800802L4 used 2793 MB of 1000 MB capacity
nsrd media info: WORM capable for device /dev/nst3 has been set

Is this something you must do? Well, no, not technically. However, remembering that I advocate a zero error policy, the above is something I’d definitely strongly recommend for virtual devices. Doing so will eliminate what would otherwise be false errors on the virtual tapes within the NetWorker daemon logs. That means if you have to search for media issues, or refer your daemon logs to your support provider for analysis, they won’t be seeing bunches of “tape filled prematurely” issues.
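To get a feel for how much noise those false errors add, a sketch like the following counts the premature-full warnings per volume. It assumes the log lines look like the messages quoted earlier in this article; real daemon log entries may carry timestamps or other prefixes, which the pattern ignores. The function name is mine, not anything shipped with NetWorker.

```python
import re
from collections import Counter

SAMPLE_LOG = """\
nsrd media notice: LTO Ultrium-4 tape 800814L4 used 2039 MB of 800 GB capacity
nsrd media notice: NetWorker media: (Warning) 800814L4 marked full prematurely.
nsrd media notice: LTO Ultrium-4 tape 800802L4 on /dev/nst3 is full
nsrd media notice: LTO Ultrium-4 tape 800802L4 used 2793 MB of 1000 MB capacity
"""

# Matches the warning text shown in the 7.5.2 message and captures the volume.
PREMATURE = re.compile(r"\(Warning\)\s+(\S+)\s+marked full prematurely")


def premature_full_counts(log_text):
    """Return {volume: number of 'marked full prematurely' warnings}."""
    counts = Counter()
    for line in log_text.splitlines():
        match = PREMATURE.search(line)
        if match:
            counts[match.group(1)] += 1
    return dict(counts)


print(premature_full_counts(SAMPLE_LOG))
```

Only 800814L4, still at the 800 GB default, shows up; 800802L4, labelled with the smaller capacity, fills without raising a warning.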
