Data Domain: VTL vs Boost

If you’re looking at deploying Data Domain with NetWorker, but are coming from a physical tape or even an alternate virtual tape environment, you may look at the systems and think: “Should I use VTL or Boost?”

These days the answer is actually fairly simple: use Boost as much as possible.

If this were a True/False exam, that’d be the end of the article, but I think if you’re not sure what to do, you’ll want to see my working – how I got to that point.

The reasons are multiple, and I’ll break them down as follows:

  • Concurrency;
  • Cloning Integration;
  • Recoverability;
  • Reporting.

Once I’ve gone through them, you’ll see that each of those items on their own would likely represent a sufficient reason to choose Boost over VTL.

Concurrency

As I’ve said in the past, and as is well known, a virtual tape library will emulate all the features of a tape library, good and bad. This is perhaps most obvious when it comes to concurrency – that is, simultaneous read and write operations.

Let’s consider the difference between a VTL and a Boost configuration. We’ll start with the Boost configuration. We won’t worry too much about the actual capacity of the units, but let’s assume they’re the same.

From a Boost perspective, you might present this to your backup server as 4 Data Domain Boost folders. For standard Boost devices, this will see target sessions set to 4 per folder and max sessions set to 32. That’s a total in a default configuration of 128 streams going to the Data Domain. However, NetWorker doesn’t count recovery streams in that session count, so that’s a total of 128 maximum write streams, and (theoretically) as many recovery streams as you wanted to run.

If we consider it from a VTL perspective, it gets a little tricky. The primary reason for that is that the way NetWorker multiplexes for tape is fundamentally incompatible with deduplication, regardless of whether you’re writing to a Data Domain, Quantum, HP or any other deduplication device. That means each virtual tape drive you create has to be limited to a max sessions count of … one.

That means, if you wanted to write 32 save sets simultaneously to the Data Domain VTL you’d have to define 32 virtual tape drives. Or if you wanted to match the max sessions defined by the Boost variant, you’d be defining 128 virtual tape drives.

With virtual tape drives of course, emulating the best and the worst of tape, each of those virtual tape drives can either read or write, but not both (at least not concurrently). So if you wanted to ‘guarantee’ being able to run a recovery at the same time as a backup, you’d need to define even more virtual tape drives, and make them read-only.

But … you’re not guaranteeing that you can simultaneously backup and recover at any time whatsoever. If you’re using Boost devices, you are – a backup can be running to a Boost device and you can kick off a recovery from the same device and NetWorker will run them both concurrently. In fact, NetWorker these days will run multiple backups, multiple restores, multiple clones and multiple stage operations all concurrently to/from the same Boost device. With a VTL, if you need to recover or clone from a virtual tape that is currently being written to, you’ll need to wait until the tape fills or the backup finishes. Even limiting virtual tapes to 50GB or 100GB still creates a pause period that cannot be avoided.

But But … even if you decide all of those limitations are OK, are you prepared to spend the money required to license all of those virtual tape drives that are required? Ahah, you might think – “with a VTL I get an “unlimited” autochanger license, and that covers everything” … not quite right. You see, the Unlimited autochanger license covers you for an unlimited number of slots in your VTL, but NetWorker licenses drive count by the server license + the number of storage node licenses. For a Power Edition server you’ll be automatically licensed for 32 devices on the server, and for any form of storage node, you get 16 devices per storage node. For a Network Edition server, the server device count is just 16. Now, you can stack on storage node licenses without actually creating storage nodes and increase the device count, but let’s go back to that comparison again for a moment … 128 virtual tape drives.

If you’re using NetWorker, Network Edition, that’s:

  • First 16 tape drives – Server License
  • Next 16 tape drives – Storage Node License 1
  • Next 16 tape drives – Storage Node License 2
  • Next 16 tape drives – Storage Node License 3
  • Next 16 tape drives – Storage Node License 4
  • Next 16 tape drives – Storage Node License 5
  • Next 16 tape drives – Storage Node License 6
  • Next 16 tape drives – Storage Node License 7

That’s a lot of storage node licenses, just to license 128 tape drives. Presumably in a VTL configuration you’re either going to have a second VTL to replicate to, or a physical tape library, and you’ll need storage node device counts to cover those devices too.

Phew! If that’s not enough of a reason to go down the path of Boost, then…

Cloning Integration

Let’s say you’ve got two Data Domain systems at different sites, and you want to replicate the data between them to protect against site failure, etc.

If you’re using VTL, you effectively have two options:

  • Use NetWorker to do cloning between the two Data Domains;
  • Use Data Domain replication on the VTL storage folder.

Both of those options, to use the technical term, have considerable “suck” problems.

For the first option – NetWorker cloning – you have to keep in mind that NetWorker doesn’t see the VTL as a deduplication device. Therefore, NetWorker’s going to execute a complete data read from each virtual tape it needs to clone from, and the Data Domain will faithfully rehydrate all the data, allowing NetWorker to send it across the network to be deduplicated and stored on the other host.

However, if you use the second option, then the clone is a perfect volume replica, and if there’s one thing NetWorker doesn’t like, it’s seeing the same volume, with the same volume label and volume ID, in more than one location. Basically you have to keep the media you clone this way out of the virtual library on the remote site. If you want to use it for a recovery, you have to first make the media invisible to NetWorker on the primary site (e.g., via a library export operation), before you can import it within NetWorker. That becomes a tedious task of:

  • Primary site: Export the media to the VTL CAP
  • Secondary site: Use the VTL management software to move the media to the VTL CAP
  • Secondary site: Import the media from the VTL CAP
  • Secondary site: Use media
  • Secondary site: Export the media to the VTL CAP
  • Secondary site: Use the VTL management software to remove the media from the VTL CAP
  • Primary site: Import the media from the VTL CAP

If you compare that to NetWorker cloning controlled Boost replication, you’ll see there’s no comparison at all. NetWorker initiates the clone operation, which triggers, at the back end, a replication job between the two Data Domains. The replica copy on the secondary Data Domain is a fully registered NetWorker copy/clone and accessible to NetWorker while the primary copy is still visible. What’s more, with NetWorker 8.1, you can turn on immediate cloning, whereby as each saveset in a group is finished, NetWorker initiates a Boost Clone of that saveset, shrinking your clone windows considerably.

Phew! If that’s not enough of a reason to go down the path of Boost, then…

Recoverability

The first obvious comment about recoverability is a repeat of the point made in Concurrency. Boost will allow you to run recover sessions at the same time as you’re running backup sessions to the same volumes, and therefore you’re (pun intended) Boosting recoverability within your environment.

But it’s more than that. Some of the advanced recoverability features in NetWorker require disk backup. Virtual tape, for all its use in overcoming architectural problems with physical tape, emulates tape closely enough that NetWorker considers it to be sequential access, and a sequential access device just doesn’t give the flexibility required for the advanced recovery features.

What advanced recovery features am I referring to?

  • Granular recovery from NMM (Exchange, SharePoint, etc.) – This functionality allows you to ‘mount’ a copy of the backup within the environment without actually completing a full restore, and then pull back the individual items you want. If you’re backing up to VTL, you don’t have this option available to you.
  • Block Level Backup in Windows 2012 – Recovery from block level backup similarly does some trickery regarding pseudo-mounting the filesystem for recovery from backup.
  • Instant-On VMware Recovery – OK, this isn’t available in NetWorker, but given its introduction in Avamar, you’d have to think it’s a highly likely contender for availability in NetWorker, and you can bet your next cup of coffee that it’s not going to be available from physical or virtual tape.

Phew! If that’s not enough reason to go down the path of Boost, then…

Reporting

What can I say? I’m in love with the reporting integration between Boost and NetWorker. With a Boost device integrated as a NetWorker target, NetWorker will provide you per client, per client filesystem, and per backup per client filesystem statistics on deduplication:

DD Boost drill down report 1

DD Boost drill down report 2

Of course, you also get overall statistics, such as available capacity on the Data Domain – but in a deduplication environment, being able to drill down to that level of detail on the deduplication statistics on clients and filesystems is an absolute boon.

If you’re using tape, the most you can do is get a report on the Data Domain of the deduplication ratios for each virtual tape. That’s not really useful when it comes to monitoring deduplication performance within the environment.

…but, Fibre vs Ethernet

It used to be that a compelling reason to use VTL over Boost was if you had a site where there’d been heavy investment in the fibre-channel network, but less so in the ethernet network. I.e., you might have gigabit networking for IP, but 8 gigabit for fibre-channel.

However, with NetWorker 8.1, even that reason has been addressed – from 8.1 onwards, NetWorker supports Boost devices over fibre-channel.

Boost your Backups

So there you have it – 4 sets of reasons as to why you’re better off using Boost instead of VTL with Data Domain – and a fifth bonus reason thrown in as well.

So go forth and Boost!

2 thoughts on “Data Domain: VTL vs Boost”

  1. Nice article summing up the differences between DDVTL and DDBoost options, however two facts might need updating:
    – DDBoost Devices (folders) default to 6 Target Sessions and 60 Maximum Sessions on creation;
    – DDBoost over FC (DFC) is currently only limited to Windows (no 2012 as of yet) and Linux platforms, but hopefully soon cross-platform all the way.

    Cheers!

    1. Hi,

      Thanks for the corrections – I vaguely remembered the lack of Windows 2012 support for Boost over FC, but I hadn’t realised about the target sessions/max sessions increase!

      Cheers,
      Preston.

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.