Hypervisor Direct – Convergent Data Protection

Oct 10, 2017

At VMworld, DellEMC announced a new backup technology for virtual machines called Hypervisor Direct, which represents a paradigm that I’d refer to as “convergent data protection”, since it mixes layers of data protection to deliver optimal results.

First, I want to get this out of the way: hypervisor direct is not a NetWorker plugin, nor an Avamar plugin. Instead, it’s part of the broader Data Protection Suite package (a good reminder that there are great benefits in the DPS licensing model).

As its name suggests, hypervisor direct is about moving hypervisor backups directly onto protection storage without a primary backup package being involved. This fits under the same model available for Boost Plugins for Databases – centralised protection storage with decentralised access allowing subject matter experts (e.g., database and application administrators) to be in control of their backup processes.

Now, VMware backups are great, but there’s a catch. If you integrate with VMware’s snapshot layer, there’s always a risk of virtual machine stun. The ‘stun’ we’re referring to happens when the data accumulated in the snapshot delta files is applied back to the virtual machine once the snapshot is released. (Hint: if someone tries to tell you otherwise, make like Dorothy in The Wizard of Oz and look behind the curtain, because there’s no wizard there.) Within NetWorker and Avamar, we reduce the risk of virtual machine stun significantly by doing optimised backups:

  • Leveraging changed block tracking to only need to access the parts of the virtual machine that have changed since the last backup
  • Using source based deduplication to minimise the amount of data that needs to be sent to protection storage

Those two techniques combined will give you seamless virtual machine backups in almost all situations – in fact, 90% or more. But, as the old saying goes (I may be making this saying up, bear with me) – it’s that last 10% that’ll really hurt you. In fact, there are two scenarios that’ll cause virtual machine stun:

  • Inadequate storage performance
  • High virtual machine change rates

In the case of the first scenario, it’s possible to run virtual machines on storage that doesn’t meet their performance requirements. This is particularly so when people are pointing older or under-spec NAS appliances at their virtual machine farm. Now, that may not have a significant impact on day to day operations (other than a bit of user grumbling), but it will be noticed during the snapshot processes around virtual machine backup. Ideally, we want to avoid the first scenario by always having appropriately performing storage for a virtual infrastructure.

Now, the second scenario, that’s more interesting. That’s the “10% that’ll really hurt you”. That’s where a virtualised Oracle or SQL database is 5-10TB with a 40-50% daily change rate. That size, and that change rate will smash you into virtual machine stun territory every time.

Traditionally, the way around that has been one (or both) of two data protection strategies:

  • LUN or array based replication, ignoring the virtual machine layer entirely. That’s good for a secondary copy but it’s going to be at best crash consistent. (It’s also going to be entirely storage dependent – locking you into a vendor and making refreshes more expensive/complex – and will lock you out of technology like vVOL and vSAN.)
  • In-guest agents. That’ll give you your backup, but it’ll be at agent-based performance levels creating additional workload stresses on the virtual machine and the ESX environment. And if we’re talking a multi-TB database with a high change rate – well, that’s not necessarily a good thing to do.

So what’s the way around it? How can you protect those sorts of environments without locking yourself into a storage platform, or preventing yourself from making architectural changes to your overall environment?

You get around it by being a vendor that has a complete continuum of data protection products and creating a convergent data protection solution. That’s what hypervisor direct does.

Hypervisor Direct

Hypervisor direct merges the Boost-direct technology you get in DDBEA and ProtectPoint with RecoverPoint for Virtual Machines (RP4VM). By integrating the backup process via the Continuous Data Protection (CDP) functionality of RP4VM, we don’t need to take snapshots using VMware at all. That’s right: you can’t get virtual machine stun, even in large virtual machines with high IO, because we don’t work at that layer. Instead, leveraging the ESXi write splitter technology in RP4VM’s CDP, the RecoverPoint journal system is used to allow a virtual machine backup to be taken, direct to Data Domain, without impact to the source virtual machine.

Do you want to know the really cool feature of this? It’s application consistent, too. That 5-10TB Oracle or SQL database with a high change rate I was talking about earlier? Well, your DBA or Application Administrator gets to run their normal Oracle RMAN backup script for a standard backup, and everything is done at the back-end. That’s right, the Oracle backup or SQL backup (or a host of other databases) triggers the appropriate virtual machine copy functions automatically. (And if a particular database isn’t integrated, there’s still filesystem integration hooks to allow a two-step process.)
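
To make that concrete, here’s a minimal sketch of the sort of everyday RMAN script I’m talking about – the script and names are purely illustrative, and the point is that nothing in it is hypervisor direct specific:

#!/bin/sh
# Illustrative only: a DBA's ordinary RMAN backup script. With hypervisor
# direct in place, running this triggers the application-consistent virtual
# machine copy to Data Domain at the back end - no script changes required.
rman target / <<EOF
RUN {
  BACKUP DATABASE PLUS ARCHIVELOG;
}
EOF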

This isn’t an incremental improvement to backup options, this is an absolute leapfrog – it’s about enabling efficient, high performance backups in situations where previously there was no actual option available. And it still lets your subject matter experts be involved in the backup process as well.

If you do have virtual machines that fall into this category, reach out to your local DellEMC DPS team for more details. You can also check out some of the official details here.

Data Domain Updates

Oct 18, 2016

I was on annual leave last week (and this week I find myself in Santa Clara).

Needless to say, big announcements often seem to neatly align with when I go on annual leave, and last week was no different – it saw the release of a new set of Data Domain systems, DDOS 6.0, and the new Cloud Tier functionality.


Now, I know this is a NetWorker blog, but if repeated surveys have shown one consistent thing, it’s that a vast majority of NetWorker environments now have Data Domain in them, and for many good reasons.

You can find the official press release over at Dell EMC, but I’ll dig into a few of the pertinent details.

New Models

The new models that have been released are the 6300, 6800, 9300 and 9800. The key features of the new models are as follows:

  • Data Domain 6300
    • Max throughput per hour using Boost – 24 TB/hr
    • Max usable capacity – 178 TB
  • Data Domain 6800
    • Max throughput per hour using Boost – 32 TB/hr
    • Max usable capacity (active tier) – 288 TB
    • Max addressable Cloud Tier – 576 TB
    • Max total addressable (active + Cloud) – 864 TB
  • Data Domain 9300
    • Max throughput per hour using Boost – 41 TB/hr
    • Max usable capacity (active tier) – 720 TB
    • Max addressable Cloud Tier – 1,440 TB
    • Max total addressable (active + Cloud) – 2,160 TB
  • Data Domain 9800
    • Max throughput per hour using Boost – 68 TB/hr
    • Max usable capacity (active tier) – 1 PB
    • Max addressable Cloud Tier – 2 PB
    • Max total addressable (active + Cloud) – 3 PB

Those sizes, of course, refer to actual physical storage – once deduplication comes into play, your logical stored capacity can be considerably higher than the above. (As a simple worked example: at a conservative 10:1 deduplication ratio, the 6300’s 178 TB usable equates to roughly 1.78 PB of logical backup storage.)

All the models above introduce flash as part of the storage platform for metadata. (If you’re wondering where this will be handy, have a think about Instant Access, the feature where we can power up a Virtual Machine directly from its backup on the Data Domain.)

High Availability was previously only available on the DD9500 – it’s now available on the 6800, 9300, 9500 and 9800, making that extra level of data protection availability accessible to more businesses than ever.

DDOS 6

DDOS 6 is a big release, including the following new features:

  • Cloud Tier (more of that covered further on)
  • Boost FS Plugin – Allows a Linux host with an NFS mount from a DDOS 6 system to participate in Boost, reducing the amount of data that has to be sent over the filesystem mount to Data Domain storage
  • Enhancements to Secure Multi-Tenancy
  • Improvements to garbage collection/filesystem cleaning (and remember, it’s still something that can be run while other operations are taking place!)
  • Improvements to replication performance, speeding up virtual synthetic replication further
  • Support for ProtectPoint on systems with extended retention
  • Support for ProtectPoint on high availability systems
  • New minimally disruptive upgrades – starting in this release, individual software components will be able to be upgraded without full system reboots (unless otherwise required). This will reduce downtime requirements for upgrades and allow for more incremental approaches to upgrades.
  • Client groups – manage/monitor client activity and workflows for collections of clients, either at the Boost or NFS level. This includes being able to set hard/soft limits on stream counts, reporting on client group activities, and logging by client group. You can have up to 64 client groups per platform. (I can see every Data Domain administrator where DBAs are using DDBoost wanting to upgrade for this feature.)

Cloud Tier

Cloud Tier allows the Data Domain system to directly interface with compatible object storage systems and is primarily targeted for handling long term retention workloads. Data lands on the active tier still, but policies can be established to push that data you’re retaining for your long term retention out to cloud/object storage. While it supports storage such as Amazon and Azure, the real cost savings actually come in when you consider using it with Elastic Cloud Storage (ECS). (I’ve already been involved in deals where we’ve easily shown a 3-year TCO being substantially cheaper for a customer on ECS than Amazon S3+Glacier.)
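
In practice, that policy configuration lives on the Data Domain itself. Working from memory of the DDOS 6.0 command set – so treat the exact syntax as an assumption and verify against the command reference for your release, with the cloud unit and MTree names below being placeholders – it looks something like:

# data-movement policy set to-tier cloud cloud-unit ecs-unit1 age-threshold 90 mtrees /data/col1/ltr-backups
# data-movement status

That is: anything in the nominated MTree older than the age threshold becomes a candidate for movement out to the cloud unit.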

But hang on, you might be asking – what about CloudBoost? Well, CloudBoost is still around and it still has a variety of use cases, but Cloud Tier is about having the Data Domain do the movement of data automatically – and without any need to rehydrate outgoing data. It’s also ideal for mixed access workloads.


Cloud Tier enables a Data Domain to address up to twice its maximum active tier capacity in object storage, drastically increasing the overall logical amount of data that can be stored on a model by model basis. And because the data pushed out to object storage is already deduplicated, the processing time for data movement is kept to a minimum.

Summary/Wrapping Up

It was a big week for Data Domain, and DDOS 6 is setting the bar for deduplication systems – as well as laying the groundwork for even more enhancements over time.

(On a side note – apologies for the delay in posts. Leading up to taking that week off I was swamped.)

Mar 9, 2016

I’ve been working with backups for 20 years, and if there’s been one constant across those 20 years, it’s that application owners (i.e., DBAs) have traditionally been reluctant to have other people (i.e., backup administrators) in control of the backup process for their databases. This leads to some environments where the DBAs maintain control of their backups, and others where the backup administrators maintain control of the database backups.


So the question that many people end up asking is: which way is the right way? The answer, in reality, is a little fuzzy: it depends.

When we were primarily backing up to tape, there was a strong argument for backup administrators to be in control of the process. Tape drives were a rare commodity needing to be used by a plethora of systems in a backup environment, and with big demands placed on them. The sensible approach was to fold all database backups into a common backup scheduling system so resources could be apportioned efficiently and fairly.

[Diagram: Traditional backups to tape via a backup server]

With limited tape resources and a variety of systems to protect, backup administrators needed to exert reasonably strong controls over what backed up when, and so in a number of organisations it was common to have database backups controlled within the backup product (e.g., NetWorker), with scheduling negotiated between the backup and database administrators. Where such processes have been established, they often continue – backups are, of course, a reasonably habitual process (and for good cause).

For some businesses though, DBAs might feel there was not enough control over the backup process – a view that might well be justified by the mission criticality of the applications running on top of the database, or by the perceived licensing costs associated with using a plugin or module from the backup product to back up the database. So in these situations, if a tape library or drives weren’t allocated directly to the database, the “dump and sweep” approach became quite common, viz.:

[Diagram: Dump and sweep]

One of the most pervasive results of the “dump and sweep” methodology, however, is the amount of primary storage it uses. Because disk is so much faster than tape, database administrators would often get significantly larger areas of storage – particularly as storage became cheaper – to conduct their dumps to. Instead of one or two days, it became increasingly common to have anywhere from 3-5 days of database dumps sitting on primary storage, being swept up nightly by a filesystem backup agent.

Dump and sweep of course poses problems: in addition to needing large amounts of primary storage, the first backup for the database is on-platform – there’s no physical separation. That means the timing of getting the database backup completed before the filesystem sweep starts is critical. However, the timing for the dump is controlled by the DBA and dependent on the database load and the size of the database, whereas the timing of the filesystem backup is controlled by the backup administrator. This saw many environments spring up where, over time, the database grew to a point where its dump wouldn’t get an off-platform backup for 24 hours – not until the next filesystem backup happened. (E.g., a dump originally taking an hour to complete would be started at 19:00. The backup administrators would start the filesystem backup at 20:30, but over time the database dump would grow and wouldn’t complete until, say, 21:00. The net result could be a partial or failed backup of the dump files the first night, with the second night being the first successful backup of the dump.)
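
To see that race in scheduling terms, here’s a minimal sketch (times, paths and script names all hypothetical):

# DBA's crontab: dump starts at 19:00 - once took ~1 hour, now closer to two
0 19 * * * /home/oracle/scripts/nightly_dump.sh /backup/dumps

# Backup administrator's schedule: filesystem sweep of /backup from 20:30.
# If the dump is still being written at 20:30, tonight's sweep captures a
# partial dump - the first good off-platform copy is tomorrow night's.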

Over time backup to disk entered popularity to overcome the overnight operational challenges of tape, and the market eventually expanded to include deduplication storage, purpose built backup appliances and even what I’d normally consider to be integrated data protection appliances – ones where the intelligence (e.g., deduplication functionality) is extended out from the appliance to the individual systems being protected. That’s what we get, for instance, with Data Domain: the Boost functionality embedded in APIs on the client systems leverages distributed segment processing to have everything being backed up participate in its own deduplication. The net result is one that scales better than the traditional 3-tier “client/server/{media server|storage node}” environment, because we’re scaling where it matters: out at the hosts being protected and up at protection storage, rather than adding a series of servers in the middle to manage bottlenecks. (I.e., we remove the bottlenecks.)

Even as large percentages of businesses switched to deduplicated storage – Data Domains mostly from a NetWorker perspective – and had the capability of leveraging distributed deduplication processes to speed up the backups, that legacy “dump and sweep” approach, if it had been in the business, often remained in the business.

We’re far enough into this now that I can revisit the two key schools of thought within data protection:

  • Backup administrators should schedule and control backups regardless of the application being backed up
  • Subject Matter Experts (SMEs) should have some control over their application backup process because they usually deeply understand how the business functions leveraging the application work

I’d suggest that the smaller the business, the more correct the first option is – and in environments where DBAs are contracted or outsourced in particular, having the backup administrator in charge of the backup process is probably more important to the business. But that creates a requirement for the backup administrator to know the ins and outs of backing up and recovering the application/database almost as deeply as a DBA themselves.

As businesses grow in size and as the number of mission critical systems sitting on top of databases/applications grow, there’s equally a strong opinion the second argument is correct: the SMEs need to be intimately involved in the backup and recovery process. Perhaps even more so, in a larger backup environment, you don’t want your backup administrators to actually be bottlenecks in a disaster situation (and they’d usually agree to this as well – it’s too stressful).

With centralised disk based protection storage – particularly deduplicating protection storage – we can actually get the best of both worlds now though. The backup administrators can be in control of the protection storage and set broad guidance on data protection at an architectural and policy level for much of the environment, but the DBAs can leverage that same protection storage and fold their backups into the overall requirements of their application. (This might be to even leverage third party job control systems to only trigger backups once batch jobs or data warehousing tasks have completed.)
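
As a quick sketch of what that job-control integration can look like (script names hypothetical), the backup simply becomes the last step in the batch chain rather than an event fired at a fixed time:

#!/bin/sh
# Hypothetical final step in a batch/data warehousing job chain - the
# scheduler invokes this only once the ETL jobs have completed, so the
# backup runs when it's the right time, not at a predetermined time.
/opt/batch/run_etl_jobs.sh && /opt/dba/backup_to_boost.sh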

[Diagram: Backup process with Data Domain and backup server]

That particular flow is great for businesses that have maintained centralised control over the backup process of databases and applications, but what about those where dump and sweep has been the design principle, and there’s a desire to keep a strong form of independence on the backup process, or where the overriding business goal is to absolutely limit the number of systems database administrators need to learn so they can focus on their job? They’re definitely legitimate approaches – particularly so in larger environments with more mission critical systems.

That’s why there’s the Data Domain Boost plugins for Applications and Databases – covering SAP, DB2, Oracle, SQL Server, etc. That gives a slightly different architecture, viz.:

[Diagram: Database backups with the Boost plugin]

In that model, the backup server (e.g., NetWorker) still controls and coordinates the majority of the backups in the environment, but the Boost Plugin for Databases/Applications is used on the database servers instead to allow complete integration between the DBA tools and the backup process.

So returning to the initial question – which way is right?

Well, that comes down to the real question: which way is right for your business? Pull any emotion or personal preferences out of the question and look at the real architectural requirements of the business, particularly relating to mission critical applications. Which way is the right way? Only your business can decide.

Here’s a thought I’ll leave you with though: there are two critical components to being able to make the choice completely based on business requirements:

  • You need centralised protection storage where there aren’t the traditional (tape-inherited) limitations on concurrent device access
  • You need a data protection framework approach rather than a data protection monolith approach

The former allows you to make decisions without being impeded by arbitrary practical/physical limitations (e.g., “I can’t read from a tape and write to it at the same time”), and more importantly, the latter lets you build an adaptive data protection strategy using best of breed components at the different layers rather than squeezing everything into one box and making compromises at every step of the way. (NetWorker, as I’ve mentioned before, is a framework based backup product – but I’m talking more broadly here: framework based data protection environments.)

Happy choosing!

Dec 22, 2015

As we approach the end of 2015 I wanted to spend a bit of time reflecting on some of the data protection enhancements we’ve seen over the year. There’s certainly been a lot!

Protection

NetWorker 9

NetWorker 9 was of course a big part of the changes in the data protection landscape in 2015, but it’s by no means the only advancement we saw. I covered some of the advances in NetWorker 9 in my initial post about it (NetWorker 9: The Future of Backup), but to summarise just a few of the key new features, we saw:

  • A policy based engine that unites backup, cloning, snapshot management and protection of virtualisation into a single, easy to understand configuration. Data protection activities in NetWorker can be fully aligned to service catalogue requirements, and the easier configuration engine actually extends the power of NetWorker by offering more complex configuration options.
  • Block based backups for Linux filesystems – speeding up backups for highly dense filesystems considerably.
  • Block based backups for Exchange, SQL Server, Hyper-V, and so on – NMM for NetWorker 9 is a block based backup engine. There’s a whole swathe of enhancements in NMM version 9, but the 3-4x backup performance improvement has to be a big win for organisations struggling against existing backup windows.
  • Enhanced snapshot management – I was speaking to a customer only a few days ago about NSM (NetWorker Snapshot Management), and his reaction to NSM was palpable. Wrapping NAS snapshots into an effective and coordinated data protection policy with the backup software orchestrating the whole process from snapshot creation, rollover to backup media and expiration just makes sense as the conventional data storage protection and backup/recovery activities continue to converge.
  • ProtectPoint Integration – I’ll get to ProtectPoint a little further below, but being able to manage ProtectPoint processes in the same way NSM manages file-based snapshots will be a big win as well for those customers who need ProtectPoint.
  • And more! – VBA enhancements (notably the native HTML5 interface and a CLI for Linux), NetWorker Virtual Edition (NVE), dynamic parallel savestreams, NMDA enhancements, restricted datazones and scaleability all got a boost in NetWorker 9.

It’s difficult to summarise everything that came in NetWorker 9 in so few words, so if you’ve not read it yet, be sure to check out my essay-length ‘summary’ of it referenced above.

ProtectPoint

In the world of mission critical databases where impact minimisation on the application host is a must yet backup performance is equally a must, ProtectPoint is an absolute game changer. To quote Alyanna Ilyadis, when it comes to those really important databases within a business,

“Ideally, you’d want the performance of a snapshot, with the functionality of a backup.”

Think about the real bottleneck in a mission critical database backup: the data gets transferred (even best case) via fibre-channel from the storage layer to the application/database layer before being passed across to the data protection storage. Even if you direct-attach data protection storage to the application server, or even if you mount a snapshot of the database at another location, you still have the fundamental requirement to:

  • Read from production storage into a server
  • Write from that server out to protection storage

ProtectPoint cuts the middle-man out of the equation. By integrating storage level snapshots with application layer control, the process effectively becomes:

  • Place database into hot backup mode
  • Trigger snapshot
  • Pull database out of hot backup mode
  • Storage system sends backup data directly to Data Domain – no server involved

That in itself is a good starting point for performance improvement – your database is only in hot backup mode for a few seconds at most. But then the real power of ProtectPoint kicks in. You see, when you first configure ProtectPoint, a block based copy from primary storage to Data Domain storage starts in the background straight away. With Change Block Tracking incorporated into ProtectPoint, the data transfer from primary to protection storage kicks into high gear – only the changes between the last copy and the current state at the time of the snapshot need to be transferred. And the Data Domain handles creation of a virtual synthetic full from each backup – full backups daily at the cost of an incremental. We’re literally seeing backup performance improvements in the order of 20x or more with ProtectPoint.

There are some great videos explaining what ProtectPoint does and the sorts of problems it solves, and even its integration into NetWorker 9.

Database and Application Agents

I’ve been in the data protection business for nigh on 20 years, and if there’s one thing that’s remained remarkably consistent throughout that time it’s that many DBAs are unwilling to give up control over the data protection configuration and scheduling for their babies.

It’s actually understandable for many organisations. In some places it’s entrenched habit, and in those situations you can integrate data protection for databases directly into the backup and recovery software. For other organisations though, there are complex scheduling requirements based on batch jobs, data warehousing activities and so on which can’t possibly be controlled by a regular backup scheduler. Those organisations need to initiate the backup job for a database not at a particular time, but when it’s the right time – and based on the amount of data or the amount of processing, that could be a highly variable time.

The traditional problem with backups for databases and applications being handled outside of the backup product is that the backup data ends up being written to primary storage, which is expensive. It’s normally more than one copy, too. I’d hazard a guess that 3-5 copies is the norm for most database backups when they’re being written to primary storage.

The Database and Application agents for Data Domain allow a business to sidestep all these problems by centralising the backups for mission critical systems onto highly protected, cost effective, deduplicated storage. The plugins work directly with each supported application (Oracle, DB2, Microsoft SQL Server, etc.) and give the DBA full control over managing the scheduling of the backups while ensuring those backups are stored under management of the data protection team. What’s more, primary storage is freed up.
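
As an illustration of what that looks like for Oracle, the DBA keeps using RMAN, just with an SBT channel provided by the plugin. This is a minimal sketch only – the library path and channel settings below are placeholders, not actual install paths:

rman target / <<EOF
RUN {
  # Hypothetical channel definition using the Boost SBT library
  # shipped with the database agent
  ALLOCATE CHANNEL c1 DEVICE TYPE SBT_TAPE
    PARMS 'SBT_LIBRARY=/path/to/dd_boost_oracle_library.so';
  BACKUP DATABASE PLUS ARCHIVELOG;
  RELEASE CHANNEL c1;
}
EOF

The scheduling of that script stays entirely in the DBA’s hands; the data lands on deduplicated, centrally managed protection storage.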

Formerly known as “Data Domain Boost for Enterprise Applications” and “Data Domain Boost for Microsoft Applications”, the Database and Application Agents respectively reached version 2 this year, enabling new options and flexibility for businesses. Don’t just take my word for it though: check out some of the videos about it here and here.

CloudBoost 2.0

CloudBoost version 1 was released last year and I’ve had many conversations with customers interested in leveraging it over time to reduce their reliance on tape for long term retention. You can read my initial overview of CloudBoost here.

2015 saw the release of CloudBoost 2.0. This significantly extends the storage capabilities for CloudBoost, introduces the option for a local cache, and adds the option for a physical appliance for businesses that would prefer to keep their data protection infrastructure physical. (You can see the tech specs for CloudBoost appliances here.)

With version 2, CloudBoost can now scale to 6PB of cloud managed long term retention, and every bit of that data pushed out to a cloud is deduplicated, compressed and encrypted for maximum protection.

Spanning

Cloud is a big topic, and a big topic within that big topic is SaaS – Software as a Service. Businesses of all types are placing core services in the Cloud to be managed by providers such as Microsoft, Google and Salesforce. Office 365 Mail is proving very popular for businesses who need enterprise class email but don’t want to run the services themselves, and Salesforce is probably the most likely mission critical SaaS application you’ll find in use in a business.

So it’s absolutely terrifying to think that SaaS providers don’t really back up your data. They protect their infrastructure from physical faults – and from their own faults – but their SLAs around data deletion are pretty straightforward: if you deleted it, they can’t tell whether it was intentional or an accident. (And if it was an intentional delete they certainly can’t tell if it was authorised or not.)

Data corruption and data deletion in SaaS applications is far too common an occurrence, and for many businesses sadly it’s only after that happens for the first time that people become aware of what those SLAs do and don’t cover them for.

Enter Spanning. Spanning integrates with the native hooks provided in Salesforce, Google Apps and Office 365 Mail/Calendar to protect the data your business relies on so heavily for day to day operations. The interface is dead simple, the pricing is straight forward, but the peace of mind is priceless. 2015 saw the introduction of Spanning for Office 365, which has already proven hugely popular, and you can see a demo of just how simple it is to use Spanning here.

Avamar 7.2

Avamar got an upgrade this year, too, jumping to version 7.2. Virtualisation got a big boost in Avamar 7.2, with new features including:

  • Support for vSphere 6
  • Scaleable up to 5,000 virtual machines and 15+ vCenters
  • Dynamic policies for automatic discovery and protection of virtual machines within subfolders
  • Automatic proxy deployment: This sees Avamar analyse the vCenter environment and recommend where to place virtual machine backup proxies for optimum efficiency. Particularly given the updated scaleability in Avamar for VMware environments taking the hassle out of proxy placement is going to save administrators a lot of time and guess-work. You can see a demo of it here.
  • Orphan snapshot discovery and remediation
  • HTML5 FLR interface

That wasn’t all though – Avamar 7.2 also introduced:

  • Enhancements to the REST API to cover tenant level reporting
  • Scheduler enhancements – you can now define the start dates for your annual, monthly and weekly backups
  • You can browse replicated data from the source Avamar server in the replica pair
  • Support for DDOS 5.6 and higher
  • Updated platform support including SLES 12, Mac OS X 10.10, Ubuntu 12.04 and 14.04, CentOS 6.5 and 7, Windows 10, VNX2e, Isilon OneFS 7.2, plus a 10Gbe NDMP accelerator

Data Domain 9500

Already the market leader in data protection storage, EMC continued to stride forward with the Data Domain 9500, a veritable beast. Some of the quick specs of the Data Domain 9500 include:

  • Up to 58.7 TB per hour (when backing up using Boost)
  • 864TB usable capacity for active tier, up to 1.7PB usable when an extended retention tier is added. That’s the actual amount of physical storage; once deduplication is factored in, that can yield protection data storage well into the multiple-PB range. The spec sheet gives some details based on a mixed environment where the logical data stored might be anywhere from 8.6PB to 86.4PB
  • Support for traditional ES30 shelves and the new DS60 shelves.

Actually it wasn’t just the Data Domain 9500 that was released this year from a DD perspective. We also saw the release of the Data Domain 2200 – the replacement for the SMB/ROBO DD160 appliance. The DD2200 supports more streams and more capacity than the previous entry-level DD160, being able to scale from a 4TB entry point to 24TB raw when expanded to 12 x 2TB drives. In short: it doesn’t matter whether you’re a small business or a huge enterprise: there’s a Data Domain model to suit your requirements.

Data Domain Dense Shelves

The traditional ES30 Data Domain shelves have 15 drives. 2015 also saw the introduction of the DS60 – dense shelves capable of holding sixty disks. With support for 4 TB drives, that means a single 5RU Data Domain DS60 shelf can hold as much as 240TB in drives.

The benefits of high density shelves include:

  • Better utilisation of rack space (60 drives in one 5RU shelf vs 60 drives in 4 x 3RU shelves – 12 RU total)
  • More efficient for cooling and power
  • Scale as required – each DS60 takes 4 x 15 drive packs, allowing you to start with just one or two packs and build your way up as your storage requirements expand

DDOS 5.7

Data Domain OS 5.7 was also released this year, and includes features such as:

  • Support for DS60 shelves
  • Support for 4TB drives
  • Support for ES30 shelves with 4TB drives (DD4500+)
  • Storage migration support – migrate those older ES20 style shelves to newer storage while the Data Domain stays online and in use
  • DDBoost over fibre-channel for Solaris
  • NPIV for FC, allowing up to 8 virtual FC ports per physical FC port
  • Active/Active or Active/Passive port failover modes for fibre-channel
  • Dynamic interface groups are now supported for managed file replication and NAT
  • More Secure Multi-Tenancy (SMT) support, including:
    • Tenant-units can be grouped together for a tenant
    • Replication integration:
      • Strict enforcing of replication to ensure source and destination tenant are the same
      • Capacity quota options for destination tenant in a replica context
      • Stream usage controls for replication on a per-tenant basis
    • Configuration wizard support for SMT
    • Hard limits for stream counts per Mtree
    • Physical Capacity Measurement (PCM) providing space utilisation reports for:
      • Files
      • Directories
      • Mtrees
      • Tenants
      • Tenant-units
  • Increased concurrent Mtree counts:
    • 256 Mtrees for Data Domain 9500
    • 128 Mtrees for each of the DD990, DD4200, DD4500 and DD7200
  • Stream count increases – DD9500 can now scale to 1,885 simultaneous incoming streams
  • Enhanced CIFS support
  • Open file replication – great for backups of large databases, etc. This allows the backup to start replicating before it’s even finished.
  • ProtectPoint for XtremIO

Data Protection Suite (DPS) for VMware

DPS for VMware is a new socket-based licensing model for mid-market businesses that are highly virtualized and want an effective enterprise-grade data protection solution. Providing Avamar, Data Protection Advisor and RecoverPoint for Virtual Machines, DPS for VMware is priced based on the number of CPU sockets (not cores) in the environment.

DPS for VMware is ideally suited for organisations that are either 100% virtualised or just have a few remaining machines that are physical. You get the full range of Avamar backup and recovery options, Data Protection Advisor to monitor and report on data protection status, capacity and trends within the environment, and RecoverPoint for a highly efficient journaled replication of critical virtual machines.

…And one minor thing

There was at least one other bit of data protection news this year, and that was me finally joining EMC. I know in the grand scheme of things it’s a pretty minor point, but after years of wanting to work for EMC it felt like I was coming home. I had worked in the system integrator space for almost 15 years and have a great appreciation for the contribution integrators bring to the market. That being said, getting to work from within a company that is so focused on bringing excellent data protection products to the market is an amazing feeling. It’s easy from the outside to think everything is done for profit or shareholder value, but EMC and its employees have a real passion for their products and the change they bring to IT, business and the community as a whole. So you might say that personally, me joining EMC was the biggest data protection news for the year.

In Summary

I’m willing to bet I forgot something in the list above. It’s been a big year for Data Protection at EMC. Every time I’ve turned around there’s been new releases or updates, new features or functions, and new options to ensure that no matter where the data is or how critical the data is to the organisation, EMC has an effective data protection strategy for it. I’m almost feeling a little bit exhausted having come up with the list above!

So I’ll end on a slightly different note (literally). If after a long year working with or thinking about Data Protection you want to chill for five minutes, listen to Kate Miller-Heidke’s cover of “Love is a Stranger”. She’s one of the best artists to emerge from Australia in the last decade. It’s hard to believe she did this cover over two years ago now, but it’s still great listening.

I’ll see you all in 2016! Oh, and don’t forget the survey.

Architectural implications of in-flight Data Domain Boost encryption

Aug 25, 2014

Between the Boost libraries included in NetWorker 8.2, and the in-flight encryption functionality added in Data Domain OS 5.5, it’s now possible to configure Boost-encrypted backups – and there’ll be quite a few sites with rigorous security requirements for which that’s a highly desirable function.

But it’s important to keep in mind that there are some architectural considerations in this scenario, and the most important ones are about where in the data path the backup is encrypted.

Currently in NetWorker 8.2, there are no options within NetWorker for controlling in-flight encryption. This is done entirely at the Data Domain, by configuring client groups. In essence, you configure collections of clients at the Data Domain to have an encryption strength of:

  • none
  • medium
  • high

This is accomplished using the following commands:

# ddboost clients add client-list [encryption-strength {medium|high}]

and:

# ddboost clients modify client-list [encryption-strength {none|medium|high}]
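
For example (hostnames illustrative), to require high-strength encryption for a couple of client-direct hosts:

# ddboost clients add client1.example.com client2.example.com encryption-strength high

From memory there’s also a ‘ddboost clients show config’ to review what’s in place, though treat that as an assumption and check it against your DDOS release.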

The astute reader will immediately identify the key restrictions this places: because all the control is being done from the Data Domain, it applies only to client backups that interface directly with the Data Domain.

For conventional backups, this means that Boost in-flight encryption will only apply to client-direct backups, i.e.:

[Diagram: Client Direct data flow]

In this scenario, with data flowing directly between the NetWorker client and the Data Domain server, in-flight encryption is available. However, when a storage node is inserted into the data flow, that changes:

[Diagram: Traditional client data flow]

In this scenario, the traffic sent between the client and the storage node does not utilise the Boost libraries, and thus flows unencrypted. The data is subsequently encrypted in transit between the storage node and the Data Domain, but by this point an organisation requiring in-flight encryption of backup traffic is already likely remiss in its obligations.

There are other scenarios where in-flight encryption is feasible:

  • VBA with ‘hotadd’ transport mode – this connects the ESX datastores directly to the VBA appliance (or its proxies) for the purposes of backup, keeping the initial data traffic within primary storage. The traffic sent from the VBA appliance or proxies, utilising Boost, gets encrypted as required. However, those Data Domain ddboost commands for enabling encryption will apply to the VBA appliance and its proxies – not individual virtual machines. Thus, encryption will be for all virtual machines backed up by the appliances (or none), and nothing in-between.
  • VADP on an 8.2 storage node using ‘SAN’, ‘hotadd’ or ‘NBDSSL’ transport modes:
    • ‘SAN’ transport mode is for physical storage nodes where the ESX LUNs are fibre-channel connected to the storage node; thus, data transport to the VADP proxy will be over fibre-channel, and the IP traffic will be encrypted via Boost between the storage node and the Data Domain;
    • ‘hotadd’ – same as VBA;
    • ‘NBDSSL’ – in this mode, the data is sent from the ESX server to the VADP proxy via IP, but in encrypted traffic; Boost level encryption subsequently takes over for transmission from the VADP proxy to the Data Domain.
  • An up-to-date Data Domain Boost plugin (e.g., Oracle RMAN Plugin for Data Domain) – though technically, this is fully outside of NetWorker.

Boost based in-flight encryption is a great addition in a NetWorker environment, but it’s important to keep in mind the restrictions regarding when it will apply.

Feb 24, 2014

One question that comes up every now and then concerns having an optimal approach to Data Domain Boost devices in NetWorker when doing both daily and monthly backups.

Under NetWorker 7.x and lower, when the disk backup architecture was considerably less capable (resulting in just one read nsrmmd and one write nsrmmd for each ADV_FILE or Data Domain device) it was invariably the case that you’d end up with quite a few devices, with typically no more than 4 as the target/max sessions setting for each device.

With NetWorker 8 having the capability of running multiple nsrmmds per device, the architectural reasons around splitting disk backup have diminished. For ADV_FILE devices, unless you’re using a good journaling filesystem that can recover quickly from a crash, you’re likely still going to need multiple filesystems to avoid the horror of a crash resulting in a 8+ hour filesystem check. (For example, on Linux I tend to use XFS as the filesystem for ADV_FILE devices for precisely this reason.)
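
For example, carving out an XFS filesystem for an ADV_FILE device on a Linux storage node looks like this (device and mount point hypothetical):

# mkfs.xfs /dev/sdc1
# mkdir -p /d/backup01
# mount -t xfs /dev/sdc1 /d/backup01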

Data Domain is not the same as conventional ADV_FILE devices. Regardless of whether you allocate 1 or 20 devices in NetWorker from a Data Domain server, there’s no change in LUN mappings or background disk layouts. It’s all a single global storage pool. What I’m about to outline is what I’d call an optimal solution for daily and monthly backups using boost. (As is always the case, you’ll find exceptions to every rule, and NetWorker lets you achieve the same result using a myriad of different techniques, so there are potentially other equally optimal solutions.)

Pictorially, this will resemble the following:

[Diagram: Optimal dailies and monthlies with Data Domain Boost]

The daily backups will be kept on disk for their entire lifetime, and the monthly backups will be kept on disk for a while, but cloned out to tape so that they can be removed from disk to preserve space over time.

A common enough approach under NetWorker 7.6 and below was to have a bunch of devices defined at each site, half for daily backups and half for monthly backups, before any clone devices were factored into consideration.

These days, between scheduled cloning policies and Data Domain boost, it can be a whole lot simpler.

All the “Daily” groups and all the “Monthly” groups can write to the same backup device in each location. Standard group based cloning will be used to copy the backup data from one site to the other – NetWorker/Boost controlled replication. (If you’re using NetWorker 8.1, you can even enable the option to have NetWorker trigger the cloning on a per saveset basis within the group, rather than waiting for each group to end before cloning is done.)

If you only want the backups from the Monthly groups to stay on the disk devices for the same length of time as the Daily retention period, you’ve got a real winning situation – you can add an individual client to both the relevant Daily and Monthly groups, with the client having the daily retention period assigned to it. If you want the backups from the Monthly groups to stay on disk, it’ll be best to keep two separate client definitions for each client – one with the daily retention period, and one with the on-disk monthly retention period.
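
If you do go down the dual-definition path, it’s worth spot-checking the retention savesets actually received; an mminfo query along these lines (client name hypothetical) will show it:

# mminfo -q "client=dbserver01" -r "name,savetime,ssretent,volume"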

Monthly backups would get cloned to tape using scheduled clone policies. For the backups that need to be transferred out to tape for longer-term retention, you make use of the option to set both browse and retention time for the cloned savesets. (You can obviously also make use of the copies option for scheduled cloning operations and generate two tape copies for when the disk copy expires.)

In this scenario, the Monthly backups are written to disk with a shorter retention period, but cloned out to tape with the true long-term retention. This ensures that the disk backup capacity is managed automatically by NetWorker while long-term backups are stored for their required retention period.
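
For reference, the command line equivalent of that scheduled cloning behaviour – setting the longer browse and retention on the copy as it’s made – is nsrclone with, if memory serves, the -w (browse) and -y (retention) options; the pool name and saveset ID below are placeholders:

# nsrclone -b "Monthly Tape" -w "1 month" -y "7 years" -S 1234567890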

Returning to the Data Domain configuration, however: the overall disk backup configuration itself is quite straightforward. With multiple nsrmmd processes running per device, the same result is achieved with one Data Domain Boost device as would have been achieved with multiple Boost devices under 7.6.x and lower.

Virtual synthetic fulls: A match made in heaven

Oct 30, 2013

One of the great new features in NetWorker 8.1 is virtual synthetic fulls.

You may think the term ‘virtual synthetic’ sounds a bit meta, but it’s quite accurate when it refers to synthetic fulls being virtually constructed on Data Domain devices.

You see, while synthetic fulls were introduced in NetWorker 8.0, they had a catch when running on Data Domain Boost devices – the data would still be rehydrated. That is, NetWorker would read through the full and all incremental backups referenced and rehydrate the data to construct a new full backup, saving that full on the Data Domain (with inline deduplication obviously happening again).

There’s nothing incorrect about that process – it is, after all, exactly what happens on standard AFTD devices. But it’s not the most efficient way of going about it.

That’s where NetWorker 8.1 and virtual synthetic fulls come into play. In this scenario, NetWorker instructs the Data Domain to create a new full backup via Boost. For the Data Domain, this consists of giving NetWorker back the appropriate reference data for the saveset after (mostly) jiggling pointers and updating its own back-end file data.

The net result is speed. Virtual synthetic fulls are fast on Data Domain because there’s very little data being rehydrated – deduplicated data is just being adjusted. You can see how fast this is because NetWorker reports the operation as if it were happening as realtime data:

[Image: virtual synthetic full backup throughput]

That’s 1,860,236 KB/s being reported – 1,816 MB/s throughput – and that’s just what I managed to capture: it was a smaller saveset, and it was pumping along. The client in question was connected over 802.11n; there’s no way it could be running at that speed over the ether. The speed was coming from the operation being performed on the Data Domain itself – and that Data Domain had a bunch of other activities happening at the same time.

Synthetic fulls are useful under NetWorker 8.0, but in 8.1 and combined with Data Domain, they can be a complete game changer to your backup windows.

 

Oct 10, 2013

If you’re looking at deploying Data Domain with NetWorker, but are coming from a physical tape or even an alternate virtual tape environment, you may look at the systems and think: “Should I use VTL or Boost?”

These days the answer is actually fairly simple: use Boost as much as possible.

If this were a True/False exam, that’d be the end of the article, but I think if you’re not sure what to do, you’ll want to see my working – how I got to that point.

The reasons are multiple, and I’ll break them down as follows:

  • Concurrency;
  • Cloning Integration;
  • Recoverability;
  • Reporting.

Once I’ve gone through them, you’ll see that each of those items on their own would likely represent a sufficient reason to choose Boost over VTL.

Concurrency

As I’ve said in the past, and as is well known, a virtual tape library will emulate all the features of a tape library, good and bad. This is perhaps most obvious when it comes to concurrency – that is, simultaneous read and write operations.

Let’s consider the difference between a VTL and a Boost configuration. We’ll start with the Boost configuration. We won’t worry too much about the actual capacity of the units, but let’s assume they’re the same.

From a Boost perspective, you might present this to your backup server as 4 Data Domain Boost folders. For standard Boost devices, this will see target sessions set to 4 per folder and max sessions set to 32 – a total, in a default configuration, of 128 streams going to the Data Domain. And since NetWorker doesn’t count recovery streams in that session count, that’s 128 maximum write streams plus (theoretically) as many recovery streams as you want to run.
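
Those target and max sessions values are just attributes on the NetWorker device resource, so they’re straightforward to inspect or adjust – a sketch via nsradmin, with the server and device names hypothetical:

# nsradmin -s backupserver
nsradmin> . type: NSR device; name: dd01_boost01
nsradmin> show name; target sessions; max sessions
nsradmin> print
nsradmin> update target sessions: 4; max sessions: 32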

If we consider it from a VTL perspective, it gets a little tricky. The primary reason for that is that the way NetWorker multiplexes for tape is fundamentally incompatible with deduplication, regardless of whether you’re writing to a Data Domain, Quantum, HP or any other deduplication device. That means each virtual tape drive you create has to be limited to a max sessions count of … one.

That means, if you wanted to write 32 save sets simultaneously to the Data Domain VTL you’d have to define 32 virtual tape drives. Or if you wanted to match the max sessions defined by the Boost variant, you’d be defining 128 virtual tape drives.

With virtual tape drives of course, emulating the best and the worst of tape, each of those virtual tape drives can either read or write, but not both (at least not concurrently). So if you wanted to ‘guarantee’ being able to run a recovery at the same time as a backup, you’d need to define even more virtual tape drives, and make them read-only.

But … you’re not guaranteeing that you can simultaneously back up and recover at any time whatsoever. If you’re using Boost devices, you are – a backup can be running to a Boost device and you can kick off a recovery from the same device, and NetWorker will run them both concurrently. In fact, NetWorker these days will run multiple backups, multiple restores, multiple clones and multiple stage operations all concurrently to/from the same Boost device. With a VTL, if you need to recover or clone from a virtual tape that is currently being written to, you’ll need to wait until the tape fills or the backup finishes. Even limiting virtual tapes to 50GB or 100GB still creates a pause period that cannot be avoided.

But But … even if you decide all of those limitations are OK, are you prepared to spend the money required to license all of those virtual tape drives that are required? Ahah, you might think – “with a VTL I get an “unlimited” autochanger license, and that covers everything” … not quite right. You see, the Unlimited autochanger license covers you for an unlimited number of slots in your VTL, but NetWorker licenses drive count by the server license + the number of storage node licenses. For a Power Edition server you’ll be automatically licensed for 32 devices on the server, and for any form of storage node, you get 16 devices per storage node. For a Network Edition server, the server device count is just 16. Now, you can stack on storage node licenses without actually creating storage nodes and increase the device count, but let’s go back to that comparison again for a moment … 128 virtual tape drives.

If you’re using NetWorker, Network Edition, that’s:

  • First 16 tape drives – Server License
  • Next 16 tape drives – Storage Node License 1
  • Next 16 tape drives – Storage Node License 2
  • Next 16 tape drives – Storage Node License 3
  • Next 16 tape drives – Storage Node License 4
  • Next 16 tape drives – Storage Node License 5
  • Next 16 tape drives – Storage Node License 6
  • Next 16 tape drives – Storage Node License 7

That’s a lot of storage node licenses, just to license 128 tape drives. Presumably in a VTL configuration you’re either going to have a second VTL to replicate to, or a physical tape library, and you’ll need storage node device counts to cover those devices too.

Phew! If that’s not enough of a reason to go down the path of Boost, then…

Cloning Integration

Let’s say you’ve got two Data Domain systems at different sites, and you want to replicate the data between them to protect against site failure, etc.

If you’re using VTL, you effectively have two options:

  • Use NetWorker to do cloning between the two Data Domains;
  • Use Data Domain replication on the VTL storage folder.

Both of those options, to use the technical term, have a considerable amount of “suck” about them.

For the first option – NetWorker cloning – you have to keep in mind that NetWorker doesn’t see the VTL as a deduplication device. Therefore, NetWorker’s going to execute a complete data read from each virtual tape it needs to clone from, and the Data Domain will faithfully rehydrate all the data, allowing NetWorker to send it across the network to be deduplicated and stored on the other host.

However, if you use the second option, then the clone is a perfect volume replica, and if there’s one thing NetWorker doesn’t like, it’s seeing the same volume, with the same volume label and volume ID, in more than one location. Basically you have to keep the media you clone this way out of the virtual library on the remote site. If you want to use it for a recovery, you have to first make the media invisible to NetWorker on the primary site (e.g., via a library export operation), before you can import it within NetWorker. That becomes a tedious task of:

  • Primary site: Export the media to the VTL CAP
  • Secondary site: Use the VTL management software to move the media to the VTL CAP
  • Secondary site: Import the media from the VTL CAP
  • Secondary site: Use media
  • Secondary site: Export the media to the VTL CAP
  • Secondary site: Use the VTL management software to remove the media from the VTL CAP
  • Primary site: Import the media from the VTL CAP

If you compare that to NetWorker cloning controlled Boost replication, you’ll see there’s no comparison at all. NetWorker initiates the clone operation, which triggers, at the back end, a replication job between the two Data Domains. The replica copy on the secondary Data Domain is a fully registered NetWorker copy/clone and accessible to NetWorker while the primary copy is still visible. What’s more, with NetWorker 8.1, you can turn on immediate cloning, whereby as each saveset in a group is finished, NetWorker initiates a Boost Clone of that saveset, shrinking your clone windows considerably.

Phew! If that’s not enough of a reason to go down the path of Boost, then…

Recoverability

The first obvious comment about recoverability is a repeat of the point made in Concurrency. Boost will allow you to run recover sessions at the same time as you’re running backup sessions to the same volumes, and therefore you’re (pun intended) Boosting recoverability within your environment.

But it’s more than that. Some of the advanced recoverability features in NetWorker require disk backup. Virtual tape, for all its use in overcoming architectural problems with physical tape, emulates tape closely enough that NetWorker considers it to be sequential access, and a sequential access device just doesn’t give the flexibility required for the advanced recovery features.

What advanced recovery features am I referring to?

  • Granular recovery from NMM (Exchange, SharePoint, etc.) – This functionality allows you to ‘mount’ a copy of the backup within the environment without actually completing a full restore, and then pull back the individual items you want. If you’re backing up to VTL, you don’t have this option available to you.
  • Block Level Backup in Windows 2012 – Recovery from block level backup similarly does some trickery regarding pseudo-mounting the filesystem for recovery from backup.
  • Instant-On VMware Recovery – OK, this isn’t available in NetWorker, but given its introduction in Avamar, you’d have to think it’s a highly likely contender for availability in NetWorker, and you can bet your next cup of coffee that it’s not going to be available from physical or virtual tape.

Phew! If that’s not enough reason to go down the path of Boost, then…

Reporting

What can I say? I'm in love with the reporting integration between Boost and NetWorker. With a Boost device integrated as a NetWorker target, NetWorker will provide deduplication statistics per client, per client filesystem, and per backup of each client filesystem:

[Screenshots: DD Boost drill-down reports in NMC, showing per-client and per-filesystem deduplication statistics]

Of course, you also get overall statistics, such as available capacity on the Data Domain – but in a deduplication environment, being able to drill down to that level of detail on the deduplication statistics on clients and filesystems is an absolute boon.

If you’re using tape, the most you can do is get a report on the Data Domain of the deduplication ratios for each virtual tape. That’s not really useful when it comes to monitoring deduplication performance within the environment.
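
If you want to pull even those numbers yourself, it's a Data Domain side exercise. A rough sketch, assuming the DD OS filesys show compression command; per-tape drill-downs beyond this depend on the VTL reporting available in your DD OS release:

    # On the Data Domain CLI: overall compression/deduplication
    # statistics for the filesystem as a whole - note there's no
    # per-client or per-filesystem breakdown to be had here.
    filesys show compression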

…but, Fibre vs Ethernet

It used to be that a compelling reason to use VTL over Boost was a site with heavy investment in the fibre-channel network but comparatively little in the ethernet network – e.g., gigabit networking for IP, but 8 gigabit fibre-channel.

However, with NetWorker 8.1, even that reason has been addressed – from 8.1 onwards, NetWorker supports Boost devices over fibre-channel.

Boost your Backups

So there you have it – 4 sets of reasons as to why you’re better off using Boost instead of VTL with Data Domain – and a fifth bonus reason thrown in as well.

So go forth and Boost!

Avamar: Gym Built Version

 Avamar  Comments Off on Avamar: Gym Built Version
Aug 052013
 

I don’t normally write too much about Avamar. While it’s a great product with some excellent features, my background has always been in NetWorker. That being said, I am trained in Avamar and use it from time to time, and for that reason, Avamar 7 made me sit up and take notice.

Comparing Avamar v7 to previous versions is like imagining some scrawny kid who you used to hang out with at school, then bumping into him a few years later only to find out he’s spent every day since in the gym.

Simply put, Avamar v7 is Avamar on steroids for a few, key reasons:

  • No more black-out window: One of my biggest gripes with Avamar in the past has been the need for a black-out window. For a lot of organisations, a daily period where you can't make configuration changes or run backups isn't a big thing – particularly given the nature of source-based deduplication to result in much smaller backup windows. Yet, for some organisations, that window represented a sticking point. That's now gone – finished, kaput. Backups are slower while garbage collection runs at the start of the maintenance window, but that's not really a problem.
  • Increased stream count: The number of streams per storage node in Avamar has jumped from 27 to 72; the number of streams in maintenance mode has increased to 20.
  • Instant-on recovery of virtual machines: Such is the integration now between Avamar and VMware that virtual machines backed up to a Data Domain may be powered on directly from the backup area and, if desired, vMotioned back into your production data storage.
  • Enhanced replication: Previously, replication in Avamar was scheduled by cron and somewhat minimal; now with replication policies, you can schedule fairly granular replication to occur.
  • IPv6 support: While I think a large number of corporate networks will remain on IPv4 for some time, firewalled behind an IPv6 internet, IPv6 support for a product like Avamar is a welcome relief for cloud service providers, regardless of whether they’re IaaS or BaaS.
  • Increased Boost support: You can now pretty much back up anything in Avamar to a Data Domain system, with just a few minor exceptions. NDMP, fileserver data, you name it – it can now go to Data Domain. That means regardless of your backup requirements, you can pick the option that best suits your environment.

Avamar 7 is a big architectural jump ahead compared to previous versions of Avamar, and I think we’ll be seeing a lot of organisations using Avamar make the transition as soon as they can schedule it.

Jul 122013
 


As is typically the case, EMC's timing and mine have been a little out of whack. They announced their "backup to the future" event around the time that I suddenly had to move, and a few days after the event I still haven't been able to watch any of the coverage, thanks to the dubious honour of subsisting on mobile internet for a couple of weeks while I wait for ADSL to be installed.

Sigh. Clearly this is a serious problem … maybe EMC will have to employ me before NetWorker 8.2 comes out so we have a better chance of keeping our calendars in sync on big events. That way they won’t accidentally schedule a major backup release when I have to move again … 🙂

While I haven’t been able to see the “Backup to the Future” material, I had spent a chunk of time working with NetWorker 8.1 through the beta testing phase, so I can have a bit of a chat about that. So, grab whatever your favourite beverage is, pull up a chair, and let me spin you a yarn or two. (In a couple of weeks I’ll likely have a few things to say about Backup to the Future … a lot of the material out of EMC lately about accidental architecture aligns very closely to my attitudes of where companies go wrong with data protection.)

It's not surprising that EMC's main staff backup blog is called thebackupwindow. Windows are something pretty much everyone who works in backup lives, eats and breathes. (Not just backup windows, of course, but recovery windows too.) You might say that Moore's law has been a governing factor in computing, but there's another law that, to be perfectly honest, is a pain in the proverbial for every person involved in backup and recovery, and for want of a better term I'm going to call it Newton's Third Law of Data Protection – i.e., to every action there is always an equal and opposite reaction.

The net result? Data keeps on getting bigger, and in turn the backup windows for that data keep on shrinking.

So, EMC’s primary blog being called the backup window makes perfect sense.

As does the feature set of NetWorker 8.1.

(See, I was getting to the point, even if I was walking around it a few times.)

While some of the features of NetWorker 8.1 are geared around interface changes, and others around security, the vast bulk of them are focused on meeting the demands of a shrinking backup window. Let’s take a quick look at some of those new features…

Window Work

Parallel Saveset Streams (Unix)

The bane of every backup administrator is the dense filesystem, and the PSS feature is designed to help get around it. Got a Unix filesystem with tens of millions of files? It's likely got a good disk structure underneath it, but filesystems suck for full sequential walks. Turning on the Parallel Saveset Streams feature for key Unix/Linux clients with dense filesystems will start to make a difference here – NetWorker will spawn multiple save processes to separately walk, and save data from, the filesystem.
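
Enabling it is a per-client setting. A hypothetical nsradmin snippet follows – I'm assuming the 8.1 client attribute is named "parallel save streams per saveset" and takes an enabled/disabled value, so verify both against your release (or just tick the box in NMC):

    # Illustrative nsradmin session; the server name, client name and
    # attribute value are assumptions - check your 8.1 install.
    nsradmin -s backupserver <<EOF
    . type: NSR client; name: densefs01.example.com
    update parallel save streams per saveset: Enabled
    EOF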

Block Level Backups (Windows)

That dense filesystem problem isn't just limited to Unix servers, of course. Backup administrators with large Windows servers in their environments feel the pain equally, and enabling BLB functionality for key, large filesystems on Windows servers will let NetWorker bypass the filesystem walk entirely, achieving high speed backup while retaining file level recovery capabilities.

Storage Node Load Balancing

Sure to be a boon for big datazones, storage node load balancing will allow businesses to deploy multiple storage nodes in relatively small but dense network segments and have clients spread their backups automatically between those storage nodes, rather than having to juggle which clients should back up to where.
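
I'd assume the plumbing here is the familiar client "storage nodes" attribute, with 8.1 turning the strict affinity ordering into a balancing act. A minimal sketch, with hypothetical names throughout:

    # Illustrative: give a client multiple candidate storage nodes
    # and let NetWorker 8.1 balance between them. All names are
    # hypothetical; verify behaviour against the 8.1 documentation.
    nsradmin -s backupserver <<EOF
    . type: NSR client; name: web01.example.com
    update storage nodes: sn-a.example.com, sn-b.example.com
    EOF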

Optimised Deduplication Filesystem Backups for Windows

Windows 2012 Server introduced deduplication for the filesystem, and NetWorker 8.1 introduces the ability to back up the deduplicated blocks. The net result? If you've got a 2TB filesystem which represents 800GB of deduplicated data, NetWorker gives you the option of backing up just 800GB rather than 2TB. I'm hoping, of course, that this isn't going to be limited to Windows deduplication filesystems … there are a lot of ZFS users out there, for instance, who'll be thinking "Um? We got there first…"

Virtual Synthetic Fulls on Data Domains

Synthetic fulls, introduced in NetWorker 8, can work wonders at reducing the required backup windows within an environment – but creating a new synthetic full when the target was a Data Domain used to result in a full rehydration of the data. Under NetWorker 8.1, that fabulous Boost integration continues apace, and the generation of a synthetic full is handed over to the Data Domain when it's both the source and the target of the operation. The net result? Synthetic fulls with a Data Domain involved no longer need to rehydrate the data to generate the new full.

Boost over Fibre Channel

A long time ago, in a source tree far, far away, advanced file type devices showed a lot of promise but delivered some disappointments. Those disappointments were removed in NetWorker 8 with the complete re-engineering of AFTDs, but in the meantime a lot of businesses that had deployed Data Domain systems had gone down the VTL route to try to ameliorate those backup-to-disk headaches. Unfortunately, when true backup to disk was fixed in NetWorker 8, that left those businesses in an undesirable situation: the advantages of Boost were clear, but it could only be implemented over IP, and since fibre-channel infrastructure isn't cheap, not everyone was keen to simply switch their investments across to IP. NetWorker 8.1 helps that transition. Of course, it's not the same as making a Data Domain system fully addressable on an IP network, but it does allow the creation of Boost backup-to-disk devices over fibre-channel, which means the technology transition can be phased and handled more smoothly. I suspect this will see a noticeable reduction in the number of NetWorker installs using VTLs.

Efficiency Improvements to nsrclone

Smaller than the other changes mentioned above, the nsrclone process has been improved in terms of media database fetch processes, which means it starts cloning sooner. That’s a good thing, of course.

Faster Space Reclamation on AFTD/Data Domain Systems

Unfortunately, you don't always get to control the filesystem you write to for backups. When I'm backing up to traditional disk on Linux, I pretty much always deploy AFTDs on XFS. That way, when I decide to delete 4TB of backups, they delete quickly. If I were using, say, ext3, I'd issue the delete command, go off, have a coffee, come back, curse at the server, go away again, have lunch, come back… well, you get the picture.
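
For what it's worth, here's a minimal sketch of standing up an XFS-backed AFTD on Linux. The volume, path and server names are all illustrative, and the nsradmin resource syntax is from memory, so treat it as a starting point rather than a recipe:

    # Illustrative: carve out and mount an XFS filesystem for AFTD use.
    mkfs.xfs /dev/vg_backup/lv_aftd01
    mkdir -p /backup/aftd01
    mount /dev/vg_backup/lv_aftd01 /backup/aftd01

    # Then define the AFTD device in NetWorker (via NMC, or roughly
    # as below - verify the resource syntax against your release):
    nsradmin -s backupserver <<EOF
    create type: NSR device; name: /backup/aftd01; media type: adv_file
    EOF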

While some of the delete process is bound up in how long it takes for the OS/Filesystem to respond to a file delete command (particularly for a large file), some of that space reclamation process is bound in NetWorker’s media database operations. That part has been improved in NetWorker 8.1.

The Other Bits

I mentioned NetWorker 8.1 wasn’t all about shrinking the backup window, and there are some other features. Quickly running through them…

VMware Backup Appliance (VBA)

Virtual Machines … they really are the bane of everyone’s lives. Of course, operationally they’re great, but sometimes backing them up leaves you wishing they were all physical, still. Well, maybe not wishing, but you get the drift.

NetWorker 8.0 introduced full VADP support. NetWorker 8.1 goes one step further in working with the Virtual Backup Appliance option introduced in newer versions of ESX. This isn’t something I’ve had a chance to play with – my lab is all Parallels due to Fusion not liking my Mac Pro’s CPUs, but I imagine it’s something I’ll see deployed soon enough.

NetWorker Snapshot Management

NSM replaces the old and somewhat crotchety PowerSnap functionality. For long-term PowerSnap users who have been looking for a solid update, this will undoubtedly be a big bonus.

Recovery Comes Home

8.1 introduces a Recovery interface within NMC, where it's belonged since NMC was first created. This sees the immediate termination of the old, legacy nwrecover interface from the Unix install of NetWorker, and it's undoubtedly going to see the Windows recovery GUI killed off over time as well. In fact, if you want to recover from Windows block level backups, you'd better get used to the new recovery interface.

What I really like about this interface is that you can create a recovery session and then save it to re-run it later. A lot of administrators and operators are going to love this new interface.

But…

…I'm annoyed with Block Level Backups. It's completely understandable that they have to go to disk backup (i.e., AFTD or Data Domain), and that they require client direct. However, if you want to do block level backups to AFTDs, those AFTDs must be presented from Windows servers – present them from Unix/Linux servers and you're out of luck.

I know this is a relatively small limitation, but I have to be honest – I just don't like it. I want to see it fixed in NetWorker 8.2. I'll settle for some sort of proxy mechanism if necessary, but I really do think it should be fixed.

Then again, I do come from a long-term Unix background. So take my complaint with whatever bias you want to attribute to it.

Geronimo

So there you have it – NetWorker 8.1 is out on the starting line, revving, and ready to make your backups run faster. It’s going to be a welcome upgrade for a lot of environments, and gives us a tantalising taste of improvements that are coming to our backup windows.

 
