Dec 23 2016
 

I know, I know, it’s winter up there in the Northern Hemisphere, but NetWorker 9.1 is landing and given I’m in Australia, that makes NetWorker 9.1 a Summer Fresh release. (In fact, my local pub for the start of summer started doing a pale ale infused with pineapple and jalapeños, and that’s sort of reminding me of NetWorker 9.1: fresh, light and inviting you to put your heels up and rest a while.)

NetWorker 9.1

 

NetWorker 9 was a big – no, a huge – release. It brought a switch to a more service catalogue driven approach to backups, Linux block based filesystem backups, block based application backups, deep snapshot integration and, more recently in NetWorker 9.0 SP1, REST API control as well.

NetWorker 9.1 as you’d expect is a smaller jump from 9.0 than we had from 8.2 to 9.0. That being said, it’s introduced some excellent new features:

  • VMAX SmartSnap integration – the ability to back up and restore a VMAX device based on the device WWN, increasing the depth of snapshot support in NetWorker further.
  • Snapshot Alternate Location Rollback – this lets you do a snapshot rollback, but to a different set of devices.
  • Data Domain High Availability integration – Data Domain now supports high-availability on the earlier 9500 platform, in addition to the 9800, 9300 and 6800 systems. And with v9.1, NetWorker fully understands and integrates with DDHA platforms.
  • Cloud Tier Integration – NetWorker gets deep integration into the Cloud Tier functionality introduced in Data Domain OS 6.0. This lets NetWorker cloning policies control the migration of data out to the Cloud Tier, and more seamlessly integrate with the recall process.

Cloud Tier integration is more than just a tick in the box, though. Consider the module space – NetWorker Module for Microsoft Applications, for instance, doesn’t just get the option to recover data from Cloud Tier, but can also perform granular recoveries from Cloud Tier – SQL table level recoveries and Exchange granular recoveries as well.


By the way, the NetWorker Usage Survey is still running – don’t forget to fill in how you’re using NetWorker! (And be in the running for a prize.)


I’ve saved the best – and biggest – feature for last, though. This is a doozy. Say goodbye to needing an EBR/VBA for VMware backups. That EBR/VBA functionality is now embedded in the NetWorker server itself, leaving you to just deploy some very lightweight proxies to handle the data transport processes, all controlled by NetWorker.

The current EBR appliance and proxies will continue to work with NetWorker 9.1, but I can’t think of anyone who’d want to upgrade to 9.1 without rapidly transitioning to the new platform. Here are just some of the advantages of the new process:

  • Less virtual infrastructure required – no EBRs
  • Virtual machines stored in raw VMDK format – no additional processing required for the backup, and this will also mean faster instant access processes, too
  • The FLR web GUI now runs on the NetWorker server itself
  • NMC can be used for FLR instead of the web GUI, making it more accessible to the NetWorker administrators if they don’t have access to the virtual machines being protected
  • Proxies support more concurrent virtual machine backups:
    • Maximum 25 concurrent hotadd operations;
    • Maximum 25 concurrent NBD operations
  • Significantly increased File Level Recovery (FLR) counts from VMware Image Level Backups (recommended 20,000 – more on that in a minute)
  • Significantly faster FLR operations.

In fact, I’m going to spend a little bit of time on FLR for this post, and step through the new NMC-based FLR process to give you an overview of the process. This is using the newly deployed NetWorker VMware Protection (NVP) system, with backup to and recovery from Data Domain virtual edition.

Fig 01: Starting a recovery in NMC

You start by telling NMC you want to do a virtual machine recovery and choose the vCenter server that owns the virtual machine(s) you want to recover data from.

Fig 02: Choosing the virtual machine to recover from

There are various options for choosing the virtual machine to recover data from – you can enter the name directly, search for it, browse the various backups that have been performed, or browse the vCenter server itself.

Fig 03: Virtual Machine selected

Once you’ve selected a virtual machine for recovery, you can click Next to choose the backup to recover from.

Fig 04: Choosing the backup to recover from

In this case, I only had a single backup under the new NVP system for that virtual machine, so I was able to just click Next to continue the process. At this point you get to choose the type of recovery you want to perform:

Fig 05: Choosing the type of recovery to perform

As you can see, there’s a gamut of recovery options for virtual machines within NMC. I’m focusing on the FLR options here so I chose the bottom option and clicked Next.

Fig 06: Choosing backup instance to recover from

Next you get to choose the backup instance you want to recover from. If the backup has been cloned it may be that there’s topologically a better backup to recover from than the original, and choosing an alternate is as simple as scrolling through a list of clones.

At that point you get to choose where you want to recover to:

Fig 07: Choosing where to recover data to

Next, you’ll supply appropriate credentials for the virtual machine to be able to perform the recovery and initiate a mount of the backup into the proxy server:

Fig 08: Supplying virtual machine credentials to mount the backup

After you’ve supplied the credentials you’ll click “Start Mount” to make the specific backup available for recovery purposes, and after a few seconds that’ll result in log information such as:

Fig 09: Mounted and ready

When the mount is done, you’re ready to click Next and start browsing files for recovery.

Fig 10: Choosing files to recover from an image level backup

In this example, I selected a directory with about 7,800 files in it and the marking of files for recovery took just a few seconds to complete. After that, click Next to choose where to recover the data to on the selected virtual machine:

Fig 11: Choosing where to recover data to on the virtual machine

In this case I chose to recover to C:\tmp on the virtual machine. Clicking Next allows finalisation of the recovery preparation:

Fig 12: Finalising the recovery configuration

As you would expect with the tightly integrated controls now, FLR is fully visible within the NetWorker environment – even nsrwatch:

Fig 13: FLR in progress shown in nsrwatch

And finally we have a completed recovery:

Fig 14: Completed recovery

That’s 7,918 files recovered from an image level backup in 54 seconds:

Fig 15: Recovered content

I wanted to check out the FLR capabilities a little more and decided to risk pushing the system beyond the recommendations. Instead of just recovering a single folder with 7,900 files or thereabouts, I elected to recover the entire E:\ drive on the virtual machine – comprising over 47,000 files. Here are the results:

Fig 16: Large scale FLR results

The recovered folder:

Fig 17: Recovered Content

47,198 files, 1,488 folders, 5.01GB of data recovered as an FLR from an image level backup in just 5 minutes and 42 seconds.
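To put the two FLR runs side by side, here is a rough back-of-the-envelope sketch in Python. It only uses the file counts and timings quoted in this post; treating 1GB as 1024MB is an assumption, and the function name is purely illustrative.

```python
def rate_per_second(count: float, minutes: int, seconds: int) -> float:
    """Return count divided by the elapsed time expressed in seconds."""
    elapsed = minutes * 60 + seconds
    if elapsed <= 0:
        raise ValueError("elapsed time must be positive")
    return count / elapsed


# Figures quoted in the post: 7,918 files in 54 seconds,
# then 47,198 files (5.01GB) in 5 minutes 42 seconds.
small_run_files = rate_per_second(7_918, 0, 54)
large_run_files = rate_per_second(47_198, 5, 42)
# Assumption: 1GB is treated as 1024MB for the throughput estimate.
large_run_mb = rate_per_second(5.01 * 1024, 5, 42)

print(f"small run: {small_run_files:.0f} files/s")
print(f"large run: {large_run_files:.0f} files/s, {large_run_mb:.1f} MB/s")
```

On those numbers the file rate barely drops between the two runs (roughly 147 versus 138 files per second), which suggests the larger recovery scaled close to linearly in this lab.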

If you’re using NetWorker for VMware backups, here’s the version you want to be on.

You can get it from the EMC Support page for NetWorker today.

Jul 08 2016
 

NetWorker 9.0 SP1 (aka “9.0.1”) was released at the end of June. I meant to blog about it pretty much as soon as it came out, but the Australian Federal election distracted me last weekend, and then on the Monday night I came down with yet another cold* which has left me floored most of the week. (In fact, I’ll be recovering for the rest of the weekend.)

But, the show must go on, as they say, and with the dawning of Friday the aches and pains, coughing and sneezing had subsided enough that I could sit at my desk for a while and upgrade my home lab to NetWorker 9.0.1. (Some people make chicken soup while they’re sick, I upgrade NetWorker servers. There you go.)

So it’s time to talk a little bit about the latest member of the NetWorker family.

Upgrade

First thing I have to call out is that the excellent enhancements done to nsrwatch in NetWorker 8.2 SP3 have been rolled forward into NetWorker 9.0 SP1 – though it looks like it didn’t get a mention in the release notes. So, in all its glory:

nsrwatch NSR9

I love this version of nsrwatch. There’s so much more functionality to it. If you’re a command line junkie like me you’ll love it too; and if you’re not, you should give it a go from time to time to give your mouse a break.

But NetWorker 9.0 SP1 is not just about nsrwatch so I’ll continue. I’ll be following the flow of the release notes (which you can access here for any further clarification), which will hopefully serve as a good reminder of what I need to cover given my slightly illness-addled thoughts.

Data Domain Enhancements

9.0 SP1 includes support for various DDOS 5.7 features including Boost over Fibre Channel for Solaris 10 and 11, as well as the DDOS 5.7 high availability mode. The Boost over FC for Solaris 10 and 11 is a welcome feature for organisations with spare fibre-channel networking, but there’s another performance enhancement that’ll be a boon for organisations with 10Gbit networking in particular, and that’s AMS.

AMS, or Automated Multi-Streaming, allows NetWorker/Data Domain to automatically segment larger savesets (>3.5GB) into ~2GB chunks when they are copied between two Data Domains via Clone Controlled Replication (CCR), speeding up the processing. Consider for instance a ‘typical’ clone operation for a 10GB saveset:

Cloning Conventional

Now that cloning is still going to be efficient – only the unique segments will be sent between the two Data Domains, but the entire file has to be processed sequentially to work out what does or does not need to be sent. This data walk will take time, but is effectively single-streamed for the individual saveset. Yet we know Data Domain systems can handle potentially large numbers of simultaneous streams – so why not boost our stream utilisation and automatically speed up the processing of that replication?

AMS Enhanced Cloning

With automated multi-streaming enabled, that 10GB saveset will be split into up to 5 chunks (I’ve kept it simple, remember that’s an approximate size split), with each chunk concurrently processed for deduplicated replication, with the replicated components stitched together on the destination Data Domain.
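The chunk arithmetic can be sketched in a few lines of Python. This is a simplified model only, assuming the approximate 3.5GB threshold and 2GB chunk size mentioned above; the real slicing behaviour is governed by the nsrcloneconfig settings and is not described here, and the function name is hypothetical.

```python
import math


def ams_slice_count(saveset_gb: float,
                    threshold_gb: float = 3.5,
                    slice_gb: float = 2.0) -> int:
    """Estimate how many chunks a saveset would be split into.

    Simplified model: savesets at or below the threshold are not split;
    larger ones are divided into chunks of roughly slice_gb each.
    """
    if saveset_gb <= threshold_gb:
        return 1
    return math.ceil(saveset_gb / slice_gb)


for size in (3.0, 10.0, 25.0):
    print(f"{size:>5.1f}GB saveset -> {ams_slice_count(size)} chunk(s)")
```

For the 10GB saveset in the example this gives 5 chunks, matching the diagram; anything under the threshold stays as a single stream.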

(According to the NetWorker/Data Domain integration guide this feature does require 10Gbit connectivity.)

Keep in mind the AMS feature is not automatically enabled; currently it has to be turned on using an nsrcloneconfig file created in the nsr/debug directory on the NetWorker server. The details for this are covered in the NetWorker/Data Domain integration guide, updated for 9.0.1, but the example file given in the guide looks as follows:

racdd098:/nsr/debug # cat nsrcloneconfig
max_total_dd_streams=256
ams_enabled=yes
ams_slice_size_factor=31
ams_preferred_slice_count=0
ams_min_concurrent_slice_count=1
ams_max_concurrent_slice_count=20
max_threads_per_client=256
ams_force_multithreaded=yes
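If you want to sanity-check a file like this before dropping it into place, a small Python sketch along the following lines would do. It assumes nothing more than the plain key=value layout visible in the listing above; it does not validate what the settings mean, and the function name is illustrative.

```python
def parse_key_values(text: str) -> dict[str, str]:
    """Parse simple key=value lines, skipping blanks and malformed lines."""
    settings: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


# Contents copied from the example listing in the integration guide.
EXAMPLE = """\
max_total_dd_streams=256
ams_enabled=yes
ams_slice_size_factor=31
ams_preferred_slice_count=0
ams_min_concurrent_slice_count=1
ams_max_concurrent_slice_count=20
max_threads_per_client=256
ams_force_multithreaded=yes
"""

settings = parse_key_values(EXAMPLE)
print(f"AMS enabled: {settings['ams_enabled']}")
print(f"max concurrent slices: {settings['ams_max_concurrent_slice_count']}")
```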

NDMP

Network Data Management Protocol, if you’re not familiar with it, is used to back up NAS appliances where a traditional agent can’t be installed. NetWorker 9.0 SP1 adds support for Isilon multi-streaming, speeding up backups from industry-leading scale-out NAS storage. Verbose logging has been added for file recoveries, and there’s been support added for token based backups on Hitachi NAS systems. (Token based backups or TBB is where file selection can be done based on previously established backup tokens – effectively allowing for faster incremental backups, if the array supports it. NetWorker already supports TBB on a variety of other platforms.)

Storage Array Snapshot Management

NetWorker Snapshot Management (NSM) has been expanded to also support ProtectPoint for VMAX3, ProtectPoint for RecoverPoint and XtremIO, enhancing again NetWorker’s ability to fold array snapshot and high speed database protection snapshots into data protection policies.

CloudBoost Enhancements

NetWorker with CloudBoost now supports backup as well as cloning, with a current emphasis on in-Amazon backup functionality. CloudBoost v2.1 appliances can also work with Linux clients for ClientDirect, distributing the backup process more smoothly in those scenarios.

This functionality will be particularly useful for those environments where part of the infrastructure workload is already sitting in Amazon’s cloud services. A NetWorker server can now be deployed alongside the infrastructure, with a CloudBoost appliance stood up as well, and clients can back up via CloudBoost into Amazon S3 storage, thus achieving the sort of data protection enterprises need without having to consume expensive Amazon block storage.

REST API

Remember that post I made a few months ago about automation? At the time I wrote the post, someone contacted me privately and suggested to me that NetWorker automation is a bit of a red herring without REST API support. Well, having spent the last 20 years automating NetWorker I’d happily argue that’s an incorrect assumption in the first place, but part of the reason I wrote that post was because I knew the NetWorker REST API was on the way.

With NetWorker 9.0 SP1, businesses can now work on bundling NetWorker services into DevOps service portals that are built on and rely on REST APIs.

With the API comes both an API getting started guide and an API command reference, too. So if you’re keen to get NetWorker included into your service portals, there’s plenty of documentation available to assist. The guides include details not only on what you can do, but how to connect to and authenticate with the NetWorker server. A very basic example I’ve grabbed from the getting started guide is as follows:

REST API Example

The REST API getting started guide is replete with these sorts of examples, by the way. It gives the conventional way of scripting a particular action or function, then provides the means of invoking the same action/function via the REST API.

Other Enhancements

VMware vVol support has been added for VMAX3 and Unity arrays as well.

Finally, the jobs database has been updated – be sure to check the release notes and understand the implications prior to upgrading.

Wrapping Up/The Update

As I said at the start of this article, I upgraded my lab server this morning from NetWorker 9 (actually, 9.0.0.7) to NetWorker 9.0.1 before I started blogging.

Since I deployed VBA into my environment, that also included running the EBR Upgrade for VBA as well, and if you’re using VBA for VMware backups in your environment as an increasing number of NetWorker environments are, then you should make sure you plan to upgrade your VBA systems and redeploy any external proxies as part of the upgrade process.

The upgrade process was relatively smooth – as you can imagine the longest part of the upgrade process was actually the VBA upgrade package processing, so you will want to make sure you allocate enough time for your upgrade to ensure the VBA systems are upgraded and new external proxies deployed prior to your next backup windows starting.

This marks the first significant NetWorker 9.x update since 9.0 was released last year, but sets us on the path for some fantastic coming features. If you’ve been holding off on NetWorker 9 until SP1 came out, now’s the time to upgrade. Otherwise, be sure to review the release notes, test in your labs if necessary, and start your upgrade engines. (If you need a refresher on the rest of NetWorker 9, check out my original post on it here.)

____
* That’s four this season. I’m not amused at all at the moment.

Apr 22 2015
 

We live in a world of activities. Particularly in IT, almost everything we do can at some point be summarised as being an activity. Activities we perform typically fall into one of three organisational types, viz.:

  • Parallel – Activities that can be performed simultaneously
  • Sequential – Activities that must be performed in a particular order
  • Standalone – Activities that can be performed in any order

Additionally, we can say that most activities we perform are either blocking or non-blocking, at least in some context. Setting up a new client in NetWorker for instance is not normally considered to be a blocking activity, unless of course we consider it in the context of a subsequent backup of that same client.

Hourglass

One thing we can definitely consider to be a blocking activity in NetWorker is a server upgrade. These days you have to be a little more conscientious about server upgrades. Whereas before it was not uncommon to encounter environments with multiple storage nodes of varying NetWorker versions (regardless of whether this was a good idea or not), the upgrade process now requires you to ensure all storage nodes are upgraded either before, or at least at the same time as the NetWorker server. Additionally, if you’re using a dedicated NetWorker Management Console server, you should also upgrade that before the NetWorker server so that your NMC interface is capable of addressing any new functionality introduced in NetWorker*.

Each of those upgrades (NMC, storage nodes and NetWorker server) takes a particular amount of time. If you’re using standard NetWorker authentication, the package upgrade or removal/reinstall process will take mere minutes. If your authentication is integrated with an Active Directory or LDAP service, that’ll be a little bit more complicated.

But (and there’s always a but), upgrading the NetWorker binaries on the server (or uninstalling and re-installing, depending on your operating system) is not the be-all and end-all of the NetWorker upgrade process. There is a sequence of activities you need to perform prior to the upgrade as part of consistency checking and validation that you really shouldn’t ever skip. Depending on the number of clients, backup volumes or savesets you have, that might take some time to complete**.

Looking at the NetWorker 8.2 upgrade guide, the recommended activities you should complete before performing a NetWorker server software upgrade are as follows:

  • nsrim -X
  • nsrck -m
  • nsrck -L6
  • nsrls -m
  • nsrls
  • nsrports
  • savegrp -l full -O groupName ***
  • mminfo -B

Of those commands, the ones that will potentially take the most time to execute are nsrim -X, nsrck -L6 and savegrp -l full -O, though the timing difference between just those three commands can be quite substantial. Consider the nsrck -L6 in particular: that’s a complete in-place index rebuild for all clients. If your indices are very large, or your NetWorker index filesystem storage is incapable of providing sufficient IOPS (or a combination of both), the run-time for that activity may be lengthy. Equally, if your indices are large, a savegrp -l full -O to force a full index backup**** may also take quite a while to complete, depending on your backup destination.

As an example, on a virtual lab server significantly under the minimum performance specifications for a NetWorker server (not to mention virtualised in vSphere which was in turn virtualised in VMware Fusion), I built up an index for a client of approximately 10GB after first creating then repeatedly backing up a filesystem with approximately 3,000,000 files in it. (mminfo reports an nfiles of 3,158,114). After those backups were generated, nsrls told me:

# nsrls centaur
/nsr/index/centaur: 88957985 records requiring 10 GB
/nsr/index/centaur is currently 100% utilized

Moving on to compare the run-times of an nsrck -L1, -L3 and -L6 yielded the following:

# date; nsrck -L1 centaur; date
Wed Apr 22 17:19:51 AEST 2015
nsrck: checking index for 'centaur'
nsrck: /nsr/index/centaur contains 88957985 records occupying 10 GB
nsrck: Completed checking 1 client(s)
Wed Apr 22 17:19:51 AEST 2015
# date; nsrck -L3 centaur; date
Wed Apr 22 17:19:51 AEST 2015
nsrck: checking index for 'centaur'
nsrck: /nsr/index/centaur contains 88957985 records occupying 10 GB
nsrck: Completed checking 1 client(s)
Wed Apr 22 17:19:52 AEST 2015
# date; nsrck -L6 centaur; date
Wed Apr 22 17:19:52 AEST 2015
nsrck: checking index for 'centaur'
nsrck: /nsr/index/centaur contains 88957985 records occupying 10 GB
nsrck: Completed checking 1 client(s)
Wed Apr 22 17:26:26 AEST 2015

Now, these timings are not under any circumstances meant to be typical of how long it might take to perform an nsrck -L6 against a client index of a similar size on a real NetWorker server. But the fact that there’s a jump in the time taken to execute an nsrck -L3 vs an nsrck -L6 should serve to highlight my point: it’s a nontrivial operation depending on the size of the indices (as are nsrim -X and the savegrp -O), and you must know how long these operations are going to take in your environment.
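If you capture timings the same way (a date before and after each command), the elapsed time can be pulled out with a short Python sketch like the one below. It assumes the date output format shown above, English day and month names, and that both stamps are in the same timezone; the helper names are illustrative.

```python
from datetime import datetime


def parse_date_output(stamp: str) -> datetime:
    """Parse `date` output such as 'Wed Apr 22 17:19:52 AEST 2015'."""
    parts = stamp.split()
    if len(parts) != 6:
        raise ValueError(f"unexpected date format: {stamp!r}")
    # Drop the timezone abbreviation (fifth field); both stamps are
    # assumed to share it, so it does not affect the difference.
    without_tz = " ".join(parts[:4] + parts[5:])
    return datetime.strptime(without_tz, "%a %b %d %H:%M:%S %Y")


def elapsed_seconds(start: str, end: str) -> int:
    """Return whole seconds between two `date` stamps."""
    delta = parse_date_output(end) - parse_date_output(start)
    return int(delta.total_seconds())


# Stamps copied from the nsrck runs above.
l3_seconds = elapsed_seconds("Wed Apr 22 17:19:51 AEST 2015",
                             "Wed Apr 22 17:19:52 AEST 2015")
l6_seconds = elapsed_seconds("Wed Apr 22 17:19:52 AEST 2015",
                             "Wed Apr 22 17:26:26 AEST 2015")
print(f"nsrck -L3: {l3_seconds}s, nsrck -L6: {l6_seconds}s")
```

For the runs above that works out to about one second for the -L3 check against 394 seconds (a little over six and a half minutes) for the -L6 rebuild.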

When planning NetWorker upgrades, I think it’s quite important to keep in mind the upgrade process is blocking: once you’ve started it, you really need to complete it before you can do any new backups or recoveries. (While some of those tasks above can be interrupted, you may equally find that an interruption should be followed by starting afresh.) So it becomes very important to know – in advance of when you’re actually going to perform the upgrade – just how long the various pre-upgrade steps are going to take. If not, you may end up in the situation of watching your change window or allowed outage time rapidly shrinking while an operation like nsrck -L6 is still running.

It may be that I’ve made the upgrade process sound a little daunting. In actual fact, it’s not: those of us who have been using NetWorker for close to two decades will recall just how pugnaciously problematic NetWorker v4 and v5 upgrades were with the older database formats*****. However, like all upgrades for critical infrastructure, NetWorker upgrades are going to be optimally hassle-free when you’re well prepared for the activities involved and the amount of time each activity will take.

So, if you’re planning on doing a NetWorker upgrade, make sure you plan the timings for the pre-upgrade maintenance steps … it’ll allow you to accurately predict the amount of time you need and the amount of effort involved.
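Once you have measured timings, checking them against a change window is simple arithmetic. The Python sketch below is one way to do it; every duration in it is a made-up placeholder rather than a measurement, and the 25% safety margin is an arbitrary assumption you would tune for your own site.

```python
def fits_change_window(step_minutes: dict[str, float],
                       window_minutes: float,
                       margin: float = 0.25) -> bool:
    """Return True if the steps, padded by a safety margin, fit the window."""
    total = sum(step_minutes.values())
    return total * (1 + margin) <= window_minutes


# Hypothetical estimates in minutes; replace with timings from your own
# environment. The step names are the long-running ones discussed above.
estimates = {
    "nsrim -X": 20,
    "nsrck -L6": 45,
    "savegrp -l full -O": 60,
    "package upgrade": 15,
}

for window in (120, 180):
    verdict = "fits" if fits_change_window(estimates, window) else "does NOT fit"
    print(f"{window} minute window: plan {verdict}")
```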


* Finally of course, if you’re using VBA, you’ll possibly need to upgrade your EBR appliance and proxies, too.

** On the plus side, NetWorker is quite efficient at those maintenance operations compared to a lot of other backup products.

*** The guide doesn’t mention including ‘-l full’. However, if doing anything approaching a major version upgrade (e.g., 8.1 to 8.2), I believe you should do the index backup as a full one.

**** To execute this successfully, you optimally should have a group defined with all clients in it. Otherwise you’ll have to carefully run multiple groups until you’ve captured all clients.

***** For servers, I only go back as far as v4, so I can’t say what upgrades were like with v3 or lower.

What’s new in 8.2?

Jun 30 2014
 

NetWorker 8.2 entered Directed Availability (DA) status a couple of weeks ago. Between finishing up one job and looking for a new one, I’d been a bit too busy to blog about 8.2 until now, so here goes…

what's new in 8.2

First and foremost, NetWorker 8.2 brings some additional functionality to VBA. VBA was introduced as the new backup process in NetWorker 8.1. Closely integrating Avamar backup technologies, VBA leverages a special, embedded virtual Avamar node to achieve high performance backup and recovery. Not only can policies defined in NMC for VBA be assigned by a VMware administrator in the vSphere Web Client … so too can image level backup and recovery operations be executed there. Of course, regularly scheduled backups are still controlled by NetWorker.

That was the lay of the land in 8.1 – 8.2 reintroduces some of the much-loved VADP functionality, allowing for a graphical visualisation map of the virtual environment from within NMC.

Continuing that Avamar/VMware integration, NetWorker 8.2 also gets something that Avamar 7 administrators have had for a while – instant-on recoveries when backups are performed to Data Domain. There’s also an emergency restore option to pull a VM back to an ESX host even if vCenter is unavailable, and greater granularity of virtual machine backups – individual VMDK files can be backed up and restored if necessary. For those environments where VMware administrators aren’t meant to be starting backups outside of the policy schedules, there’s also the option now to turn off VBA Adhoc Backups in NMC.

Moving on from VMware, there’s some fantastic snapshot functionality in NetWorker 8.2. This is something I’ve not yet had a chance to play around with, but by all accounts, it’s off to a promising start and will continue to get deeper integration with NetWorker over time. Currently, NetWorker supports integrating with snapshot technologies from Isilon, VNX, VNX2, VNX2e and NetApp, though the level of integration depends on what is available from each array. This new functionality is called NSM for NAS (NetWorker Snapshot Management).

The NSM integration allows NAS hosts to be integrated as clients within NetWorker for policy management, whilst still working from the traditional “black box” scenario of NAS systems not getting custom agents installed. There’s a long list of functionality, including:

  • Snapshot discovery:
    • Finding snapshots taken on the NAS outside of NetWorker’s control (either before integration, or by other processes)
    • Facilitate roll-over and recovery from those snapshots (deleting isn’t available)
    • Available as a scheduled task or via manual execution
  • Snapshot operations:
    • Create snapshots
    • Replicate snapshots
    • Move snapshots out to other storage (Boost, tape etc) using NDMP protocols
    • Lifecycle management of snapshots and replicas via retention policies
    • Recover from snapshots

Data Domain Boost integration gets a … well, boost, with support for Data Domain’s secure multi-tenancy. This supports scaling for large systems designed for service providers, with up to 512 Boost devices supported per secure storage unit on the Data Domain. While previously there was a requirement for a single Data Domain Boost user account across all Data Domain devices, this now allows for better tightening of access.

One of my gripes with BBB (Block Based Backup) in NetWorker 8.1 has been addressed in 8.2 – if you’re stuck using ADV_FILE devices rather than Data Domain, you can now perform BBB even if the storage node being written to is not Windows. Another time-saving option that was introduced in 8.1, Parallel Save Stream (PSS), has been extended to support Windows systems, and has also been updated to support Synthetic and Virtual Synthetic Fulls. In 8.1 it had only supported Unix/Linux, and only in traditional backup models.

Continuing the trend towards storage nodes being seen as a fluid rather than locked resource mapping, there’s now an autoselect storage node option, which if enabled allows NetWorker to select the storage node itself during backup and recovery operations. If this is enabled, it will override any storage node preferences assigned to individual clients, and NetWorker looks for local storage nodes wherever possible.

There are a few things that have left NetWorker in 8.2, which are understandable: Support for Windows XP, Windows 2003 and the Change Journal Manager. If you still need to protect Windows XP or Windows Server 2003, be sure to keep your installers for 8.1 and lower client software around.

There are some documentation updates in NetWorker 8.2 as well:

  • Server Disaster Recovery and Availability Best Practices – This describes the disaster recovery process for the NetWorker server, including best practices for ensuring you’re prepared for a disaster recovery situation.
  • Snapshot Management for NAS Devices Integration – This documents the aforementioned NSM for NAS new feature of NetWorker.
  • Upgrading to NetWorker 8.2 from a Previous Release – This covers off in fairly comprehensive detail how you can upgrade your NetWorker environment to 8.2.

In years gone by I’ve found that documentation updates have been a lagging component of NetWorker, but that’s long since disappeared. With each new version of NetWorker now we’re seeing either entirely new documents, or substantially enhanced documentation (or both). This speaks volumes of the commitment EMC has to NetWorker.

Apr 25 2014
 

I recently encountered one of those has-to-be-environmental errors that turned out to be a NetWorker issue. You know the one: the situation being explained is so straight-forward that there’s no chance NetWorker could be the actual culprit.

Then you investigate.

Then you investigate a little more.

Then you blink slowly and realise it really is a NetWorker error.

The case in point was a customer with a VMware environment who reported that simply trying to upgrade NetWorker would result in servers irretrievably blue-screening to the point where it was necessary to recover from an image-level backup. That there were other odd issues happening in the environment was obvious: the NetWorker server would at times lose access to its own nsrdb directory, locked out with permissions errors. Virtual machines reported corrupt sectors on disk, and then there was the blue-screening upgrade.

But after testing against clean images in the customer environment and then setting up a test environment in my lab, it was clear this was something wrong with NetWorker.

Here’s the scenario:

  • Windows 2008 R2 server, patched to current patch levels.
  • Install NetWorker 8.1.0.0 (i.e., 8.1 vanilla release).
  • Install NMM 3.0.0 Build 282.
  • (Use the machine for a while – or don’t)
  • Uninstall NMM 3.0.0 Build 282 in order to install 3.0.1 Build 280.
  • (Reboot)
  • While you’re at it, upgrade NetWorker to 8.1.1.[2,3,4].
  • Install NMM 3.0.1 Build 280, which triggers a reboot
  • Server bluescreens – STOP: 0x0000007a error.

In actual fact, you don’t even need to install NMM 3.0.1 – the problem is caused by the upgrade of 8.1.0.0 to a newer 8.1.x.y release.

My guess (and that of a colleague, too) was that this had something to do with the Block Level Backup option in 8.1. In the original 8.1 release, enabling the BLB option triggered a reboot requirement. Under newer releases this doesn’t seem to be the case.

The solution turned out to be reasonably straightforward once the culprit was identified: uninstall 8.1.0.0 first (leaving metadata in place), and then install the newer 8.1.x.y release.

The problem seems to be limited specifically to those hosts running the original 8.1 release … subsequent upgrades of the client software don’t trigger the issue.

If you installed the base 8.1 release on your Windows 2008 R2 servers … watch out for this one.

Jan 09 2012
 

Upgrading NetWorker

So a new version of NetWorker has come out, or is coming out, and it’s been decided that you’re going to upgrade, but you want a few tips for making that upgrade as painless as possible. Here are my 5 rules for upgrading NetWorker:

  1. Read the release notes. If you’re not going to read the release notes, you are better off staying on your current version, no matter what issues you’re having. I can’t stress enough the importance of reading the release notes and having a thorough grasp of:
    • What has changed?
    • What are the known issues with the current release?
    • What were the resolved issues between the current release and the release you’re currently running?
  2. Do a bootstrap and index backup if upgrading between major or minor releases. If going between service packs on the same release, you can skip the index backup so long as your backups have been successful lately, but ensure you still do a bootstrap backup.
  3. Unload all tapes (physical or virtual) in jukeboxes before the upgrade. You’ll see why shortly.
  4. Upgrade in this order:
    • Storage node(s) on the day of the upgrade, before the NetWorker server
    • Server on the day of the upgrade, after the storage node(s)
    • Client(s) later, at suitable times
  5. After the upgrade but before the NetWorker services are restarted on the storage node(s) and server, delete the nsr/tmp directory on those hosts.

Obviously, standard caveats still apply, such as following any additional instructions in the release notes or upgrade notes, but sticking to the above rules as well can save a lot of hassle over time. I’ve noticed over the years that odd, random problems following upgrades can be solved by clearing the nsr/tmp directory on the server and storage nodes. If there are no tapes in the jukeboxes when the services first start after the upgrade, there’s less futzing for NetWorker to take care of before it’s fully up and running, too.
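For rule 5, clearing nsr/tmp can be scripted. The sketch below is a minimal example rather than an official procedure: the function name is my own, the location of the nsr directory is passed in because it varies by platform and install, and it empties the tmp directory instead of removing the directory itself (an assumption on my part that keeping the empty directory is the gentler option). Only run something like this while the NetWorker services are stopped.

```python
import shutil
import tempfile
from pathlib import Path


def clear_nsr_tmp(nsr_root: Path) -> int:
    """Remove everything inside <nsr_root>/tmp, keeping the directory itself.

    Returns the number of top-level entries removed.
    """
    tmp_dir = nsr_root / "tmp"
    if not tmp_dir.is_dir():
        return 0  # nothing to clear
    removed = 0
    for entry in tmp_dir.iterdir():
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)   # real sub-directory: remove recursively
        else:
            entry.unlink()         # plain file or symlink
        removed += 1
    return removed


# Demonstration against a scratch directory, not a real NetWorker install.
with tempfile.TemporaryDirectory() as scratch:
    root = Path(scratch) / "nsr"
    (root / "tmp" / "subdir").mkdir(parents=True)
    (root / "tmp" / "stale.lock").write_text("left over from old version")
    print("entries removed:", clear_nsr_tmp(root))
```

On a real host you would pass the actual nsr directory for that platform, on each storage node and on the server, before restarting the services.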

Jul 07 2010
 

Clients

The Question

It’s usually the case that the biggest part of a NetWorker environment – in terms of resources that are configured, and software deployed – is the clients themselves. When sites look at upgrading their NetWorker environments though, the normal procedure is to upgrade the server and any storage nodes as the first step, then plan to upgrade clients on an “as needed” or “when we get around to it” basis.

This prompted a customer to recently ask me to write a blog article about this topic (thanks, Robert!) Specifically, Robert’s question was – why should I upgrade my clients?

Having worked with several of my clients now for close to a decade, I’m familiar with the scenario: the servers and storage nodes will be at appropriately supported versions of the NetWorker software, but clients are trailing behind, and before you know it your versions may stretch out like a long tail behind your backup server and storage nodes:

Client versions

So it begs the question – when NetWorker is so good at supporting older client versions, what’s the rush in upgrading old clients? This is a question where an answer of “…because…?” isn’t sufficient, so perhaps first it’s worthwhile considering some common arguments for not upgrading the clients:

  • If it’s not broken, don’t fix it.
  • We had some problems with version X, it’s stable on X+n, so keep it that way. (A variant of the above.)
  • It’s working, so it’s a low priority task.
  • Admins are too busy fire fighting to do unnecessary upgrades.
  • Change control is too tedious.
  • This is the last supported version for this <old> operating system.

The Answer

The generic answer

Each of the above reasons, in their own right, can be a perfectly valid reason. Temporarily stepping away from backup software and looking at say, operating systems, here’s some example reasons why we eventually choose to upgrade operating systems:

  • We explicitly need the new features.
  • New applications require the new features.
  • Poor support on old OS for new hardware (and vice versa).
  • More efficient.
  • Faster.
  • More secure.

We can evaluate a whole host of reasons, but we can actually boil any upgrade rationale down to one of the following three generic reasons:

  1. Risk – The risk in not upgrading overrides the cost of upgrading. Two common risks are security or reliability.
  2. Features – The currently installed version lacks features that are both available and required in a newer version available.
  3. Support – The currently installed version is either out of support, or is scheduled to no longer be supported as of a known, unacceptably close date.

Note – regarding features: To be a valid upgrade reason, it should be both available and required, not one or the other – and yes, sometimes upgrades are done based on features being required without first checking if they’re available!

When we boil down upgrade reasons to just three generic terms, risk, features and support, it becomes easier to justify either:

  • Having an active programme in place to keep clients up to date or
  • Periodically updating clients.

So going back to NetWorker clients, we can evaluate what sort of reasons in each of the generic categories might prompt an upgrade; I’m going to go backwards through the previous list.

The NetWorker answer

Support

To me, unsupported = broken. So, “if it’s not broken, don’t fix it” stops being a valid reason at the point where client software installed is no longer supported. So for sites that have v7.3.x and lower clients lying around – or come October 1 2010, v7.4.x and lower clients – you should either:

  • Upgrade to a supported version or
  • Upgrade to the last supported version that is compatible with the client (for very old clients/applications).

If a client is on an unsupported version of the software and it can be upgraded to a supported version, leaving it on that unsupported version can introduce unnecessary risk in the environment. While a current version of NetWorker will more than likely keep communicating with an older version of NetWorker, that doesn’t mean that issues can’t happen, and if they do, you want to be able to resolve the issue as quickly as possible. By having a supported version of the client installed, you can considerably streamline the resolution process.
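If you keep a list of client versions, the support cut-offs mentioned above are easy to encode. This is a small sketch using only the two dates from this post (7.3.x and lower already out of support, 7.4.x and lower from October 1 2010); the `CUTOFFS` table and `is_supported` function are my own illustration, and the compatibility guides remain the real authority.

```python
from datetime import date

# (date the rule takes effect, highest major.minor that is out of support)
# Only the two cut-offs mentioned in this post are encoded.
CUTOFFS = [
    (date.min, (7, 3)),           # 7.3.x and lower: already unsupported
    (date(2010, 10, 1), (7, 4)),  # 7.4.x and lower: unsupported from this date
]


def is_supported(version: str, on: date) -> bool:
    """True if a client at `version` (e.g. '7.4.5') is still supported on `on`."""
    major_minor = tuple(int(part) for part in version.split(".")[:2])
    highest_unsupported = max(v for start, v in CUTOFFS if on >= start)
    return major_minor > highest_unsupported


for client_version in ("7.3.4", "7.4.5", "7.5.2"):
    print(client_version,
          is_supported(client_version, date(2010, 7, 7)),
          is_supported(client_version, date(2010, 10, 1)))
```

Anything that comes back unsupported is a candidate for either a move to a supported version, or to the last version compatible with that client.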

Features

We have a tendency to focus on the backup server (and, to a lesser degree, the storage node) when looking for feature support. For instance, we may want disk backups to be able to do X, or NDMP backups to be able to do Y, and so on. However, feature support isn’t enhanced only at the server layer. In actual fact, a lot of feature support comes from the client software. For instance:

  • If you’re working with Solaris 10 clients that are deployed in non-global zones, having up-to-date client software ensures that you maximise your support of that configuration;
  • If you’re looking at upgrading a host from Windows 2003 to Windows 2008 R2, you’re likely going to need to upgrade the NetWorker client – you need a newer client instance that has more up to date support for the newer operating systems;
  • If you’re wanting to eliminate no-longer-needed licenses within your backup environment, and are looking at getting rid of those ClientPak licenses, you’ll need to make sure that the clients themselves support the removal of the licenses;
  • If you want to be able to do VSS filesystem backups but not have to buy VSS licenses, you’ll need to have a version of the NetWorker client that supports this option;
  • If you want to replace your Oracle 9 database with Oracle 11, you may find yourself needing to upgrade the database module. This in turn may necessitate an upgrade of the client software to support the newer module, too.

Suffice it to say, feature support can be just as important at the client level as it is at the backup server level. In this regard, the release notes will always be an excellent reference – if you’re not sure whether you need to upgrade, check to see what new functionality comes into play on the latest versions of the software.

Risk

The final reason to upgrade is risk – risk that there is a bug or a security issue in the currently installed version of the software that may be resolved in a newer version. Like “Features”, above, your best bet for determining the risk of not upgrading is by referring to the release notes for newer versions of the software. Read the “fixed issues” notes very carefully; it could be that intermittent issues you haven’t yet found time to investigate – or that you have been actively trying to resolve – are actually resolved in a newer version of the software. While we often look at fixed issues in NetWorker release notes for the server and storage node, they can be equally applicable at the client level, too.

When should clients be upgraded?

Once we’ve determined that we can decide to upgrade clients on the basis of either support, features or risk, we must next ask ourselves the question – when should the clients be upgraded? There’s a sister question to this too – how frequently should clients be upgraded?

I’m not going to suggest that your backup server and all its clients should be kept in absolute version lock-step the entire time. If you have the processes, personnel and time to do this, then by all means go ahead – but it isn’t something that you should obsessively worry about. Instead, I’ll offer some generic suggestions; to do this though I’ll refer to major and significant version numbers. Consider say, NetWorker 7.5 SP2; I’d consider the major version number to be 7, the significant version number to be 5, and the service pack to be 2.

  • Aim to keep all clients that support it on at least the same major version number as the backup server;
  • Where time permits try to get clients on the same (or higher*) major+significant version number as the backup server – but as a general rule, ensure that the clients are at least on a supported major+significant version number.
  • Consider getting clients onto the same major+significant+service pack version as the backup server where there are support, risk or feature reasons, i.e.:
    • Where there are new features in the service pack you need, or,
    • Where there are risks in remaining at the current version, or,
    • Where there are support reasons for updating. (E.g., patch available for new SP that would need to be back-ported to your existing version).
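To make the major / significant / service pack comparison concrete, here is a small sketch that works out how far a client trails the server. The names are my own, and it assumes a version written as "7.5 SP2" and one written as "7.5.2" mean the same thing, which you should adjust if your version strings differ.

```python
import re
from typing import NamedTuple


class Version(NamedTuple):
    major: int
    significant: int
    service_pack: int


# Accepts '7.5 SP2', '7.5.2' or plain '7.5' (service pack defaults to 0).
_PATTERN = re.compile(r"(\d+)\.(\d+)(?:\s*SP\s*(\d+)|\.(\d+))?", re.IGNORECASE)


def parse(text: str) -> Version:
    match = _PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised version: {text!r}")
    major, significant, sp_named, sp_dotted = match.groups()
    return Version(int(major), int(significant), int(sp_named or sp_dotted or 0))


def alignment(client: str, server: str) -> str:
    """Describe where a client sits relative to the backup server."""
    c, s = parse(client), parse(server)
    if c.major < s.major:
        return "behind on major version"
    if c[:2] < s[:2]:
        return "behind on significant version"
    if c[:2] == s[:2] and c.service_pack < s.service_pack:
        return "behind on service pack"
    return "level with or ahead of server"


for client in ("6.1", "7.4.5", "7.5 SP1", "7.5 SP2"):
    print(client, "->", alignment(client, "7.5 SP2"))
```

Clients reported as behind on major version are the first priority; behind on service pack only matters where there is a support, risk or feature reason to close the gap.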

You may think that all these answers are a bit vague – and by necessity, they are, since the issues, needs and processes at each site will govern exactly how and why upgrades are done.


* Yes, or higher. Such as for instance, sites that have been running a NetWorker 7.4.x server, but need to run a 7.5 SP2 client for Windows 2008 R2 systems, etc.

Jan 12 2010
 

On the NetWorker Mailing List, I still frequently see a lot of posts from people who are having various problems with their NetWorker 7.2.x servers.

It’s time to move away from 7.2. I know, it was the last version before nsrjobd; the move to nsrjobd in 7.3, then raw daemon logs in 7.4 can both be a bit shocking, but 7.2 is now critically old and critically out of support. Equally, there’s still a lot of people out there running 7.3 releases of NetWorker. That, too, exited support some time ago, and it’s time to move on from it too.

I’ll agree that within backup, there is a strong logic to the statement “if it ain’t broke, don’t fix it”, but, you have to weigh up that against the simple fact that 7.2.x releases in particular are very old, and 7.3.x releases are fairly aged as well.

Since I’ve been watching more and more of Top Gear, I’ll use a car analogy. Let’s say you’ve got a brand new, top of the line Ferrari. When it needs servicing, do you take it to the official Ferrari shop that provides a 100% warranty on all repairs and whose repairs keep the original vehicle warranty intact, or do you take it to Bill & Joes Motor Fixits ‘R’ Us, who not only might leave you with a car in a worse condition than when you drove it in, but who aren’t certified by Ferrari and thus lose you your new car warranty?

Continuing to back up your environment with a backup product which is long out of support is like outsourcing to Bill & Joes Motor Fixits ‘R’ Us IT Service.

I’ll be the first to admit that even on simple updates you can run into a few hassles. Particularly as you move up the NetWorker version chain you’ll find changes to authentication and name resolution requirements alone that may necessitate some additional work around the time of the update. If your clients are old you’ll also be needing to plan an update for them as soon as possible too, and in some cases, you may find yourself definitely having to update clients if there turns out to be some particularly odd issue.

But I’ll be honest: that little bit of up-front pain is much, much better than hitting a critical backup or recovery problem that can’t be solved without upgrading (or worse, can’t be solved due to incompatibilities between ancient NetWorker versions and modern operating system versions). Planning and implementing a controlled upgrade, even if it does end up having a few hassles, is infinitely better than doing an emergency upgrade without any planning to facilitate a recovery or a backup that has to be done.

Wait! Don’t apply that service pack!

Nov 06 2009
 

Recently we’re seeing a lot of people upgrading to Windows 2008 SP2, without first checking to see that release notes and compatibility guides state that NetWorker doesn’t yet support this release.

I fully agree that this represents monumental slowness on the part of EMC … there’s absolutely no excuse – none whatsoever – for them not to be on the relevant developer programmes and partner programmes for all the supported operating systems so they get access to the new releases before they come out and then make sure there’s either hot-fixes or cumulative updates available to support new operating systems.

They don’t have to be on the same day, but it’s foolish short-sightedness at best that they don’t support a new OS release within say, 2 weeks of it hitting the general public, given that partner and developer programmes will give access to it for months in advance of that point.

Now, back to my original point – if you’re planning on rolling out a new service pack to an operating system, please take a few minutes to read the release notes or software compatibility guides – or ask your support team to fill you in, and if it’s not supported, roll out to a test client first so you can confirm the impact to your backup environment.

It takes two to tango – EMC needs to improve their response to new operating systems and new major updates to operating systems, but it’s equally important for people to remember to check these things before they upgrade, not after they get the first backup (or worse! recovery) error.

(Do I do these checks all the time? No – only in lab environments. It’s my job to identify bugs and issues before my customers find them as much as possible.)

Basics – Updates vs Upgrades

Aug 18 2009
 

After 13+ years of using NetWorker, I still tend to interchangeably use the terms ‘upgrade’ and ‘update’ (or to be more precise, mainly use the term ‘upgrade’).

However, there is, and always has been, a difference between the two terms in NetWorker nomenclature, and it’s useful knowing it in case you’re being asked to qualify your environment to a support person.

Here’s what they mean for NetWorker:

  • An upgrade is transitioning from one licensed feature set to a more advanced licensed feature set. For example, you might upgrade from NetWorker, Network Edition to NetWorker, Power Edition. Previously (when tiered licensing was still used for Windows modules), you might upgrade from say, Exchange Module Tier 1 to Exchange Module Tier 2. Alternatively, you can buy an upgrade to slot capacity for an Autochanger license (e.g., upgrading from a 1-64 slot license to a 1-128 slot license).
  • An update is where you change the version of a NetWorker product. E.g., you update from NetWorker 7.4.4 to NetWorker 7.4.5, or from NetWorker 7.3 to NetWorker 7.5.1. You would equally update from Oracle Module 4.5 to Oracle Module 5.

Since in both support and data protection it’s useful to avoid ambiguities, understanding the difference between these two terms can be important.
