Nov 052015
 

While NetWorker 9 went DA at the end of September and is seeing healthy uptake around the world, NetWorker 8.2 is still getting updates.

Released a couple of weeks ago, NetWorker 8.2 SP2 includes a slew of changes. While undoubtedly NetWorker 9 will be seeing the majority of new-feature development henceforth, that’s not to say that NetWorker 8.2 won’t get refinements as required and planned.

Road to Recovery

Some of the changes and updates to NetWorker with the 8.2 service pack 2 release include:

  • Support for JRE 1.8/Java 8 – You can now run the NMC interface using Java 8
  • Reduced name resolution checking – NetWorker still requires name resolution (of some form), as you’d hope to see in an enterprise product, but there’s been a refinement to the times NetWorker will perform name resolution. Thus, if your DNS service is particularly slow or experiencing faults, NetWorker should not be as impacted as before*.
  • Automated checks:
    • Peer Information
    • Storage nodes
    • Usergroup hosts
  • Maximum device limit can be increased to 1024. If you’ve got a big datazone that’s running at or near 750 devices, you can now increase the maximum device count to 1024.
  • DDBoost Library upgrade – NetWorker 8.2 SP2 uses libDDBoost 3.0.3.0.

It’s the automated checks I want to spend a little time on. As a NetWorker admin or implementation consultant I would have killed for these sorts of tests; and indeed, I even wrote some scripts which did some similar things. If you’ve not used them before, you should have a look at some of the tests previously added (here and here).

The new options though continue to enhance that automated checking and will be a boon for any NetWorker administrator.

Storage Node Checking

The first new option I’d like to show you is the option to automatically check the status of all the storage nodes in your environment. This can be performed by executing the command:

# nsradmin -C "NSR Storage Node"

The output from this will obviously be dependent on your environment. In my lab I get output such as the following:

nsradmin -C "NSR Storage Node"

So for each storage node, you get name resolution checking for the storage node (forward and reverse), as well as a dump of the devices attached to the storage node. If you have an environment that has a mesh of Data Domain device mounts, or dynamic drive sharing/library sharing, this will make getting a very quick overview of devices a piece of cake.

Usergroup Host Checking

As a means of more tightly checking security options, you can now query the hosts referenced in NetWorker Usergroups and determine whether they can be correctly resolved and reached. This command is run as follows:

# nsradmin -C "NSR usergroup"

And the output (for my system) looks a bit like the following:

nsradmin -C "NSR usergroup"

(Note that it will check the referenced hosts for each user specified for that host.)

NSR Peer Information

NetWorker uses peer information – generated certificates – to validate that a host which connects to the NetWorker server saying it’s client-X really can prove it’s client-X. (It’s like the difference between asking someone their name and asking them for their driver’s license.) This prevents hosts from attaching to your network, impersonating one of your servers, and then recovering data for that server.

Sometimes if things radically change on a client that peer information may get outdated and require refreshing. (Fixing NSR Peer Information Errors remains the most accessed post on my blog.) Now you can use the automated checking routine to have a NetWorker host check all the peer certificates in its local client resource database.

The command in this case has to reference the client daemon, and so becomes:

# nsradmin -p nsrexec -s clientName -C "NSR peer information"

If you run it from the NetWorker server, for the NetWorker server, the clientName in the above becomes the server name, since you’re checking the client resource for the server. Note if you’re feeling particularly old-school (like I do all the time with NetWorker), you can replace nsrexec with 390113 in the above as well. This is actually also a good way of checking client connectivity, since we verify any local client certificates by comparing the locally cached certificate against the certificated stored on each client. (Given I’m running this on a lab server, it’s reasonable to see some timeouts and errors.)

For my lab, the results look like the following:

http://nsrd.info/blog/2009/02/23/basics-fixing-nsr-peer-information-errors/

NSR Peer Information Check 2 of 2

If you intend to wait a while (or until it hits GA) before you upgrade to NetWorker 9, I’d heartily recommend upgrading to NetWorker 8.2 SP2 if for no other reason than the incredibly useful automated checks that have been introduced.


* If your DNS/name resolution is improperly configured or faulty, I’d suggest it should be dealt with quickly.

Updated checks in nsradmin

 NetWorker, Scripting, Support  Comments Off on Updated checks in nsradmin
Aug 192015
 

A while ago EMC engineering updated the venerable nsradmin utility to include automated checking options, with an initial focus on checks for NetWorker clients. As a NetWorker administrator I would have crawled over hot coals for this functionality, and as an integrator I found myself writing Perl scripts from company to company to do similar checks.

As of NetWorker 8.2.1.6, the checks have been expanded a little, with a few new enhancements:

  • Client check now performs Client/Server time synchronisation checking
  • Client check now does a ping test against configured Data Domains
  • Storage node check has been added.

I currently don’t have a Data Domain in my lab, but I’ll show you want the time synchronisation check looks like at least. As always, for client checks in nsradmin, the command sequence is:

# nsradmin -C query

Where query is a valid NetWorker query targeting clients. In my case in my lab, I used:

# nsradmin -C "NSR client"

The output from this included:

Client Check - Time synchronisation

In the example output, I’ve highlighted the new time synchronisation check. With this included, the nsradmin client check utility expands yet again in usefulness.

Moving on to the Storage Node option, we can now have NetWorker verify connectivity list the devices associated with each storage node. As you might imagine, the command for this is:

# nsradmin -C "NSR storage node"

The output in my lab resembles the following:

nsradmin - NSR storage node

As I mentioned at the start – these have been added into NetWorker 8.2.1.6. If you’re running an earlier release, service pack or cumulative release than that exact version, you won’t find the new features in your installation.

Jan 092012
 

Upgrading NetWorker

So a new version of NetWorker has come out, or is coming out, and it’s been decided that you’re going to upgrade, but you want a few tips for making that upgrade as painless as possible. Here’s my 5 rules for upgrading NetWorker:

  1. Read the release notes. If you’re not going to read the release notes, you are better off staying on your current version, no matter what issues you’re having. I can’t stress enough the importance of reading the release notes and having a thorough grasp of:
    • What has changed?
    • What are the known issues with the current release?
    • What were the resolved issues between the current release and the release you’re currently running?
  2. Do a bootstrap and index backup if upgrading between major or minor releases. If going between service packs on the same release, you can skip the index backup so long as your backups have been successful lately, but ensure you still do a bootstrap backup.
  3. Unload all tapes (physical or virtual) in jukeboxes before the upgrade. You’ll see why shortly.
  4. Upgrade in this order:
    • Storage node(s) on the day of the upgrade, before the NetWorker server
    • Server on the day of the upgrade, after the storage node(s)
    • Client(s) later, at suitable times
  5. After the upgrade but before the NetWorker services are restarted on the storage node(s) and server, delete the nsr/tmp directory on those hosts.

Obviously standard caveats, such as following any additional instructions in the release notes or upgrade notes should of course be followed, but sticking to the above rules as well can save a lot of hassle over time. I’ve noticed over the years that a odd, random problems following upgrades can be solved by clearing the nsr/tmp directory on the server and storage nodes. If there’s no tapes in the jukeboxes when the services first start after the upgrade, there’s less futzing for NetWorker to take care of before it’s fully up and running, too.

Nov 172010
 

In March this year, I ran a NetWorker Usage Survey to gauge the lay of the land in terms of how and on what NetWorker is deployed within the user community. As a result of that, hundreds of people downloaded the March 2010 NetWorker Usage Survey Report.

It’s time to revisit that survey; I’ve adjusted the questions a little based on some of the responses from the previous survey results, and I’m keen for as many answers as possible to make this a worthwhile report. (Don’t forget, the report will be free to download.)

I’ll be aiming to keep the survey running until the end of November, and publish the results early in December. So please, fill out the survey below and have your say!

~

The survey has now closed. The report will be posted soon.

appliances vs Appliances

 Architecture, Backup theory, Support  Comments Off on appliances vs Appliances
Aug 182010
 

We all have appliances, right? Teapots and toasters and microwaves and automatic coffee machines, etc. They’re all appliances. So are clock radios, electric razors, heaters and fans.

They’re appliances.

VTLs, SANs and NASs are not appliances, despite what any vendor would try to tell you. As soon as you’ve got an OS + software layer, you’re moving beyond “appliance” into “black box”. Or maybe we’re talking the difference between an appliance and an Appliance. If a vendor wants to tell you otherwise, they’re not telling you the whole story.

There’s a simple test on whether you’re being sold an appliance, or an Appliance – a simple yes/no question:

Is there a training course for the unit or an instruction manual with more than 1 page of instructions per language?

If the answer is “no”, then congratulations, you’ve got an appliance; if the answer is yes, then despite whatever your vendor wants to tell you, you’ve got an Appliance.

Now, there’s nothing wrong with having an Appliance within your organisation, and in fact I’d suggest that frequently they add a lot of value. VTLs, SANs and NASs, to use the example I previously provided, are all capable of greatly extending the storage and data protection options within your environment and should of course be considered in many architectures.

Knowing that they’re Appliances rather than appliances though means that you can treat them appropriately. I personally don’t care about backing up my toaster, or keeping a close eye on the logs from my microwave. As the appliance complexity increases, I pay more attention – so for instance the most critical appliance in my home is arguably the automatic espresso machine, and since it has blinking lights that can tell me whether I’m able to get a cup of coffee from it or not, I pay attention to it.

Extending this process, when you move from having appliances in your organisation to having Appliances, it’s critical that they are treated as full blown systems that require the same level of support, administration and consideration when it comes to problem resolution. Or another way to consider it, from a support perspective – if there’s an error happening in your environment, don’t ignore the “black boxes” when it comes to problem diagnosis. This means being aware of at least the following:

  • How to view basic status;
  • How to extract logs;
  • Any caveats to reading logs (e.g., are they time/date stamped using a different GMT offset to your environment?);
  • How to review the logs;
  • How to escalate requests to the Appliance vendor.

Once you’ve been working with Appliances for a while, all of these start to come naturally. The big trick for beginners in the Appliance realm though is to ignore the “black box” you’ve been sold and instead be aware of the components and how to access the diagnostic information for the unit. If you can’t, you’ve created a “black hole” – and that’s not something you’ll get a lot of satisfaction from.

Jul 302010
 

I’m curious as to the differences between using a commercial, supported version of Linux in the enterprise and a non-supported one. Now, I know all the regular arguments – they’re implicitly stated in my article about Icarus Support Contracts.

But here’s the beef: I’m not convinced that commercial Linux companies really offer a safety net. Or to put it another way – they may offer the net, but I’m yet to see much evidence that it’s actually secured to anything. It almost seems a bit like the emperor’s new clothes, and I believe we’re seeing a real surge in popularity of distributions such as CentOS for precisely this reason.

Here’s the sorts of things I’ve commonly seem from customers with commercial enterprise Linux distributions who say, log support cases with the Linux distributor:

  • Being advised to just simply apply the latest patches – OK, sometimes this is valid, but we all treat such recommendations with caution;
  • Being advised to search Google forums, etc.;
  • Being mired in finger pointing hell – it seems that most features or components a company will want to log a case over aren’t covered by the expensive support contracts that come with enterprise/commercial Linux;
  • Getting average and/or highly complicated responses that don’t inspire confidence.

In short, I worry that commercial enterprise Linux distributions provide few tangible benefits over repackaged or alternate distributions.

As proof that I’m serious about this subject, I’ll say something that years ago may have made me apoplectic: Even given how little I like Microsoft’s products, my honest observation is that companies with Microsoft support contracts get substantially more benefit at substantially lower cost than those who have similar support contracts with the enterprise commercial Linux vendors.

So, I’m asking people to convince me I’m wrong – or at least provide counter-arguments! If you’re using a commercial, enterprise Linux, please help me understand what value you get out of their support programmes – examples of problems they’ve solved, and how they’ve proved themselves equal to (or better than) support offerings from either Microsoft or other Unix providers. Any examples/stories that touch on data backup/recovery or storage would be of particular interest.

So feel free to add a comment and let me know what you think!

Jul 072010
 

Clients

The Question

It’s usually the case that the biggest part of a NetWorker environment – in terms of resources that are configured, and software deployed, are the clients themselves. When sites look at upgrading their NetWorker environments though, the normal procedure is to upgrade the server and any storage nodes as the first step, then plan to upgrade clients on an “as needed” or “when we get around to it” basis.

This prompted a customer to recently ask me to write a blog article about this topic (thanks, Robert!) Specifically, Robert’s question was – why should I upgrade my clients?

Having worked with several of my clients now for close to a decade, I’m familiar with the scenario: the servers and storage nodes will be at appropriately supported versions of the NetWorker software, but clients are trailing behind, and before you know it your versions may stretch out like a long tail behind your backup server and storage nodes:

Client versionsSo it begs the question – when NetWorker is so good at supporting older client versions, what’s the rush in upgrading old clients? This is a question where an answer of “…because…?” isn’t sufficient, so perhaps first it’s worthwhile considering some common arguments for not upgrading the clients:

  • If it’s not broken, don’t fix it.
  • We had some problems with version X, it’s stable on X+n, so keep it that way. (A variant of the above.)
  • It’s working, so it’s a low priority task.
  • Admins are too busy fire fighting to do unnecessary upgrades.
  • Change control is too tedious.
  • This is the last supported version for this <old> operating system.

The Answer

The generic answer

Each of the above reasons, in their own right, can be a perfectly valid reason. Temporarily stepping away from backup software and looking at say, operating systems, here’s some example reasons why we eventually choose to upgrade operating systems:

  • We explicitly need the new features.
  • New applications require the new features.
  • Poor support on old OS for new hardware (and vice versa).
  • More efficient.
  • Faster.
  • More secure.

We can evaluate a whole host of  reasons, but we can actually boil any upgrade rationale down to one of the following three generic reasons:

  1. Risk – The risk in not upgrading overrides the cost of upgrading. Two common risks are security or reliability.
  2. Features – The currently installed version lacks features that are both available and required in a newer version available.
  3. Support – The currently installed version is either out of support, or is scheduled to no longer be supported as of a known, unacceptably close date.

Note – regarding features: To be a valid upgrade reason, it should be both available and required, not one or the other – and yes, sometimes upgrades are done based on features being required without first checking if they’re available!

When we boil down upgrade reasons to just three generic terms, risk, features and support, it becomes easier to justify either:

  • Having an active programme in place to keep clients up to date or
  • Periodically updating clients.

So going back to NetWorker clients, we can evaluate what sort of reasons in each of the generic categories might prompt an upgrade; I’m going to go backwards through the previous list.

The NetWorker answer

Support

To me, unsupported = broken. So, “if it’s not broken, don’t fix it” stops being a valid reason at the point where client software installed is no longer supported. So for sites that have v7.3.x and lower clients laying around – or come October 1 2010, v7.4.x and lower clients around, you should either:

  • Upgrade to a supported version or
  • Upgrade to the last supported version that is compatible with the client (for very old clients/applications).

If a client is on an unsupported version of the software and it can be upgraded to a support version, leaving it on that unsupported version can introduce unnecessary risk in the environment. While a current version of NetWorker will more than likely keep communicating with an older version of NetWorker, that doesn’t mean that issues can’t happen, and if they do, you want to be able to resolve the issue as quickly as possible. By having a supported version of the client installed, you can considerably streamline the resolution process.

Features

We have a tendency to focus on the backup server (and to a lesser degree), storage node, when looking for features support. For instance, we may want disk backups to be able to do X, or NDMP backups to be able to do Y, and so on. However, feature support isn’t enhanced only at the server layer. In actual fact, a lot of feature support comes from the client software. For instance:

  • If you’re working with Solaris 10 clients that are deployed in non-global domains, having up-to-date client software ensures that you maximise your support of that configuration;
  • If you’re looking at upgrading a host from Windows 2003 to Windows 2008 R2, you’re likely going to need to upgrade the NetWorker client – you need a newer client instance that has more up to date support for the newer operating systems;
  • If you’re wanting to eliminate no-longer-needed licenses within your backup environment, and are looking at getting rid of those ClientPak licenses, you’ll need to make sure that the clients themselves support the removal of the licenses;
  • If you want to be able to do VSS filesystem backups but not have to buy VSS licenses, you’ll need to have a version of the NetWorker client that supports this option;
  • If you want to replace your Oracle 9 database with Oracle 11, you may find yourself needing to upgrade the database module. This in turn may necessitate an upgrade of the client software to support the newer module, too.

Suffice it to say, feature support can be just as important at the client level as it is at the backup server level. In this regard, the release notes will always be an excellent reference – if you’re not sure whether you need to upgrade, check to see what new functionality comes into play on the latest versions of the software.

Risk

The final reason to upgrade is risk – risk that there is a bug or a security issue in the currently installed version of the software that may be resolved in a newer version. Like “Features”, above, your best bet for determining the risk of not upgrading is by referring to the release notes for newer versions of the software. Read the “fixed issues” notes very carefully; it could be that intermittent issues you haven’t yet found time to investigate – or that you have been actively trying to resolve – are actually resolved in a newer version of the software. While we often look at fixed issues in NetWorker release notes for the server and storage node, they can be equally applicable at the client level, too.

When should clients be upgraded?

Once we’ve determined that we can decide to upgrade clients on the basis of either support, features or risk, we must next ask ourselves the question – when should the clients be upgraded? There’s a sister question to this too – how frequently should clients be upgraded?

I’m not going to suggest that your backup server and all its clients should be kept in absolute version lock-step the entire time. If you have the processes, personnel and time to do this, then by all means go ahead – but it isn’t something that you should obsessively worry about. Instead, I’ll offer some generic suggestions; to do this though I’ll refer to major and significant version numbers. Consider say, NetWorker 7.5 SP2; I’d consider the major version number to be 7, the significant version number to be 5, and the service pack to be 2.

  • Aim to keep all clients that support it on at least the same major version number as the backup server;
  • Where time permits try to get clients on the same (or higher*) major+significant version number as the backup server – but as a general rule, ensure that the clients are at least on a supported major+significant version number.
  • Consider getting clients onto the same major+significant+service pack version as the backup server where there are support, risk or feature reasons, i.e.:
    • Where there are new features in the service pack you need, or,
    • Where there are risks in remaining at the current version, or,
    • Where there are support reasons for updating. (E.g., patch available for new SP that would need to be back-ported to your existing version).

You may think that all these answers are a bit vague – and by necessity, they are, since the issues, needs and processes at each site will govern exactly how and why upgrades are done.


* Yes, or higher. Such as for instance, sites that have been running a NetWorker 7.4.x server, but need to run a 7.5 SP2 client for Windows 2008 R2 systems, etc.

May 032010
 

Less than a month ago, Apple released service pack 3 to Snow Leopard – i.e., 10.6.3. A few days after that they released 10.6.3.1 which was apparently only needed in a few instances, but I downloaded and applied anyway due to some irregularities I’d noticed with my OS after installing the vanilla 10.6.3.

It’s recently occurred to me that NetWorker (7.6) has been a heck of a lot more reliable since going to 10.6.3 / 10.6.3.1. As always, it’s a bit of a grey zone, since it’s not officially supported (and there’s definitely some patching required) – hence the wait for 7.6 SP1, but overall I’m now noticing that the client process remains contactable by the server across multiple sleep/wake and/or location transitions, something that it wouldn’t do before. There’s still some other behavioural oddities, but overall, I realised that I’ve not reinstalled the client on my laptop now for over 2 weeks, which is a bit of a record since I installed Snow Leopard. If you’re in a situation where you absolutely have to be running 10.6 and backing up with NetWorker, and knowing it’s not currently supported, I’d suggest you make sure you’re on 10.6.3.1.

Mar 262010
 

RIP NetWareSearch Networking reports that Novell have finally announced the cessation of legacy installs of NetWare on physical machines ceases at the end of March 2010. It seems now that the only remaining way of getting support for NetWare is as part of a migration project to OES2.

As a backup consultant, my history with NetWare is a spotted one. Particularly within NetWorker, it’s always been at best a real pain in the neck to backup. Sure, you could eventually get it to work, but there’s been gaps in support (particularly the jump from client 4.2 to 7.2) and debugging backup problems on NetWare has never been an enjoyable problem. On the other hand, once you got it working it just kept on working and working and working and working and … well, you get the picture.

While operating systems (or operating environments) came and went with considerable regularity in the early days of computing, it’s not often these days that we say goodbye to an operating system entirely. I’m actually struggling to think of the last unique operating system (as opposed to clone/distribution) that went. It may have even been BeOS.

Mar 252010
 

In the last few days, cumulative patch clusters have been released for the following versions of NetWorker:

  • 7.6 – Patch cluster 7.6.0.3 released.
  • 7.5.2 – Patch cluster 7.5.2.1 released.
  • 7.4.5 – Patch cluster 7.4.5.56 released.

As per usual, these haven’t been released to PowerLink, but can be requested via your authorised support partner. Remember that cumulative patch clusters don’t contain any new features – they’re just accumulated key bug fixes. If you’re having any issues with either your current 7.6.0.x, 7.5.2 or 7.4.5.x install, you may want to talk to your support partner about the fixes included in those cumulative patch clusters.

[Edit – 2010-03-26] Apologies, I meant to say that cumulative patch cluster 7.4.5.6 had been released for the 7.4.5 tree, not 7.4.5.5.