One thing I forgot to mention earlier about the NetWorker 18.1 release is a function called breakthrough logging. The release notes describe it as follows:
NetWorker 18.1 supports the breakthrough logging feature that helps you to understand the steps involved in the various operations such as Save, Recover, and Clone. You can also review the log associated with each step of operation and determine the precise step that failed during the execution of that particular operation. The NetWorker Administration Guide provides details about this feature.
I was curious to see what this was about, and since I had some clients in my home lab that I hadn’t upgraded to NetWorker 18.1 yet, I was also able to get a before/after view.
Breakthrough logging may not be a whoa! product feature like, say, vProxy or an HTML5 interface. In fact, the examples I’ll be giving are mostly command line based. However, the logging also feeds up into the tasks NetWorker executes normally, and whether or not it’s whoa!, it’s one of those options that’ll make debugging an environment so much easier. Coming from a background of NetWorker implementation and support, I think this is honestly a really great feature.
To get an idea of what breakthrough logging does for you, let’s compare the backup of a simple subdirectory on a NetWorker 9.2 client vs the same operation once the client has been upgraded to 18.1. (As you might gather from this: breakthrough logging is a function of client software. You won’t see breakthrough logging for a client that hasn’t been upgraded.)
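For reference, a manual save like the one used in this comparison could be driven from a script. What follows is only a minimal Python sketch: it assumes the `save` binary is on the PATH and uses /root/bin as the directory being backed up. The helper names `build_save_command` and `run_save` are made up for illustration and are not part of NetWorker.

```python
import subprocess
from typing import List, Optional, Tuple


def build_save_command(path: str, server: Optional[str] = None) -> List[str]:
    """Build the argument list for a quiet manual save of one path.

    -q and -LL are the options discussed in this post; -s names the
    NetWorker server and is only added when a server is supplied.
    """
    cmd = ["save", "-q", "-LL"]
    if server is not None:
        cmd += ["-s", server]
    cmd.append(path)
    return cmd


def run_save(path: str, server: Optional[str] = None) -> Tuple[int, str]:
    """Run the save and return (exit code, combined stdout and stderr).

    Hypothetical helper: it needs a NetWorker client installed to do
    anything useful, and it does not interpret the output in any way.
    """
    result = subprocess.run(
        build_save_command(path, server),
        capture_output=True,
        text=True,
        check=False,  # let the caller inspect a non-zero exit code
    )
    return result.returncode, result.stdout + result.stderr
```

Calling `run_save("/root/bin")` on a 9.2 client and again after the upgrade to 18.1 would let you capture both outputs side by side.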
(If you’re not familiar with them, the “-q -LL” options are standard options used when save is automatically invoked. I tend to use them unless I’m looking for file-by-file output for manual saves.)
That’s a fairly straightforward backup – it tells you what’s been backed up when the backup completes and what the status of the backup was.
But what if something goes wrong? How do you know where in the backup process an error occurred? That is precisely what breakthrough logging is about. Let’s look at the same backup on the same client after the client was upgraded to NetWorker 18.1:
Whoa!
That’s a lot more information now. If we compare the first output to the second, the critical thing to look for is where the following line is logged in relation to the rest of the information:
save: /root/bin 11 KB 00:00:01 12 files
In our ‘-q -LL’ output mode, we don’t see any information until the backup is complete; even without those options, the main information you see in a manual save is a line-by-line listing of what’s being backed up.
With breakthrough logging, you’re seeing a complete enumeration of the steps that NetWorker goes through in order to get a backup done – let’s look at those steps:
Step 1 – The save command has started
Step 2 – Confirming the client details, and identifying there’s a backup of that saveset already in the system.
Step 3 – Contact the NetWorker server via its nsrd process to get a handle to the backup target.
Step 4 – Talk to the storage node’s nsrmmd (in this case the NetWorker server).
Step 5 – Actually reading the files from the client for the backup.
Step 6 – Completed, wrapping up.
This may not seem like all that useful information, but that’s because it’s working. So, on the same client, let’s see what happens when I set the storage node to a host that doesn’t have any media in the Backup_01 pool:
Here we get a different result – the client has been configured with a single named storage node, but that storage node doesn’t have access to a Backup_01 pool volume. So NetWorker balks when the request is made: under the current configuration, it can’t allocate a backup target that would allow the backup to take place. But importantly – you can see that the client was able to reach out and communicate successfully with the NetWorker server.
I know that’s a bit of a contrived example, but the point is simple: if you’re sitting at a client trying to diagnose an issue, you’ll automatically have additional information provided to you by NetWorker to help you isolate the problem.
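The idea of isolating the failing step can be sketched in a few lines of Python. This is only an illustration: the stage names below paraphrase the six steps listed earlier in this post and are not NetWorker’s literal log text, and `first_missing_stage` is a hypothetical helper.

```python
from typing import Optional, Sequence

# Stage names paraphrase the six steps described in this post.
# They are assumptions for illustration, NOT NetWorker's actual log lines.
SAVE_STAGES = [
    "save command started",
    "client details confirmed",
    "nsrd contacted on the server",
    "nsrmmd contacted on the storage node",
    "files read from the client",
    "backup completed",
]


def first_missing_stage(stages_seen: Sequence[str]) -> Optional[str]:
    """Return the first expected stage that was not observed.

    Returns None when every stage was seen, i.e. the save ran to completion.
    """
    seen = set(stages_seen)
    for stage in SAVE_STAGES:
        if stage not in seen:
            return stage
    return None
```

For example, if only the first two stages had been observed, `first_missing_stage(SAVE_STAGES[:2])` would return the third stage, telling you where to start looking.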
Breakthrough logging doesn’t just apply to backups – it also works for recoveries and cloning. Let’s look at a recovery first. You get the breakthrough logging information from a command line recovery, but I wanted to show what it looks like through NMC’s Recover option, too. Here’s an example of a completed recovery:
Again, what you’ll see here is that if an error happens (e.g., a comms failure), you’re going to have a much better picture of where things stopped working.
Finally, let’s take a look at cloning. Cloning is normally going to be executed through a policy, in which case you’ll see cloning information percolate up into the NMC monitoring results, and be reported in the appropriate /nsr/logs/policy directory as well. In order to capture a complete cloning session, I picked a simple backup I’d executed for testing and cloned that manually, from the command line. For the purposes of brevity, I’ve just included one screen’s output from the command window:
Again, you’ll see that there’s a lot of information produced as NetWorker articulates its step-through of the internal sequences to get the cloning operation done. In normal operations, you won’t care about this information, and you won’t even bother to go looking for it – you’re just going to want to see a ‘completed successfully’ alert in NMC, Data Protection Central or Data Protection Advisor.
But if something doesn’t work? That’s when this information will come in handy. Is it using old style cloning or new style cloning? Well, we can see it’s invoking nsrrecopy, which means it’s a recover-pipe-to-save clone operation rather than the old-school cloning operation. We can see what storage nodes are getting involved, and where things are up to in the cloning operation. In fact, let’s look at cloning three savesets – though that’ll require a couple of extra screen captures:
We can see there where NetWorker enumerates through each saveset it has to clone and confirms details for the clone of the saveset. Again, if you are debugging an issue this will be a life-saver. No head-scratching about where NetWorker is up to in the operation or what it’s doing: it’s all laid out for you to instantly see.
So here’s my tip: when you upgrade to NetWorker 18.1, make sure you upgrade your clients as well, so you can reap the rewards.
Oh, one more thing…
Check out what’s in NMC 18.1 just under the link to launch the NetWorker server controls:
NMC has had online help for a while, and in NetWorker 9 it transitioned to being web based, but check this out – a link in NMC directly to some new information about NetWorker sizing, and NetWorker best practices. Click on that and you’ll get a web browser open with:
The sizing information is good, of course, but if you keep scrolling down that page, you’ll get to a real nugget:
Firewall information has always been provided, of course, but here, rather than lunging for one or more of the PDF guides to pull out specific firewall port information, you now have an exact list of firewall details: never more than a couple of clicks away.
Breakthrough logging is a decent addition.
Often, a recovery fails with “recover error with 2 files” without highlighting which files they were or what the issue was. I need to check whether 18.x offers any respite.