One of the settings that can be made within a group is the ‘inactivity timeout’ setting. This refers to client inactivity. This is often erroneously considered to be a group timeout setting, but it’s not.

Now, to start with, the architecture of having a client inactivity timeout setting is, I believe, flawed, and should be addressed by adding heartbeat functionality between the NetWorker server and the client backup process*.

There are a plethora of situations that don’t fall into client inactivity. These include:

  • Blocking IO call failing (can happen to just about any product)
  • Saveset initiation request sent but not responded to. (This is a tricky one to define – that seems to be the point where the failure happens, but it’s almost impossible to diagnose.)
  • Backup server’s bootstrap/index:server saveset waiting for media on the backup server.

There have been various attempts to fix these situations over the years – for instance, most recently there were patches introduced into the 7.5 service pack stream to try to prevent a situation where a group would hang on startup probe. As is always the case with hanging situations, it’s difficult to say for sure whether those potential issues were well and truly dealt with.

What remains clear though, and it’s really important to remember, is that just because you’ve set a “client inactivity” timeout within a group doesn’t guarantee that the group will time out after a certain period of inactivity. I.e., it doesn’t excuse you from confirming on a daily basis whether your groups have finished or not.

Monitoring can be achieved a few different ways:

  • Literally checking each group that is still running in NMC at a certain point in the day and determining whether it should be running or if it is hung.
  • Paying special attention to savegroup completion reports that show the group as “aborted, already running” (though that means missing a hung group for around 24 hours).
  • Scripting a check and alert for still-running groups – like the NMC option, but automated.
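The scripted option might look something like the following minimal sketch. It only shows the decision logic: how the list of still-running groups is gathered is deliberately left out, and the `RunningGroup` record, its `expected_hours` field and the `find_overdue` function are hypothetical names for illustration, not anything provided by NetWorker.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class RunningGroup:
    """A group that is still running (hypothetical record, not a NetWorker structure)."""
    name: str
    started: datetime
    expected_hours: float  # how long the administrator expects this group to take


def find_overdue(groups, now):
    """Return the names of groups that have been running longer than expected."""
    overdue = []
    for group in groups:
        if now - group.started > timedelta(hours=group.expected_hours):
            overdue.append(group.name)
    return overdue


if __name__ == "__main__":
    now = datetime(2010, 4, 20, 9, 0)
    groups = [
        RunningGroup("Servers", datetime(2010, 4, 19, 21, 0), expected_hours=8),
        RunningGroup("Desktops", datetime(2010, 4, 20, 6, 0), expected_hours=6),
    ]
    for name in find_overdue(groups, now):
        print(f"ALERT: group {name} is still running and may be hung")
```

A group flagged this way is not necessarily hung; the alert is simply the automated equivalent of looking at NMC and asking whether the group should still be running.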

It would be great to say that there should never be a case where a group hangs and doesn’t complete, but I recognise this is one of those things that’s difficult to program, and in actual fact is almost impossible to guarantee. Could it be handled better? Undoubtedly; it’s just I’m enough of a pragmatist to know that it’s never going to be perfect.

The catch-cry of the backup administrator should be “constant vigilance!” As I’ve discussed previously in posts about enacting zero error policies, it’s not about trying to configure a “set and forget” system where there’ll never be an issue, it’s about always having your finger on the pulse and never, ever accepting that there will be regular alerts for “events-that-look-like-errors-but-you-know-they’re-not”.

So while the client inactivity timeout in a group will save you from some mundane aspects of group administration, it won’t let you ignore monitoring your groups for unexpected states.

__________
* By flawed, I mean:

Currently the backup process works as follows:

  1. Server instructs client to start backing up
  2. Client starts sending data to appropriate storage node/nsrmmd process
  3. If client fails to send any data for ‘inactivity timeout’ minutes, backup is considered to have failed, and restart is run if necessary.

This doesn’t suit situations where there’s a dense filesystem walk taking place, and in fact it really, really should work as follows:

  1. Server instructs client to start backing up
  2. Client starts sending data to appropriate storage node/nsrmmd process
  3. Every X (e.g., 90) seconds or so when no data has been sent, the storage node/nsrmmd process asks the client if the save is still running.
  4. If the client responds within X seconds, keep waiting.

That’s the sort of heartbeat mechanism that should be used…
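As a rough illustration of the proposed flow (not of how NetWorker actually behaves), the loop could be sketched as below. The two callables stand in for “did any data arrive in the last interval?” and “did the client answer the heartbeat in time?”; both, along with the function name and the explicit failure branch, are assumptions added for the sake of the example.

```python
import time


def wait_for_save(data_received, save_still_running, interval=90, sleep=time.sleep):
    """Wait on a client until the save finishes or the client stops answering.

    data_received()      -> "data", "done" or "idle" for the last interval (hypothetical)
    save_still_running() -> True if the client answered the heartbeat in time (hypothetical)
    """
    while True:
        state = data_received()
        if state == "done":
            return "completed"
        if state == "idle" and not save_still_running():
            # No data and no heartbeat reply: only now is the save treated as failed.
            return "failed"
        # Either data is flowing, or the client confirmed it is still working
        # (for example, walking a dense filesystem), so keep waiting.
        sleep(interval)
```

The point of the design is that silence on the data channel alone never fails the backup; it only triggers a question to the client.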

 

As I mentioned in an earlier post, EMC have announced on their community forum that there are some major changes on the way for ADV_FILE devices. In this post, I want to outline in a little more detail why these changes are important.

Volume selection criteria

One of the easiest changes to describe is the new volume selection criteria that will be applied. Currently regardless of whether it is backing up to tape, virtual tape, or ADV_FILE disk devices, NetWorker uses the same volume selection algorithm – whenever there are multiple volumes that could be chosen, it always picks volumes to write to in order of labeled date, from oldest to most recent. For tapes (and even virtual tapes), this selection criteria makes perfect sense. For disk backup units though, it’s seen administrators constantly “fighting” NetWorker to reclaim space from disk backup volumes in that same labeling order.

If we look at say, four disk backup units, with the used capacity shown in red, this means that NetWorker currently writes to volumes in the following order:

[Figure: Current volume selection criteria]

So it doesn’t matter that the first volume picked also has the highest used capacity – in actual fact, the entire selection criteria is geared around trying to fill volumes in sequence. Again, that works wonderfully for tapes, but it’s terrible when it comes to ADV_FILE devices.

The new selection criteria for ADV_FILE devices, according to EMC, is going to look like the following:

[Figure: Improved volume selection criteria]

So, recognising that it’s sub-optimal to fill disk backup units, NetWorker will instead write to volumes in order of least used capacity. This change alone will remove a lot of the day to day management headaches of ADV_FILE devices from backup administrators.
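The difference between the two orderings can be shown with a small sketch. The `Volume` record, the volume names, and the label dates and usage figures are all made up for illustration; only the two sort keys (label date versus used capacity) come from the description above.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Volume:
    """A disk backup volume (hypothetical record for illustration)."""
    name: str
    labeled: date
    used_pct: int


def current_order(volumes):
    """Current behaviour as described: oldest label date first."""
    return sorted(volumes, key=lambda v: v.labeled)


def new_order(volumes):
    """Announced behaviour for ADV_FILE as described: least used capacity first."""
    return sorted(volumes, key=lambda v: v.used_pct)


if __name__ == "__main__":
    volumes = [
        Volume("DBU.001", date(2009, 1, 5), used_pct=90),
        Volume("DBU.002", date(2009, 2, 9), used_pct=40),
        Volume("DBU.003", date(2009, 3, 2), used_pct=65),
        Volume("DBU.004", date(2009, 4, 6), used_pct=10),
    ]
    print("current:", [v.name for v in current_order(volumes)])
    print("new:    ", [v.name for v in new_order(volumes)])
```

With these example figures, the current ordering starts with the fullest volume, while the new ordering starts with the emptiest one.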

Dealing with full volumes

The next major change coming is dealing with full volumes – or alternatively, you may wish to think of it as dealing with savesets whose size exceeds that of the available space on a disk backup unit.

Currently if a disk backup unit fills during the backup process, whatever saveset is being written to that unit just stays right there, hung, waiting for NetWorker staging to kick in and free space before it will continue writing. This resembles the following:

[Figure: Dealing with full volumes]

As every NetWorker administrator who has worked with ADV_FILE devices will tell you, the above process is extremely irritating as well as extremely disruptive. Further, this only works in situations where you’re not writing one huge saveset that literally exceeds the entire formatted capacity of your disk backup unit. So in short, if you’ve previously wanted to backup a 6TB saveset, you’ve had to have disk backup units that were more than 6TB in size, even if you would naturally prefer to have a larger number of 2TB disk backup units. (In fact, the general practice has been when backing up to ADV_FILE devices to ensure that every volume can fit at least two of your largest savesets on it, plus another 10%, if you’re using the devices for anything other than just intermediate-staging.)

Thankfully the coming change will see what we’ve been wanting in ADV_FILE devices for a long time – the ability for a saveset to just span from one volume it has filled across to another. This means you’ll get backups like:

[Figure: Disk backup unit spanning]

This will avoid situations where the backup process is effectively halted for the duration of staging operations, and it will allow for disk backup units that are smaller than the size of the largest savesets to be backed up. This in turn will allow backup administrators to very easily schedule in disk defragmentation (or reformatting) operations on those filesystems that suffer performance degradation over time from the mass write/read/delete operations seen by ADV_FILE devices.
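To make the spanning idea concrete, here is a minimal sketch of a saveset being spread across several disk backup units. It is a model of the behaviour described, not of NetWorker’s implementation: the function name, the volume names and the assumption that volumes are simply used in the order given are all illustrative.

```python
def span_saveset(saveset_gb, free_gb_by_volume):
    """Spread one saveset across volumes, filling each before moving to the next.

    free_gb_by_volume is a list of (volume_name, free_gb) pairs, in the order
    the volumes would be picked (assumed here; not taken from NetWorker).
    Returns a list of (volume_name, gb_written) pairs.
    """
    remaining = saveset_gb
    placement = []
    for name, free in free_gb_by_volume:
        if remaining == 0:
            break
        written = min(free, remaining)
        if written > 0:
            placement.append((name, written))
            remaining -= written
    if remaining > 0:
        # Even with spanning, the total free space still has to be sufficient.
        raise ValueError(f"{remaining} GB of the saveset has nowhere to go")
    return placement


if __name__ == "__main__":
    # The 6TB saveset from the example above, written to three 2TB units.
    print(span_saveset(6000, [("dbu1", 2000), ("dbu2", 2000), ("dbu3", 2000)]))
```

Without spanning, the same saveset would need a single disk backup unit larger than 6TB.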

Other changes

The other key changes outlined by EMC on the community forum are:

  • Change of target sessions:
    • Disk backup units currently have a default target parallelism of 4, and a maximum target parallelism setting of 512. These will be reduced to 1 and 32 respectively (and of course can be changed by the administrator as required), so as to better enforce round-robining of capacity usage across all disk backup units. This is something most administrators will end up doing by default, but it’s a welcome change for new installs.
  • Full thresholds:
    • The ability to define a %full threshold at which point NetWorker will cease writing to one disk backup unit and start writing to another. Some question whether this is useful, but I can see a couple of edge-case usage scenarios. First, as a way of allowing different pools to share the same filesystem, making better use of capacity, and secondly, in situations where a disk backup unit can’t be a dedicated filesystem.

When we add all these changes up, ADV_FILE type devices are going to be back in a position where they’ll give VTLs a run for their money on cost vs features. (With the possible exception being the relative ease of device sharing under VTLs compared to the very manual process of SAN/NAS sharing of ADV_FILE devices.)

 

Normally you don’t want to be in this position, but sometimes you’ll strike a situation where the only possible location of data that you need to get back is in a saveset that aborted (i.e., failed) during the backup process. Now, if the saveset/media is almost completely hosed, you’re probably going to need to recover using the scanner|uasm process, but if it was just a case of a failed backup, you can direct a partial saveset recovery using the recover command.

When you’re at this point the first thing you need to do is find the saveset ID of the aborted saveset, but I’ll leave that as an exercise to the reader. Now, once you’ve got the aborted saveset ID, it’s as simple as running a saveset recovery. The basic command might look like this:

C:\> recover -d path -s buServer -iN -S ssid

Where:

  • ‘path’ is the path that you want to recover to. Note that in these situations, it’s usually a very, very good idea to make sure you recover to somewhere new, rather than overwriting any existing files.
  • ‘buServer’ is the backup server that you want to recover from.
  • ‘ssid’ is the saveset ID for the aborted saveset that you want to recover from.

Depending on whether you’re doing a directed recovery, etc., you may end up with a few additional arguments, but the above is fairly much what you need in this situation. (If you’re confident that a specific path or file you want back is going to be in the part of the saveset backed up, you can always add that path at the end of the recovery command, too.)

Once the recovery runs, you’ll get a standard file-by-file listing of what is being recovered, but the recovery will end with what looks like an error – it’s effectively though just a notification that NetWorker has hit the data that was ‘in transit’, so to speak, when the saveset was aborted. This error will look similar to the following:

5041:recover: Unable to read checksum from save stream

16294:recover: Encountered an error recovering C:\temp2\Temp\744\win_x86\networkr\hba\emc-homebase-agent-6.1.2-win-x86.exe

53363:recover: Recover of rsid 851692923 failed: Error receiving files from NSR server `tara'

The process cannot access the file because it is being used by another process.

Received 231 matching file(s) from NSR server `tara'

Recover errors with 1 file(s)

Recover completion time: 4/20/2010 3:41:12 PM

At that point, you know that you’ve got back all the data you’re going to get back, and you can search through the recovered files for the data you want.


 

Having observed Oracle’s strategy now for a while since the acquisition of Sun, and after many discussions with quite a few customers, it’s clear that Oracle is gunning for the “entire vertical” model. Quite simply, their primary focus appears now to be selling an entire solution to a company, starting with the low level storage, and extending all the way to the application tier.

There’s nothing wrong with that kind of strategy – so long as it doesn’t alienate companies who want to buy piecemeal.

Oracle however are most definitely alienating the educational institutions. Several institutions I deal with now have policies that require end-of-life Sun kit to be replaced with comparable Linux or Windows solutions, unless an absolute rock-solid business case can be built.

With educational support rapidly eroding, Solaris as a tent-pole Unix platform is well and truly dead.

[Original post below, 22-04-2010]

I was told yesterday that one of the changes Oracle has wrought at Sun is the killing of all educational discount programmes. Apparently while they’re still listed on the Sun websites, they’re unavailable. Another fascinating change is collapsing support programmes from multiple levels of varying cost to one single level.

From a Unix perspective, I grew up on Solaris, and I’ve always seen the Unix world split into two camps. On one side you’ve got HPUX and AIX, dominated by ‘smit’, and the other side was led by Solaris, with Tru64 close behind.

The HPUX and AIX approach to Unix has always been an interesting one. It’s about rigid controls, and it’s appealed to formal environments and procedure-oriented enterprises. I still remember a comment made by a senior BHP IT manager when I still worked in Newcastle. I’m modifying it slightly so that this article doesn’t get mired down in faeces:

In BHP IT Melbourne if you want to go to the toilet, you hold a meeting about it. In BHP IT Wollongong if you want to go to the toilet, you consult a huge procedure about doing it. In BHP IT Newcastle, you do it in the corridor while you keep going with your work.

I know, it’s a wee crass, but as much as anything I saw it as a statement about the platforms in use at the time – particularly Newcastle vs Wollongong. You see, the Newcastle Unix team was dominated by Solaris, with a few Tru64 boxes and a couple of HPUX boxes (hell, even a couple of AT&T boxes). The Wollongong team was dominated by AIX.

What it comes down to is that the administrator mentality behind Solaris is all about free thinking. Not in a hippy sort of way, but in a “hey, here’s a Unix. Do whatever the hell you like with it” sort of way. It’s the result of having year after year of students at Universities using Solaris because that was the cheapest and most flexible Unix for the Universities to deploy. In this case by “free thinking” I’m not referring to any OSS ideals, but to the notion that it’s a full Unix that isn’t constrained by what the vendor feels you should do with it.

I’m taking a roundabout way of getting there, but I think the greatest damage Oracle is doing to Solaris is making the entire Sun platform less attractive to educational markets. People tend to stick to the platforms they learn at University – at least for a while – and so the overall Sun educational discount programme has always been a very clever one: hook them while they’re still learning, teach them that they can use the platform for whatever they want, and they’ll keep coming back to it once they’re out in the work force. This becomes a very powerful drag-sales method. Graduates come out of University looking for jobs on the platforms they have experience with. As they become team leaders or middle managers, they continue to advocate those platforms unless there’s a very strong reason to void the emotional attachment they have to a platform. Net result? The discounts to the educational market are recouped through the full prices in the commercial market. (I’d suggest that at a desktop/laptop level, Apple has been working at this now for some time, and it’s starting to build momentum that Microsoft will have trouble halting.)

Oracle clearly don’t understand that drag-sales model as it has applied with Sun. By killing off educational discount programmes for the entire Sun platform and making the Solaris operating system more costly to install and support, they’re eroding the “use it for anything” base market and mind share that has always been so critical to the continuing popularity of Solaris. I’m sure HP and IBM are both very pleased with this new direction that Oracle is taking. Oracle is making the jobs of HP and IBM sales people that much easier.

If Oracle lock in their current strategy and force this change, Solaris as a tent-pole Unix platform is dead. Somehow, I doubt Oracle would even care.

 

I had been aware for a while from an NDA conversation that these changes were on the way, but of course have not been able to discuss them.

However, with EMC opening up discussion on the EMC Community Forum – i.e., out in public, I now feel that I can at least discuss how excited I am about the coming ADV_FILE changes.

For some time I’ve railed against architectural failings in ADV_FILE devices, and explained why those failings have led me to advocate the use of VTLs over ADV_FILE devices. As announced on this thread in the forums by Paul Meighan, many of those architectural limitations are soon going to be relegated to the software evolutionary junkpile. In particular, EMC have stated in the forum article that the following changes are on the way:

  1. Volume selection criteria becomes intelligent. NetWorker currently uses the same volume selection criteria for disk backup as it does for tapes. This means that the oldest labelled volume with free space on it always gets picked first, and subsequent volumes get picked following this strategy. This has meant that backup administrators have continually fought a running battle to keep the original disk backup units staged more regularly than others. Instead, NetWorker will now pick ADV_FILE volumes in order of maximum capacity free, which will free a lot of backup administrators from the overall pain of day to day capacity management.
  2. Savesets can span advanced file type devices. Finally, the gloves are off! With the ability to have savesets cease writing to one disk backup unit and move over to another, NetWorker ADV_FILE devices will be able to serve as a scalable and transparent storage pool, and backups will flow from one device to another in exactly the way they always should have.
  3. Session changes. To reflect round-robining best practices, the default target sessions for disk backup units will drop from 4 to 1.

When we add together the first two changes, we get powerful enhancements in NetWorker’s disk backup functionality. Do other products already do this? Yes, and I’m not suggesting that NetWorker is the first to do this, but it’s fantastic to finally see this functionality coming into play.

Until this point, NetWorker has suffered the continual challenge with disk backup of constant administrative overheads and trying to plan in advance the best possible space allocation technique for disk backup filesystems. Once these changes come into play: no more challenge on either of these fronts.

Folks, this is big. Yes, these changes should have come a long time ago, but I’m not going to let the delay get in the way of being damn grateful that they’re finally coming.

 

For quite a while I worked under the assumption that you could do the following with directives in NetWorker:

<< /path >>
+skip: *.mp3
<< /path/subpath/criticalpath >>
forget

The logic of this is that it should be possible to skip files in one directory, but forget that directive in a lower directory and thus be able to still backup files matching a particular criteria in a subpath.

Recent discussions on the NetWorker mailing list left me questioning whether I was correct in my assumption. I thought I’d tested it long ago, but the discussions on the list (and the tests that I did) seemed to indicate this wasn’t the way NetWorker worked.

It turns out I was testing incorrectly. Instead of testing with an exact specification such as the above, I was testing “lazily”:

<< /path >>
+skip: *
<< /path/subpath/criticalpath >>
forget

The mistake that I made was in testing with “*” rather than the “*.mp3” of my actual use case scenario. In short:

  • Obviously skipping “*” will result in NetWorker determining that everything is being skipped, at which point there is no need to continue to traverse any directory path beneath the point in which the “skip *” is encountered.
  • However, if just skipping a particular pattern, then NetWorker will have to continue to traverse all subdirectories from the path it encounters the skip command, meaning that the forget directive will still be honoured at a deeper directory path.

So I wasn’t wrong about my long-term belief, I just tested incorrectly.

This does mean that you can use skip, followed by forget, so long as your skip isn’t too open in its selection criteria.
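The behaviour described above can be modelled in a few lines. This is a deliberately simplified model written to match the explanation, not NetWorker’s directive parser: the `is_backed_up` function, the dictionary form of the directives and the file names are all hypothetical, and the assumption that a skip pattern is tested against each directory entry’s name (so that “*” also swallows subdirectories) is mine.

```python
import fnmatch


def is_backed_up(path, directives):
    """Decide whether one file is backed up under a simplified directive model.

    directives maps a directory to ("skip", pattern) or ("forget", None).
    """
    parts = path.strip("/").split("/")
    active = []   # skip patterns inherited from parent directories ("+skip")
    current = ""
    for i, name in enumerate(parts):
        # An entry (file or subdirectory) whose name matches an inherited
        # skip pattern is dropped, so nothing beneath it is ever visited.
        if any(fnmatch.fnmatchcase(name, pat) for pat in active):
            return False
        if i == len(parts) - 1:
            return True   # the file itself survived every check
        current += "/" + name
        action = directives.get(current)
        if action is not None:
            kind, pattern = action
            if kind == "skip":
                active.append(pattern)
            elif kind == "forget":
                active = []   # drop everything inherited so far


if __name__ == "__main__":
    wanted = "/path/subpath/criticalpath/song.mp3"
    exact = {"/path": ("skip", "*.mp3"), "/path/subpath/criticalpath": ("forget", None)}
    lazy = {"/path": ("skip", "*"), "/path/subpath/criticalpath": ("forget", None)}
    print(is_backed_up(wanted, exact))  # forget is reached, file is kept
    print(is_backed_up(wanted, lazy))   # "subpath" itself is skipped first
```

In this model the “lazy” test fails for exactly the reason given above: the walk never gets as far as the directory carrying the forget directive.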

 

The challenge

Recently a customer asked me if it is possible to install and use Networker on Opensolaris. Opensolaris itself is an open-source operating system based on the well-known Solaris. Opensolaris has some unique features such as ZFS (which offers features such as on-the-fly compression and on-the-fly deduplication) and COMSTAR (which enables the operating system to export its storage via FC-SAN and iSCSI).

Although Networker is not yet certified for Opensolaris (there is an open RFE to do that), it is certified for Solaris. So I installed the most recent version at that time, 7.5.2, with pkgadd on Opensolaris build 134, and the installation ran as expected.

On first start nothing happened. It turned out nsrexecd requires two ssl libraries missing on opensolaris:

admin@opensolaris:/# ldd /usr/sbin/nsrexecd
libcommonssl.so =>       /usr/lib/nsr/amd64/libcommonssl.so
libc.so.1 =>     /lib/64/libc.so.1
libssl.so.0.9.7 =>       NOT FOUND
libcrypto.so.0.9.7 =>    NOT FOUND
libmp.so.2 =>    /lib/64/libmp.so.2

Checking the files, it turned out the libraries themselves are there but the version number does not match: nsrexecd requires 0.9.7, while opensolaris ships with 0.9.8 (i.e., newer). So I linked the existing files to the 0.9.7 names that nsrexecd expects. Checking again yielded:

admin@opensolaris:/# ldd /usr/sbin/nsrexecd
libcommonssl.so =>       /usr/lib/nsr/amd64/libcommonssl.so
libc.so.1 =>     /lib/64/libc.so.1
libssl.so.0.9.7 =>       /lib/64/libssl.so.0.9.7
libcrypto.so.0.9.7 =>    /lib/64/libcrypto.so.0.9.7
libmp.so.2 =>    /lib/64/libmp.so.2

So from the library dependency point of view everything looked good and nsrexecd was able to start as well.

The next step involved an attempt to start a local save job:

admin@opensolaris:/# save /etc
61261:save: Failed initialize ports from nsrexecd on "opensolaris"
39078:save: RAP error: Service not available.
4196:save: Failed to get port range from local nsrexecd: Service not available.
3817:save: Using networker-server as server

/etc
/etc/hosts
[...]

A few error messages, but that was expected for the first save.

In a second step I tried to start a job from the networker server itself. This job failed entirely. Looking at the logs, it seemed nsrexecd was not started on the client. So I (re)started nsrexecd on the client and initiated the save job from the server a second time. Nothing changed. The server complained about being unable to connect to the client.

On the client, nsrexecd was no longer running. That was even stranger because I had just restarted the process prior to starting the backup.

On subsequent tests I noticed nsrexecd dies every time I invoke a save job – even a local save job.

So I did some tests with debugging turned on:

admin@opensolaris:/# nsrexecd -D9
lg_stat(): Calling stat64().
[....]
[....]
0 1270119900 2 0 0 1 654 0 opensolaris nsrexecd 2 %s 1 0 68 Attempting to register 390113 (vers 1) service with portmapper (111)
0 1270119900 2 0 0 1 654 0 opensolaris nsrexecd 2 %s 1 0 60 Successfully registered service 390113 with portmapper (111)
0 1270119900 2 0 0 1 654 0 opensolaris nsrexecd 2 %s 1 0 23 mondaemon_check count 1
0 1270119900 2 0 0 1 654 0 opensolaris nsrexecd 2 %s 1 0 16 checking file ..
0 1270119900 2 0 0 1 654 0 opensolaris nsrexecd 2 %s 1 0 17 checking file ...
0 1270119900 2 0 0 1 654 0 opensolaris nsrexecd 2 %s 1 0 18 checking file sec.
0 1270119900 2 0 0 1 654 0 opensolaris nsrexecd 2 %s 1 0 26 checking file nsrladb.lck.
0 1270119900 2 0 0 1 654 0 opensolaris nsrexecd 2 %s 1 0 30 checking file product.res.lck.
0 1270119900 2 0 0 5 654 0 opensolaris nsrexecd 2 %s 1 0 28 lg_open(): Calling open64().
0 1270119900 2 0 0 5 654 0 opensolaris nsrexecd 2 %s 1 0 28 lg_open(): Calling open64().
0 1270119900 2 0 0 1 654 0 opensolaris nsrexecd 2 %s 1 0 28 @(#) Product:      NetWorker
0 1270119900 2 0 0 1 654 0 opensolaris nsrexecd 2 %s 1 0 34 @(#) Release:      7.5.2.Build.452
0 1270119900 2 0 0 1 654 0 opensolaris nsrexecd 2 %s 1 0 22 @(#) Build number: 452
0 1270119900 2 0 0 1 654 0 opensolaris nsrexecd 2 %s 1 0 47 @(#) Build date:   Thu Feb  4 22:35:03 PST 2010
0 1270119900 2 0 0 1 654 0 opensolaris nsrexecd 2 %s 1 0 29 @(#) Build arch.:  sol10amd64
0 1270119900 2 0 0 1 654 0 opensolaris nsrexecd 2 %s 1 0 53 @(#) Build info:   DBG=0,OPT=-O2 -fno-strict-aliasing
0 1270119900 2 0 0 1 654 0 opensolaris nsrexecd 2 %s 1 0 35 clu_is_cluster_host_lc(): ENTRY ...
0 1270119900 2 0 0 5 654 0 opensolaris nsrexecd 2 %s 1 0 28 lg_open(): Calling open64().
0 1270119900 2 0 0 5 654 0 opensolaris nsrexecd 2 %s 1 0 28 lg_open(): Calling open64().
0 1270119900 2 0 0 5 654 0 opensolaris nsrexecd 2 %s 1 0 30 lg_lstat(): Calling lstat64().

When starting a save job (either locally or remotely) nsrexecd dies:

0 1270119985 2 0 0 2 654 0 opensolaris nsrexecd 2 %s 1 0 33 Found 390113 program on port 7937
0 1270119985 2 0 0 1 654 0 opensolaris nsrexecd 2 %s 1 0 27 mondaemon_kill_check: entry
0 1270119985 2 0 0 2 654 0 opensolaris nsrexecd 2 %s 1 0 33 Found 390436 program on port 9327
0 1270119985 2 0 0 3 654 0 opensolaris nsrexecd 2 %s 1 0 84 RPC Authentication: RPCSEC_GSS negotiated GSS Legato as the authentication mechanism
0 1270119985 2 0 0 6 654 0 opensolaris nsrexecd 2 %s 1 0 53 auth_thread_inc_count(): 1 child threads are running.
0 1270119985 2 0 0 6 654 0 opensolaris nsrexecd 2 %s 1 0 21 clu_is_virthost:ENTRY
0 1270119985 2 0 0 6 654 0 opensolaris nsrexecd 2 %s 1 0 26 input hostname=opensolaris
0 1270119985 2 0 0 6 654 0 opensolaris nsrexecd 2 %s 1 0 43 clu_is_virthost():EXIT unknown cluster type
0 1270119985 2 0 0 6 654 0 opensolaris nsrexecd 2 %s 1 0 26 clu_is_localvirthost:ENTRY
0 1270119985 2 0 0 6 654 0 opensolaris nsrexecd 2 %s 1 0 26 input hostname=opensolaris
0 1270119985 2 0 0 6 654 0 opensolaris nsrexecd 2 %s 1 0 48 clu_is_localvirthost():EXIT unknown cluster type
0 1270119985 2 0 0 6 654 0 opensolaris nsrexecd 2 %s 1 0 26 clu_is_localvirthost:ENTRY
0 1270119985 2 0 0 6 654 0 opensolaris nsrexecd 2 %s 1 0 24 input hostname=127.0.0.1
0 1270119985 2 0 0 6 654 0 opensolaris nsrexecd 2 %s 1 0 48 clu_is_localvirthost():EXIT unknown cluster type
0 1270119985 2 0 0 6 654 0 opensolaris nsrexecd 2 %s 1 0 109 Failed to get user rights: Could not find authentication information for daemon number: 0, daemon instance: 0
0 1270119985 2 0 0 6 654 0 opensolaris nsrexecd 2 %s 1 0 26 clu_is_localvirthost:ENTRY
0 1270119985 2 0 0 6 654 0 opensolaris nsrexecd 2 %s 1 0 28 input hostname=192.168.180.2
0 1270119985 2 0 0 6 654 0 opensolaris nsrexecd 2 %s 1 0 48 clu_is_localvirthost():EXIT unknown cluster type
0 1270119985 2 0 0 6 654 0 opensolaris nsrexecd 2 %s 1 0 28 lg_open(): Calling open64().
0 1270119985 2 0 0 6 654 0 opensolaris nsrexecd 2 %s 1 0 74 Adding ssnchnl:     session id = 2  ssn (pointer) = f62570  ops = 57e1a0    fd = 13
0 1270119985 2 0 0 6 654 0 opensolaris nsrexecd 2 %s 1 0 30 lg_lstat(): Calling lstat64().
0 1270119985 2 0 0 6 654 0 opensolaris nsrexecd 2 %s 1 0 69 RPC Authentication: admin/opensolaris@ authenticated using GSS Legato
0 1270119985 2 0 0 6 654 0 opensolaris nsrexecd 2 %s 1 0 70 RPC Authentication: Non-encrypted channel negotiated for ip: 127.0.0.1
0 1270119985 2 0 0 6 654 0 opensolaris nsrexecd 2 %s 1 0 39 Channel exited with status: (unknown) 0
0 1270119985 2 0 0 6 654 0 opensolaris nsrexecd 2 %s 1 0 39 Removing ssnchnl:   ssn = f62570    fd  = 13
0 1270119985 2 0 0 6 654 0 opensolaris nsrexecd 2 %s 1 0 53 auth_thread_dec_count(): 0 child threads are running.
Segmentation Fault (core dumped)

Further tests showed that backups initiated locally run more or less successfully (with some error messages) and are indeed recoverable. Backups initiated remotely do not work because nsrexecd crashes.

Analyzing the core dump left behind by nsrexecd yields:

admin@opensolaris:/nsr/cores/nsrexecd# pstack core
core 'core' of 654:     nsrexecd -D9
-----------------  lwp# 1 / thread# 1  --------------------
fffffd7fff21f783 t_delete () + 33
fffffd7fff21f42e realfree () + 5e
fffffd7fff21fbe2 cleanfree () + 52
fffffd7fff21ee61 _malloc_unlocked () + a1
fffffd7fff21ed86 malloc () + 2e
fffffd7fff2063be calloc () + 46
fffffd7ffe2c5291 netconfig_dup () + 21
fffffd7ffe2c4139 getnetconfigent () + d1
fffffd7ffe2de3e7 __rpc_getconfip () + 28f
fffffd7ffe2b85e1 getipnodebyname () + 29
fffffd7ffe338c88 get_addr () + 138
fffffd7ffe338883 _getaddrinfo () + 493
fffffd7ffe338b24 getaddrinfo () + c
000000000051258b lg_inet_pton () + 6b
0000000000475bb3 is_addr_match () + 33
0000000000475caf ???????? ()
00000000004f7c89 _authenticate_varp () + 1c9
00000000004f533d svc_dispatch_varp () + bd
00000000004f54b1 svc_getreq_poll_varp () + c1
000000000046b339 nsrexec_svc () + 449
000000000046f471 main () + 10a1
000000000045aafc _start () + 6c
-----------------  lwp# 2 / thread# 2  --------------------
fffffd7fff28dbba __pollsys () + a
fffffd7fff22bcca poll () + 62
000000000051d1ce lg_poll () + e
0000000000469a3c ???????? ()
000000000051e5a3 ???????? ()
fffffd7fff284ae4 _thrp_setup () + bc
fffffd7fff284da0 _lwp_start ()

Using mdb:

admin@opensolaris:/nsr/cores/nsrexecd# mdb /usr/sbin/nsrexecd core
Loading modules: [ libc.so.1 ld.so.1 ]
> $C
fffffd7fffde0810 libc.so.1`t_delete+0x33()
fffffd7fffde0840 libc.so.1`realfree+0x5e()
fffffd7fffde0880 libc.so.1`cleanfree+0x52()
fffffd7fffde08b0 libc.so.1`_malloc_unlocked+0xa1()
fffffd7fffde08d0 libc.so.1`malloc+0x2e()
fffffd7fffde08f0 libc.so.1`calloc+0x46()
fffffd7fffde0920 libnsl.so.1`netconfig_dup+0x21()
fffffd7fffde0950 libnsl.so.1`getnetconfigent+0xd1()
fffffd7fffde0990 libnsl.so.1`__rpc_getconfip+0x28f()
fffffd7fffde0a20 libnsl.so.1`getipnodebyname+0x29()
fffffd7fffde0b90 libsocket.so.1`get_addr+0x138()
fffffd7fffde0c40 libsocket.so.1`_getaddrinfo+0x493()
fffffd7fffde0c50 libsocket.so.1`getaddrinfo+0xc()
fffffd7fffde0cc0 lg_inet_pton+0x6b()
fffffd7fffde0e30 is_addr_match+0x33()
fffffd7fffde0e60 0x475caf()
fffffd7fffde0ea0 _authenticate_varp+0x1c9()
fffffd7fffde8f20 svc_dispatch_varp+0xbd()
fffffd7fffdf8fa0 svc_getreq_poll_varp+0xc1()
fffffd7fffdfcad0 nsrexec_svc+0x449()
fffffd7fffdffcd0 main+0x10a1()
fffffd7fffdffce0 _start+0x6c()

So from my first observations the crash has something to do with memory allocation/reallocation and with network functions (based on “netconfig_dup”). Due to my limited knowledge of libc and its internal functions I was unable to dig deeper.

Unsatisfied with the current state (locally initiated backups and recoveries work, remotely initiated ones don’t), I tried several things:

  • Networker client 7.6
  • Networker client 7.6.1
  • Disabling IPv6
  • Using dependent libraries from Solaris 10 x86
  • and so on

But without success. nsrexecd kept crashing.

Due to a mistake I accidentally installed 7.4.5 and to my surprise it worked fine – even remote save jobs are running perfectly smooth.

I have not yet checked if the newer ssl libraries are causing the problem. Judging from the error stack trace I would tend to say so.

Conclusion

Although officially unsupported by EMC, networker client 7.4.5 works fine on Opensolaris. Even using ZFS as the file system is supported (it has been since 7.3.2).

Using version 7.5.x or 7.6.x causes nsrexecd to crash, making remotely initiated saves impossible while locally initiated jobs run fine.

So if you need to back up your opensolaris-based system, the author recommends using networker client 7.4.5 over 7.5.x or 7.6.x.

About the author

Ronny Egner is working as a freelancer focused on Oracle databases, UNIX operating systems and EMC / Legato Networker. He is based in Germany (Europe) and is available for projects all over the world. His blog can be found at http://blog.ronnyegner-consulting.de.

 

In the previous article, I covered the first five of ten reasons why tape is still important. Now, let’s consider the other five reasons.

6. Tape is greener for storage

Offline storage of tape is cheap, from an environmental perspective. Depending on your locality, you may not even have to keep the storage area air-conditioned.

Disk arrays and replicated backup server clusters don’t really have the notion of offline options. Even if they’re using MAID, the power consumption for the pseudo-offline part of the storage will be higher than that for unpowered, inactive tape.

7. Replicated tape is cheaper than replicated disk

And by “replicated tape” I mean cloning. Having clones of your tapes is a cheaper option than running a system with full replication. Full replication requires similar hardware configurations on both sides of the replica; cloning a tape requires – another tape. That’s a lot cheaper, before you even look at any link costs.

8. Done right, tapes are the best form of thin provisioning you’ll get

Thin provisioning is big, since it’s an inherent part of the “cloud” meme at the moment. Time your purchases correctly and tape will be the best form of thin provisioning within your enterprise environment.

9. Tape is more fault tolerant than an array

Oh, I know you’ve got the chuckles now, and you think I’ve gone nuts. Arrays are highly fault tolerant – looking at RAID alone, if your disk backup environment is a suite of RAID-6 LUNs, then for each LUN you can withstand two disk failures. But let’s look at longer term backups – those files that you’ve backed up multiple times. Some would argue that these shouldn’t be backed up multiple times, but that’s an argument that doesn’t translate well down into the smaller enterprises and corporates. Sure, big and rich companies can afford deduplicated archiving solutions, but smaller companies have to make do with the traditional approach: weekly fulls kept for 5 or 6 weeks, and monthly fulls kept for anywhere between 1 and 10 years. Those companies will have the luxury of a potentially large number of copies of any individual file. The net result? Perhaps as much as 50% of longer term recoveries will be extremely fault tolerant – if the March tape fails, go back to the February tape, or the January tape, or the December tape, etc. This isn’t something you really want to rely on, but it’s always worth keeping in mind regardless.

10. Tape is ideally suited for lesser RTO/RPOs

Sure, if you have RTOs and RPOs that demand near instant recovery with minimum data loss, you’re going to need disk. But when we look at the cheapness of tape, and practically all of the other items we’ve discussed, the cost of deploying a disk backup system to meet non-urgent RPOs and RTOs seems at best a case of severe overkill.

 

I’ve now opened forums on the primary nsrd.info site. While I’d initially planned to try to set up an alternate style of forums to others that are out there, it’s clear from discussions I’ve been having with various people that a good old traditional forum approach is actually the best way – particularly since many of the visitors who arrive at the blog each day come on the back of questions about NetWorker!

I’d encourage as many people as possible to jump across to the forums. Obviously being brand new, there may be some bugs in the system that we’ll need to work out, but let me know if you strike some issue that we need to address. (Over time I’ll be looking for some additional forum moderators.)

While I’ve not quite decided how to do it yet, over time as questions get answered on forums I’d like to have a repository of “known solutions” available, so please, if you use the forums to ask a question and someone gives you a great solution, return and make a note of it so that over time we can build up a really useful system.

You can access the forums at nsrd.info/forum.

 

Various companies will spin you their “tape is dead” story, and I’m the first to admit that the use pattern for tapes is evolving, but to anyone who claims that tape has lost its relevance, I’ll argue otherwise.

This is part 1 of a 2 part article, and we’ll cover reasons 1 through 5 here.

1. Tape is cheap

Comparatively, tape is still significantly cheaper than disk. In AUD, from end-resellers you can buy individual LTO-4 cartridges (800GB native) for $50. Even at a discount price, in Australia you’ll still pay around $90 to $110 for a 1TB drive (the closest comparison).
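As a rough worked version of that comparison, the sketch below normalises the prices quoted above to AUD per native terabyte. The function name `cost_per_tb` and the use of decimal units (1 TB = 1,000 GB) are assumptions made for illustration; the prices are simply the ones mentioned in this article.

```python
def cost_per_tb(price_aud: float, capacity_gb: float) -> float:
    """Return the price in AUD per native terabyte (assumes 1 TB = 1,000 GB)."""
    return price_aud / (capacity_gb / 1000)


# Prices and capacities as quoted in the text above.
lto4 = cost_per_tb(50, 800)         # LTO-4 cartridge, 800GB native
disk_low = cost_per_tb(90, 1000)    # 1TB drive, discount price
disk_high = cost_per_tb(110, 1000)  # 1TB drive, upper end of the range

print(f"LTO-4 cartridge: ${lto4:.2f} per TB")
print(f"1TB disk drive:  ${disk_low:.2f} to ${disk_high:.2f} per TB")
```

On those figures the cartridge comes to $62.50 per TB against $90 to $110 per TB for the drive. This only compares the media itself, not the hardware needed to read or host it.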

2. Tape is offline

If your backup server is using traditional backup to disk and is infected by a destructive virus or trojan, you can lose days, weeks, months or perhaps even years of backups.

No software, no matter how destructive (unless we’re talking Skynet levels of destruction) is going to be able to reach out from your infected computers and destroy media that’s sitting outside of your tape libraries. It’s just not going to happen. There’s a tonne of more likely scenarios that you’d need to worry about first before getting down to this scenario.

3. You can run a tape out of a burning building

Say you’ve bought the “tape is dead” argument, and all your backups are either on a VTL, a standard array for disk backup, or some multi-cluster centralised storage system (e.g., a RAIN as per Avamar). But you’re a comparatively small site, and so you have to buy the replication system in a future budget.

Then your datacentre catches on fire. Good luck with grabbing your array or cluster of backup servers and running out the building with them. On the other hand, that nearby company that also caught fire but stuck with tape had their administrator snatch last night’s backup tape out of the library and run out of their building.

Sure, the follow up response is that you should have replicated VTLs or replicated arrays or replicated dedupe clusters, etc., but it’s not uncommon to see smaller sites buy into the “tape is dead” solution and not do any replication – planning to get budget for it in, say, the second year of deployment, or when that colocation datacentre goes ahead in the “sometime later” timeframe.

4. Tapes have better offline bandwidth

Need to get a considerable amount of data (e.g., hundreds of terabytes, maybe even petabytes) from one location to another? Unless you can stream data across your links at hundreds of megabytes per second (still a while away for any reasonable corporate entity), you’re going to have better luck with sending your data via tapes rather than disks. Lighter and more compact than disks, let alone disk arrays, your capacity per cubic metre is going to be considerably higher with tape than it is with disk.

Think I’m joking? Let’s look at the numbers. Say you’ve got a cubic metre of shipping space available; let’s see which option – tape or disk – gives you the most capacity.

An LTO cartridge is 10.2cm x 10.54cm x 2.15cm. That means in 100cm x 100cm x 100cm, you can fit 9 x 9 x 46 cartridges, which comes to a grand total of 3,726 units of media. Using LTO-5 for our calculations, that’s a native capacity of 5,589 TB per cubic metre. Of course, that’s without squeezing additional media in the remaining space, but I’ll leave that up to people with more math skills than I.

A typical 3.5″ internal form-factor drive (using the 1.5TB Seagate Barracuda drive for comparison) is 10.2cm x 14.7cm x 2.6cm. In a cubic metre, you’ll fit 9 x 6 x 38 disk drives, or 2,052 drives. Using 2TB drives (currently the highest capacity), you’ll get 4,104 TB per cubic metre.

So on the TB per cubic metre front, tape wins by almost 1,500 TB.

Looking at weight – we start to see some big differences here too. The average LTO cartridge (using LTO-4 from IBM as our “average”) is 200 grams. A cubic metre of them will be 745.2 kg. That Seagate Barracuda I quoted before though weighs in at 920 grams – so for a cubic metre of disk drive capacity, you’re looking at 1,887.84 kg. There’s a tonne of difference there!

Tape wins on that sort of high capacity offline data transfer without a doubt.
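The arithmetic above can be reproduced with a short script. This is only a sketch under the same simplifying assumptions used in the text: every unit is stacked in one fixed orientation, leftover space is ignored, and the capacities are 1.5TB native for LTO-5 and 2TB for the disk drive. The function name `units_per_cubic_metre` is made up for this example.

```python
import math


def units_per_cubic_metre(dims_cm, box_cm=100.0):
    """Count whole units that fit in a cube, stacked in one fixed orientation."""
    return math.prod(int(box_cm // side) for side in dims_cm)


# Dimensions (cm) and weights (grams) as quoted in the text.
tape_units = units_per_cubic_metre((10.2, 10.54, 2.15))  # 9 x 9 x 46
disk_units = units_per_cubic_metre((10.2, 14.7, 2.6))    # 9 x 6 x 38

tape_tb = tape_units * 1.5   # LTO-5 native capacity per cartridge
disk_tb = disk_units * 2.0   # 2TB per drive

tape_kg = tape_units * 200 / 1000  # 200 g per cartridge
disk_kg = disk_units * 920 / 1000  # 920 g per drive

print(f"Tape: {tape_units} cartridges, {tape_tb:,.0f} TB, {tape_kg:,.2f} kg")
print(f"Disk: {disk_units} drives, {disk_tb:,.0f} TB, {disk_kg:,.2f} kg")
print(f"Capacity advantage for tape: {tape_tb - disk_tb:,.0f} TB")
```

Changing the dimensions or per-unit capacities makes it easy to rerun the comparison for other media generations.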

5. Storage capacity of a tape system is not limited by physical datacentre footprint

If you’ve got a disk array, there’s an absolute limit to how much data you can store in it that (as much as anything) is determined by its physical footprint. If you fill it and need to add more storage, you need to expand its footprint.

Tape? Remove some cartridges, put some more in. Your offline physical footprint will grow of course – but if we’re talking datacentres, we’re talking a real, tangible cost per cubic metre of space. Your tape library of course will occupy a certain amount of space, but its storage capabilities are practically limitless regardless of its size, since all you have to do is pull full media out and replace it with empty media. Offline storage space will usually cost up to an order of magnitude less than datacentre space, so disk arrays just can’t keep up on this front.

Reasons 6 through 10 will be published soon.

© 2012 The NetWorker Blog