Debugging autochanger operations

Having a strong support background, and having a preference for working in software rather than hardware, when I use NetWorker’s nsrjb utility with tape libraries I like to get as much feedback as possible on operations.

When performing command line operations with nsrjb, this is relatively straight forward – just add on a few “-v” options to the operation. Indeed, my customers sometimes probably tire of hearing me start every nsrjb command with:

# nsrjb -vvv

It’s become such a force of habit that I literally don’t consider running it without those three verbose flags any more*. If you’re not sure why you would do this, or have never run nsrjb with verbose flags, you should go and do it – you get a lot more information about the progress of the operation, which can actually be very important if you’re debugging or otherwise trying to resolve hanging hardware operations.

Being able to run nsrjb in verbose mode is all well and good, but it doesn’t help you diagnose periodic or random failures if they’re only happening during backup operations. To keep track of progress on these, you need to use a setting within the jukebox resource – the debug trace level option.

This updates the daemon.raw log file with extra details about autochanger operations, in a similar way to the additional details you get when you run nsrjb with additional verbosity flags. It can be set to a number, and like the debug option for a lot of NetWorker commands, defaults to 0 (for no additional information), and can be set to ever larger numbers that generate a lot of extra information. It doesn’t generate the same information as running nsrjb with multiple “-v” flags, but it still helps.

Now, before I go further, I’d suggest that this is probably something you shouldn’t leave enabled all the time. That is, if you’re having an intermittent problem with a tape library, consider turning this on so that you have the option of reviewing log files from overnight backups to see if there’s any relevant/helpful information from the time of failures. However, when you finish your debugging, turn it back off by setting it to zero.

There’s two ways you can set it – either from the command line, using nsradmin, or from NetWorker Management Console when viewing a jukebox resource with “diagnostic” details turned on.

Let’s look at nsradmin first:

# nsradmin -s nox
NetWorker administration program.
Use the "help" command for help, "visual" for full-screen mode.
nsradmin> show name:; debug trace level:
nsradmin> print type: NSR jukebox
                        name: LTO1_LIB;
           debug trace level: 0;
nsradmin> update debug trace level: 9
           debug trace level: 9;
Update? y
updated resource id 31.0.255.110.0.0.0.0.60.6.146.73.0.0.0.0.10.0.0.1(5348)
nsradmin> print
                        name: LTO1_LIB;
           debug trace level: 9;

It’s as simple as that!

If you want to see it from within NMC, you’d make the changes in the “Debug Trace Level” field, as per the screen-shot below:

Advanced tab of Autochanger properties
Advanced tab of Autochanger properties

(The debug trace level setting is the bottom-most option on the left-hand side of this tab.)

You’ll find that debug level of 9 will more than likely be more than enough for any regular fault finding exercise, as the amount of information produced by this is quite high. For instance, at the bottom of this post I’ve got output from a single operation, “nsrjb -lvvv -S 1” to the volume in slot one into a drive.

Remember when you’re done of course to turn the trace level back off by setting the debug trace level to zero.

nox nsrlcpd getenv_ulong(NSR_LCPD_BASEELEMOFFSET=0,0:65535, 0) returning 0
nox nsrlcpd getenv_ulong(NSR_LCPD_NO_LOAD_EXP=0,0:1, 0) returning 0
nox nsrlcpd getenv_ulong(NSR_LCPD_NOVUELMASC=0,0:1, 0) returning 0
nox nsrlcpd getenv_ulong(NSR_LCPD_AUTOEJECT=0,0:1, 0) returning 0
nox nsrlcpd getenv_ulong(NSR_LCPD_AUTOEXT=0,0:1, 0) returning 0
nox nsrlcpd getenv_ulong(NSR_LCPD_AUTOEXT2=0,0:1, 0) returning 0
nox nsrlcpd getenv_ulong(NSR_LCPD_RES_DRVACCESS=0,0:1, 0) returning 0
nox nsrlcpd Event LE_CONTINUE raised for cmdid 16055
nox nsrlcpd Processing: cmdid 16055 & cmdop 12 for event: LE_CONTINUE
nox nsrlcpd Event LE_COMP_OK raised for cmdid 16055
nox nsrlcpd Responder woke up due to event
nox nsrlcpd LCPD_HDR: server:nox.anywebdb.com vers:1 jbid:LTO1_LIB cmdid:16055 is_master=yes cmdop:UPDATE.
nox nsrlcpd ncomplete :no
nox nsrlcpd n
nox nsrmmgd Loading volume `Clone.001' from slot `1' into device `/dev/nst0'.
nox nsrlcpd LCPD_HDR: server:nox.anywebdb.com vers:1 jbid:LTO1_LIB cmdid:16056 is_master=no cmdop:LOAD.
nox nsrlcpd     MOVE MEDIA INFO
nox nsrlcpd         source : S:000031
nox nsrlcpd         source attrlist : NULL
nox nsrlcpd         dest   : D:000001
nox nsrlcpd Event LE_NEWCMD raised for cmdid 16056
nox nsrlcpd Processing: cmdid 16056 & cmdop 4 for event: LE_NEWCMD
Unable to render the following message: freeing unused errinfo with msgid 14947

nox nsrlcpd Event LE_CONTINUE raised for cmdid 16056
nox nsrlcpd Processing: cmdid 16056 & cmdop 4 for event: LE_CONTINUE
nox nsrlcpd lg_lstat(): Calling lstat64().
nox nsrlcpd lg_open(): Calling open64().
nox nsrlcpd lcpd_last_status_set: OK->OK (requested OK).
nox nsrlcpd Event LE_CONTINUE raised for cmdid 16056
nox nsrlcpd Processing: cmdid 16056 & cmdop 4 for event: LE_CONTINUE
nox nsrlcpd lg_lstat(): Calling lstat64().
nox nsrlcpd lg_open(): Calling open64().
nox nsrlcpd LCPD_HDR: server:nox.anywebdb.com vers:1 jbid:LTO1_LIB cmdid:16057 is_master=no cmdop:STATUS.
nox nsrlcpd Responder woke up due to event
nox nsrlcpd LCPD_HDR: server:nox.anywebdb.com vers:1 jbid:LTO1_LIB cmdid:16057 is_master=yes cmdop:STATUS.
nox nsrlcpd pid:7606 lcp_rpc_vers:1, jb_list_len:1
nox nsrlcpd n
nox nsrlcpd lcpd_last_status_set: OK->OK (requested OK).
Unable to render the following message: freeing unused errinfo with msgid 14947

nox nsrlcpd Event LE_CONTINUE raised for cmdid 16056
nox nsrlcpd Processing: cmdid 16056 & cmdop 4 for event: LE_CONTINUE
nox nsrlcpd Event LE_COMP_OK raised for cmdid 16056
nox nsrlcpd Responder woke up due to event
nox nsrlcpd LCPD_HDR: server:nox.anywebdb.com vers:1 jbid:LTO1_LIB cmdid:16056 is_master=yes cmdop:LOAD.
nox nsrlcpd ncomplete :no
nox nsrlcpd offset:0    tag:S:000031    ok:yes full: no barcode:       ? dest:D:000001 bay:?
nox nsrlcpd n
nox nsrlcpd offset:0      tag:D:000001    ok:yes full:yes barcode:       ? source:S:000031 serial#:? bay:?
nox nsrlcpd n
nox nsrd /dev/nst0 Verify label operation in progress
nox nsrd /dev/nst0 Mount operation in progress
nox nsrlcpd LCPD_HDR: server:nox.anywebdb.com vers:1 jbid:LTO1_LIB cmdid:16058 is_master=no cmdop:STATUS.
nox nsrlcpd Responder woke up due to event
nox nsrlcpd LCPD_HDR: server:nox.anywebdb.com vers:1 jbid:LTO1_LIB cmdid:16058 is_master=yes cmdop:STATUS.
nox nsrlcpd pid:7606 lcp_rpc_vers:1, jb_list_len:1
nox nsrlcpd n
nox nsrd [Jukebox `LTO1_LIB', operation # 30]. Finished with status: succeeded
nox nsrlcpd Processing: cmdid 4294967295 & cmdop 17 for event: LE_NEWCMD
nox nsrlcpd Event ? raised for cmdid 4294967295
nox nsrlcpd Processing: cmdid 4294967295 & cmdop 17 for event: ?
nox nsrlcpd lg_lstat(): Calling lstat64().
nox nsrlcpd lg_open(): Calling open64().
nox nsrlcpd lcpd_last_status_set: OK->OK (requested OK).
nox nsrlcpd Polling current hw status jukebox:LTO1_LIB status:OK
nox nsrlcpd Event ? raised for cmdid 4294967295
nox nsrlcpd Processing: cmdid 4294967295 & cmdop 17 for event: ?
nox nsrlcpd Event LE_COMP_OK raised for cmdid 4294967295


* With the one exception being when it’s used in scripting.

debug trace level option.

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.