The virtue of persistence

I have to say, I’m really liking NetWorker’s ability to work with persistent binding on tape libraries now.

If you’re not aware of persistent binding, it exists to resolve a problem whereby on some platforms (such as Windows and Linux), device re-ordering can happen across reboots. For a long time there have been ways of preventing this from being an issue for filesystems/LUNs – for instance, on Linux most filesystem types support mounting a filesystem via a unique label or UUID. For example, this is a typical entry from /etc/fstab on a CentOS install:

LABEL=/boot     /boot       ext3    defaults        1 2

This allows the OS to mount the /boot filesystem regardless of whether it’s on /dev/sda, /dev/sdb, /dev/whatever.
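
To make the distinction concrete, here is a minimal Python sketch that classifies the first field of an fstab entry as either a persistent identifier (LABEL= or UUID=) or a plain device path. The function name `fstab_source_kind` is made up for illustration, and the parsing is deliberately simplistic (whitespace-separated fields only).

```python
def fstab_source_kind(line):
    """Classify the first field of an fstab entry.

    Returns ("LABEL", value) or ("UUID", value) for persistent identifiers,
    ("DEVICE", path) for anything else, and None for blank or comment lines.
    """
    fields = line.split()
    if not fields or fields[0].startswith("#"):
        return None
    spec = fields[0]
    for tag in ("LABEL=", "UUID="):
        if spec.startswith(tag):
            return (tag.rstrip("="), spec[len(tag):])
    return ("DEVICE", spec)


# The entry quoted above: mounted by label, so device order does not matter.
entry = "LABEL=/boot     /boot       ext3    defaults        1 2"
print(fstab_source_kind(entry))

# A hypothetical entry that depends on the disk always being /dev/sda.
print(fstab_source_kind("/dev/sda1 /boot ext3 defaults 1 2"))
```

Only the second style of entry is at the mercy of device ordering, which is exactly the situation persistent binding addresses for tape.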

Persistent binding allows us to do the same thing with tape as the above does with disk. The advantage of this is obvious: when NetWorker configures a tape library, it maps element order to device paths –

[root@linuxvtl ~]# sjisn 3.0.0
Serial Number data for 3.0.0 (SPECTRA  PYTHON):
	Library:
		Serial Number: XYZZY_A   
		SCSI-3 Device Identifiers:
			ATNN=SPECTRA PYTHON
			WWNN=50223344AB000000
	Drive at element address 1:
		SCSI-3 Device Identifiers:
			ATNN=IBM     ULT3580-TD1  XYZZY_A1  
	Drive at element address 2:
		SCSI-3 Device Identifiers:
			ATNN=IBM     ULT3580-TD1  XYZZY_A2  
	Drive at element address 3:
		SCSI-3 Device Identifiers:
			ATNN=IBM     ULT3580-TD1  XYZZY_A3  
	Drive at element address 4:
		SCSI-3 Device Identifiers:
			ATNN=IBM     ULT3580-TD1  XYZZY_A4

If you can’t see the mappings there, don’t worry – we can see that via inquire – for example:

scsidev@3.1.0:IBM   ULT3580-TD1   550V|Tape, /dev/nst0
	S/N:	XYZZY_A1  
	ATNN=IBM     ULT3580-TD1     XYZZY_A1  
	WWNN=50223344AB000100

That information works well for conventional situations where there’s no risk of a device re-ordering.

When there’s a risk of re-ordering, however, the above style of configuration doesn’t work – or if it does, it doesn’t keep working across reboots. To avoid the problem entirely, we instead do a configuration that uses more exact identification details. Typically, this is in a fibre-channel scenario, and that means WWNs.

We can access device details referencing WWNs via the persistent binding mode – inquire -p, and jbconfig -p.

Let’s look first at inquire -p:

[root@linuxvtl ~]# inquire -p
scsidev@3.0.0:SPECTRA PYTHON    |Autochanger (Jukebox),
/dev/tape/by-id/scsi-350223344ab000000
		S/N:	XYZZY_A   
		ATNN=SPECTRA PYTHON
		WWNN=50223344AB000000
scsidev@3.1.0:IBM   ULT3580-TD1 550V|Tape,
/dev/tape/by-id/scsi-350223344ab000100-nst
		S/N:	XYZZY_A1  
		ATNN=IBM     ULT3580-TD1     XYZZY_A1  
		WWNN=50223344AB000100
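
If you want to use this information in a script, the output is regular enough to parse. The following Python sketch assumes the layout shown in the sample above (a `scsidev@` header line, the device path on the same or the next line, then S/N and WWNN lines); `parse_inquire` is a hypothetical helper, not part of NetWorker, and real output may differ between versions and platforms.

```python
import re

# Sample text copied from the inquire -p output above.
SAMPLE = """\
scsidev@3.0.0:SPECTRA PYTHON    |Autochanger (Jukebox),
/dev/tape/by-id/scsi-350223344ab000000
\t\tS/N:\tXYZZY_A
\t\tATNN=SPECTRA PYTHON
\t\tWWNN=50223344AB000000
scsidev@3.1.0:IBM   ULT3580-TD1 550V|Tape,
/dev/tape/by-id/scsi-350223344ab000100-nst
\t\tS/N:\tXYZZY_A1
\t\tATNN=IBM     ULT3580-TD1     XYZZY_A1
\t\tWWNN=50223344AB000100
"""


def parse_inquire(text):
    """Collect address, device path, serial and WWNN for each device."""
    devices = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        header = re.match(r"scsidev@(\d+\.\d+\.\d+):", line)
        if header:
            current = {"address": header.group(1), "path": None,
                       "serial": None, "wwnn": None}
            devices.append(current)
        if current is None:
            continue  # ignore anything before the first device header
        path = re.search(r"(/dev/\S+)", line)
        if path and current["path"] is None:
            current["path"] = path.group(1)
        elif line.startswith("S/N:"):
            current["serial"] = line[4:].strip()
        elif line.startswith("WWNN="):
            current["wwnn"] = line[5:].strip()
    return devices


for dev in parse_inquire(SAMPLE):
    # In this sample the WWNN is embedded in the persistent path.
    embedded = dev["wwnn"].lower() in dev["path"]
    print(dev["address"], dev["path"], dev["wwnn"], embedded)
```

In the sample, each device's WWNN appears (in lower case) inside its /dev/tape/by-id path, which is what ties the path to the hardware rather than to discovery order.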

If we run jbconfig -p, the output and run-scenario look a little different, because it’s referencing WWN-based paths rather than standard /dev/nst* paths:

[root@linuxvtl ~]# jbconfig -p

Jbconfig is running on host linuxvtl (Linux 2.6.18-128.el5),
  and is using linuxvtl as the NetWorker server.

	 1) Configure an AlphaStor Library.
	 2) Configure an Autodetected SCSI Jukebox.
	 3) Configure an Autodetected NDMP SCSI Jukebox.
	 4) Configure an SJI Jukebox.
	 5) Configure an STL Silo.

What kind of Jukebox are you configuring? [1] 2
14484:jbconfig: Scanning SCSI buses; this may take a
while ... 
These are the SCSI Jukeboxes currently attached to your
system:
  1) 350223344ab000000: Spectralogic
  2) 350223344ab000800: Spectralogic
Which one do you want to install? 1
Installing 'Spectralogic' jukebox - 350223344ab000000.

What name do you want to assign to this jukebox device? VTL1
15814:jbconfig: Attempting to detect serial numbers on the
jukebox and drives ...

15815:jbconfig: Will try to use SCSI information returned by
jukebox to configure drives.

Turn NetWorker auto-cleaning on (yes / no) [yes]? no

The following drive(s) can be auto-configured in this
jukebox:
 1> LTO Ultrium @ 3.1.0 ==>
/dev/tape/by-id/scsi-350223344ab000100-nst
 2> LTO Ultrium @ 3.2.0 ==>
/dev/tape/by-id/scsi-350223344ab000200-nst
 3> LTO Ultrium @ 3.3.0 ==>
/dev/tape/by-id/scsi-350223344ab000300-nst
 4> LTO Ultrium @ 3.4.0 ==>
/dev/tape/by-id/scsi-350223344ab000400-nst
These are all the drives that this jukebox has reported.

To change the drive model(s) or configure them as
shared or NDMP drives, 
 you need to bypass auto-configure. Bypass
auto-configure? (yes / no) [no] no

Jukebox has been added successfully
...

Once a library has been configured with persistent binding, the device access paths logically become different. On Windows, you’ll get device path names of the form \\.\TapeN, where N is something along the lines of 2^31 minus a small number (\\.\Tape2147483646, for instance); on Linux, the paths will vary depending on the install – for instance, CentOS may give a different result than Oracle Unbreakable Linux. On Linux, the device paths will also explicitly reference the WWNs:

[root@linuxvtl ~]# nsrjb -v
<snip>
drive 1 (/dev/tape/by-id/scsi-350223344ab000100-nst) slot :   
drive 2 (/dev/tape/by-id/scsi-350223344ab000200-nst) slot :   
drive 3 (/dev/tape/by-id/scsi-350223344ab000300-nst) slot :   
drive 4 (/dev/tape/by-id/scsi-350223344ab000400-nst) slot :
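
To see why these names survive a re-ordering, here is a small Python sketch. It assumes the /dev/tape/by-id entries are udev-managed symlinks to the kernel's /dev/nst* nodes, as they are on typical Linux installs; the helper names are invented for illustration, and the demo uses a temporary directory rather than touching /dev.

```python
import os
import tempfile


def resolve_persistent_paths(by_id_dir):
    """Map each entry in by_id_dir to the node it currently points at."""
    mapping = {}
    for name in sorted(os.listdir(by_id_dir)):
        link = os.path.join(by_id_dir, name)
        mapping[link] = os.path.realpath(link)
    return mapping


def demo():
    """Simulate two boots in which the kernel hands out nst0/nst1 differently."""
    names = ["scsi-350223344ab000100-nst", "scsi-350223344ab000200-nst"]
    results = []
    with tempfile.TemporaryDirectory() as root:
        by_id = os.path.join(root, "by-id")
        os.mkdir(by_id)
        for node in ("nst0", "nst1"):
            # Empty files standing in for the real device nodes.
            open(os.path.join(root, node), "w").close()
        # First boot: drives discovered in order; second boot: order swapped.
        for order in (("nst0", "nst1"), ("nst1", "nst0")):
            for name, node in zip(names, order):
                link = os.path.join(by_id, name)
                if os.path.lexists(link):
                    os.remove(link)
                os.symlink(os.path.join(root, node), link)
            resolved = resolve_persistent_paths(by_id)
            results.append({os.path.basename(k): os.path.basename(v)
                            for k, v in resolved.items()})
    return results


for boot, mapping in enumerate(demo(), start=1):
    print("boot", boot, mapping)
```

The WWN-based names stay the same across both simulated boots while the nodes they resolve to swap, so a configuration that stores only the WWN-based name keeps addressing the same physical drive.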

While this makes referencing individual tape drives a little more fiddly, it has the distinct advantage that across multiple reboots, the library remains fully operable and all devices accessible – a very, very small price to pay. There is, indeed, virtue in persistence. If you want to read more about NetWorker and persistent binding, check out the whitepaper about it available on PowerLink.

5 thoughts on “The virtue of persistence”

  1. With the Linux 2.6 kernel you don’t have a choice. You have to use the persistent naming provided by udev. Recent HBA drivers no longer support persistent binding via assigning a target number to a target port. In recent Linux distros there’s no need to write your own udev rules.

  2. I’m just facing a funny different problem with persistent binding – dual attached FC drives…. just curious if you know..
    – does it make sense to make DDS between 2 FC connected drives on one storage node? It _works_, however it seems to me that if there is a path failure and the storage node tries to use the ‘bad’ path, the drive fails to Service mode, yet it remains reserved and will be unusable anyway.. or does it get ‘unreserved’ automatically upon entering service mode? I don’t think so..
    – how do you make it so that you don’t end up using just one of the FC ports on your server… the RedHat udev has buggy settings, so you don’t get tape/by-path (which are horribly illegible anyway) – I am thinking of making my own rules that might use port wwn instead of serial number – this way I would get 2 such devices in /dev/tape/by-id and I could combine them to use both host FC ports. Do you have any other idea how to do it?

    1. Hi Andy,

      Do you mean in the first instance to use some form of DDS between the two paths to a single FC connected drive?

      I’ve generally found NetWorker (and for that matter, a lot of other backup products) to not cope very well with multipathed drives; I’d instead design a solution that doesn’t depend on absolutely getting to any one particular drive.

      In relation to the second question – that’s going to be a zoning or access grouping situation. If each drive has an independent WWN, then at the SAN level, zone particular drives to see only one initiator on the host, and zone the other drives to see the other initiator. On Data Domain, you can instead do this with access groups in the VTL.

      Cheers,
      Preston.

  3. Yes, that’s what I mean – DDS between 2 paths on one storage node. There’s actually no reason why it shouldn’t work (and as far as I tested, it works) – and it’s the right way to do it (OS multipathing is close to impossible for streaming devices). However, I suspect there is no benefit – how does NetWorker cope with failed path to some DDS device on some storage node? Are other storage nodes able to continue using the device or is administrator action required?

    As for zoning…each drive has 2 FC ports, each of them having separate port wwn and they probably share common node wwn. We are running 4 IBM 3592 Gen-3 drives (times 2 – we have 2 libraries) and I would probably saturate one FC port if I try to run it all over one port – the DDS would make this randomized which would be nice again…… ok, I could probably ask the customer to put half of the drives on one FC port and the other half on the second… this would defy the dual-port tape drives idea, so I don’t quite like it. Will see what to do 🙂

  4. better to call it “persistent naming” to differentiate between persistent binding on hba level. The EMC Networker whitepaper also jumps from binding to naming back and forth.

    Persistent naming as implemented in the IBM Ultrium drivers helped tremendously in Windows environments, where persistent binding on hba level in the same environment still could not handle SAN or drive issues when more than one drive was zoned to a system, still causing device files to change their names during reboots or scans for new hardware…

    Good thing that Networker incorporates persistent naming also, however the standard naming conventions do not communicate that well. nst0 becomes scsi-350223344ab000100-nst on Linux or \\.\Tape0 becomes \\.\Tape2147483646 on Windows.

    We’re on the verge of starting to use the IBM driver’s persistent naming on AIX also, where you have the freedom to rename a tape drive’s device file to any name that you desire.
