Something that continues to periodically come up is the need to remind people running manual staging to ensure they specify both the SSID and the Clone ID when they stage. I did some initial coverage of this when I first started the blog, but I wanted to revisit and demonstrate exactly why this is necessary.
The short version of why is simple: If you stage by SSID alone, NetWorker will delete/purge all instances of the saveset other than the one you just created. This is Not A Good Thing for 99.999% of what we do within NetWorker.
So to demonstrate, here’s a session where I:
- Generate a backup
- Clone the backup to tape
- Stage the saveset to tape, specifying the saveset ID only
In between each step, I’ll run mminfo to get a dump of what the media database says about saveset availability.
Part 1 – Generate the Backup
Here’s a very simple backup for the purposes of this demonstration, and the subsequent mminfo command to find out about the backup:
[root@tara ~]# save -b Default -LL -q /etc
save: /etc 106 MB 00:00:07 2122 files
completed savetime=1258093549
[root@tara ~]# mminfo -q "client=tara.pmdg.lab,name=/etc" -r volume,ssid,cloneid,savetime
volume          ssid        clone id    date
Default.001     2600270829  1258093549  11/13/2009
Default.001.RO  2600270829  1258093548  11/13/2009
There’s nothing out of the ordinary here, so we’ll move onto the next step.
Part 2 – Clone the Backup
We’ll just do a manual clone to the Default Clone pool. Here we’ll specify the saveset ID alone, which is fine for cloning – but it is often what gets people into the habit of not specifying a particular saveset instance. I’m using very small VTL tapes, so don’t be worried that in this case I’ve got a clone of /etc spanning 3 volumes:
[root@tara ~]# nsrclone -b "Default Clone" -S 2600270829
[root@tara ~]# mminfo -q "client=tara.pmdg.lab,name=/etc" -r volume,ssid,cloneid,savetime
volume          ssid        clone id    date
800843S3        2600270829  1258094164  11/13/2009
800844S3        2600270829  1258094164  11/13/2009
800845S3        2600270829  1258094164  11/13/2009
Default.001     2600270829  1258093549  11/13/2009
Default.001.RO  2600270829  1258093548  11/13/2009
As you can see there, it’s all looking fairly ordinary at this point – nothing surprising is going on at all.
Part 3 – Stage by Saveset ID Only
In this next step, I’m going to stage by saveset ID alone, rather than specifying the saveset ID/clone ID (which is the correct way of staging), to demonstrate what happens at the conclusion of the staging. I’ll be staging to a pool called “Big”:
[root@tara ~]# nsrstage -b Big -v -m -S 2600270829
Obtaining media database information on server tara.pmdg.lab
Parsing save set id(s)
Migrating the following save sets (ids):
        2600270829
5874:nsrstage: Automatically copying save sets(s) to other volume(s)
Starting migration operation...
Nov 13 17:34:00 tara logger: NetWorker media: (waiting) Waiting for 1 writable volume(s) to backup pool 'Big' disk(s) or tape(s) on tara.pmdg.lab
5884:nsrstage: Successfully cloned all requested save sets
5886:nsrstage: Clones were written to the following volume(s):
        BIG991S3
6359:nsrstage: Deleting the successfully cloned save set 2600270829
Successfully deleted original clone 1258093548 of save set 2600270829 from media database.
Successfully deleted AFTD's companion clone 1258093549 of save set 2600270829 from media database with 0 retries.
Successfully deleted original clone 1258094164 of save set 2600270829 from media database.
Recovering space from volume 4294740163 failed with the error 'Cannot access volume 800844S3, please mount the volume or verify its label.'.
Refer to the NetWorker log for details.
6330:nsrstage: Cannot access volume 800844S3, please mount the volume or verify its label.
Completed recover space operation for volume 4177299774
Refer to the NetWorker log for any failures.
Recovering space from volume 4277962971 failed with the error 'Cannot access volume 800845S3, please mount the volume or verify its label.'.
Refer to the NetWorker log for details.
6330:nsrstage: Cannot access volume 800845S3, please mount the volume or verify its label.
Recovering space from volume 16550059 failed with the error 'Cannot access volume 800843S3, please mount the volume or verify its label.'.
Refer to the NetWorker log for details.
6330:nsrstage: Cannot access volume 800843S3, please mount the volume or verify its label.
You’ll note there’s a bunch of output there about being unable to access the clone volumes the saveset was previously cloned to. When we then check mminfo, we see the consequences of the staging operation:
[root@tara ~]# mminfo -q "client=tara.pmdg.lab,name=/etc" -r volume,ssid,cloneid,savetime
volume          ssid        clone id    date
BIG991S3        2600270829  1258095244  11/13/2009
As you can see – no reference to the clone volumes at all!
Now, has the clone data been erased? No, but it has been removed from the media database, meaning you’d have to manually scan the volumes back in before you could use them again. Worse, if those volumes only contained clone data that was subsequently removed from the media database, they may become eligible for recycling and get re-used before you notice what has gone wrong!
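One way to catch this sort of loss early is to compare the mminfo listing taken before a staging run against the one taken afterwards. What follows is only a rough sketch under a few assumptions: both listings were saved to files from mminfo with -r volume,ssid,cloneid (so the first three columns are volume, ssid and clone ID), and the function name lost_instances is purely illustrative, not part of NetWorker.

```shell
#!/bin/sh
# lost_instances BEFORE AFTER   (hypothetical helper, not a NetWorker command)
# BEFORE and AFTER are files holding saved output of:
#   mminfo -q "..." -r volume,ssid,cloneid
# Prints "volume ssid/cloneid" for each instance listed in BEFORE
# that no longer appears in AFTER.
lost_instances() {
    awk '
        # Skip the mminfo header line in either file.
        $1 == "volume" { next }
        # Build the instance key from the ssid and clone ID columns.
        { key = $2 "/" $3 }
        # First file given to awk is AFTER: remember what still exists.
        FILENAME == ARGV[1] { seen[key] = 1; next }
        # Second file is BEFORE: report anything that has vanished.
        !(key in seen) { print $1, key }
    ' "$2" "$1"
}
```

The instance you deliberately staged (and its AFTD companion) will always show up in this report, since staging is meant to remove it. Anything else in the list, such as the three clone volumes in the session above, is an instance you did not intend to lose.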
Wrapping Up
Hopefully the above session will have demonstrated the danger of staging by saveset ID alone. If instead of staging by saveset ID we staged by saveset ID and clone ID, we’d have had a much more desirable outcome. Here’s a (short) example of that:
[root@tara ~]# save -b Default -LL -q /tmp
save: /tmp 2352 KB 00:00:01 67 files
completed savetime=1258094378
[root@tara ~]# mminfo -q "name=/tmp" -r volume,ssid,cloneid
volume          ssid        clone id
Default.001     2583494442  1258094378
Default.001.RO  2583494442  1258094377
[root@tara ~]# nsrclone -b "Default Clone" -S 2583494442
[root@tara ~]# mminfo -q "name=/tmp" -r volume,ssid,cloneid
volume          ssid        clone id
800845S3        2583494442  1258095244
Default.001     2583494442  1258094378
Default.001.RO  2583494442  1258094377
[root@tara ~]# nsrstage -b Big -v -m -S 2583494442/1258094377
Obtaining media database information on server tara.pmdg.lab
Parsing save set id(s)
Migrating the following save sets (ids):
        2583494442
5874:nsrstage: Automatically copying save sets(s) to other volume(s)
Starting migration operation...
5886:nsrstage: Clones were written to the following volume(s):
        BIG991S3
6359:nsrstage: Deleting the successfully cloned save set 2583494442
Successfully deleted original clone 1258094377 of save set 2583494442 from media database.
Successfully deleted AFTD's companion clone 1258094378 of save set 2583494442 from media database with 0 retries.
Completed recover space operation for volume 4177299774
Refer to the NetWorker log for any failures.
[root@tara ~]# mminfo -q "name=/tmp" -r volume,ssid,cloneid
volume          ssid        clone id
800845S3        2583494442  1258095244
BIG991S3        2583494442  1258096324
The recommendation that I always make is that you forget about using saveset IDs alone unless you absolutely have to. Instead, get yourself into the habit of always specifying a particular instance of a saveset via the “ssid/cloneid” form. That way, if you do any manual staging, you won’t wipe out access to data!
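If you script your staging, it may help to build the ssid/cloneid argument from mminfo output rather than typing it by hand. Below is a minimal sketch, assuming the listing comes from mminfo with -r volume,ssid,cloneid and that the read-only side of a disk volume carries the “.RO” suffix seen in the listings above (the example session staged the clone ID listed against Default.001.RO). The function names ro_instances and is_instance are made up for illustration.

```shell
#!/bin/sh
# ro_instances: read "volume ssid cloneid" lines on stdin and print
# ssid/cloneid for every instance that lives on a ".RO" volume.
ro_instances() {
    awk '$1 ~ /\.RO$/ { print $2 "/" $3 }'
}

# is_instance: succeed only if the argument looks like ssid/cloneid,
# i.e. digits, one slash, digits. A bare saveset ID is rejected.
is_instance() {
    case "$1" in
        *[!0-9/]* | */*/* | /* | */ | "") return 1 ;;
        */*) return 0 ;;
        *) return 1 ;;
    esac
}

# Possible use (not run here, since it needs a NetWorker server):
#   mminfo -q "name=/tmp" -r volume,ssid,cloneid | ro_instances |
#   while read -r inst; do
#       is_instance "$inst" && nsrstage -b Big -v -m -S "$inst"
#   done
```

The guard is deliberately strict: anything that is not exactly one saveset ID, a slash and one clone ID is refused, so a script cannot quietly fall back to staging by saveset ID alone.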
VERY relevant to what I’ve been recently working on. Thanks for the detailed info!! Your effort and sharing of this info (and all Networker info really) is greatly appreciated.
Unless I missed it somewhere in this article, you previously addressed this topic in a forum where you also recommended using the clone ID of the read-only volume (“Default.001.RO” in the above example) of an AFTD/adv_file. You use that clone ID in your above example, but don’t point out this recommendation. I wanted to comment on this in case anyone reads or re-reads this article (as I often do) and misses out on this important aspect. Thanks again!!
Preston,
New blog format looks good. In regards to cloning, I’ve been doing a little experimenting since we just rolled out a VTL. I just noticed that nsrclone doesn’t clone the SSIDs in the order you give it. For instance, I would like to clone the oldest savesets on our VTL first so that I can migrate them off.
For instance:
mminfo -ot -q “pool=VTL1,copies=1,!incomplete,savetime output to a file $FILE
Then I will clone the ssids
nohup nsrclone -b "$POOL" -y "$RETENTION" -J "$NODE" -S -f $FILE &
I was surprised to see savesets were being cloned from all sorts of different dates. Not what I wanted.
Hi Brett,
Thanks for the feedback. Over time I’m hoping to make a lot more information and resources for NetWorker available on the website.
I think you’ll find that NetWorker applies its own sorting algorithm to any clone operation involving multiple savesets. This is done to optimise the order in which savesets will be read from tape. In general, there is no way to force NetWorker to clone in a particular order other than doing it one saveset at a time.
I’d need to test, but I suspect if you were to break up by volume, you’d have better results at cloning in the age you want to. Not ideal, but more controllable.
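For what it’s worth, the one-saveset-at-a-time approach could be sketched roughly like this. It assumes your file holds one saveset ID per line, already in the order you want, and it only uses the -b and -S options shown in the article. The clone_in_order name and the NSRCLONE override are hypothetical, the latter being there just so you can dry-run the loop with echo.

```shell
#!/bin/sh
# clone_in_order POOL FILE   (hypothetical helper)
# FILE: one saveset ID per line, sorted in the order you want cloned.
# Set NSRCLONE="echo nsrclone" to print the commands instead of running them.
clone_in_order() {
    pool=$1
    file=$2
    while read -r ssid; do
        # Ignore blank lines in the list.
        [ -n "$ssid" ] || continue
        # One nsrclone per saveset, so the order is the order of the file.
        # stdin is redirected so the command cannot swallow the rest of the list.
        ${NSRCLONE:-nsrclone} -b "$pool" -S "$ssid" < /dev/null || return 1
    done < "$file"
}
```

The trade-off is that you give up NetWorker’s read optimisation, so expect more tape positioning than a single multi-saveset clone would need.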
Cheers,
Preston.
my pseudo code seemed to have been chopped
mminfo -ot -q “pool=VTL1,copies=1,!incomplete,savetime output to a file $FILE
Then I will clone the ssids
nohup nsrclone -b "$POOL" -y "$RETENTION" -J "$NODE" -S -f $FILE &