{"id":1924,"date":"2010-03-03T09:37:21","date_gmt":"2010-03-02T23:37:21","guid":{"rendered":"http:\/\/nsrd.info\/blog\/?p=1924"},"modified":"2010-03-03T09:37:21","modified_gmt":"2010-03-02T23:37:21","slug":"adv_file-devices-and-tape-rotation-strategies","status":"publish","type":"post","link":"https:\/\/nsrd.info\/blog\/2010\/03\/03\/adv_file-devices-and-tape-rotation-strategies\/","title":{"rendered":"ADV_FILE devices and tape rotation strategies"},"content":{"rendered":"<p>While I touched on this in the second blog posting I made (<a title=\"Instantiating Savesets\" href=\"https:\/\/nsrd.info\/blog\/2009\/01\/25\/instantiating-savesets\/\" target=\"_blank\">Instantiating Savesets<\/a>), it&#8217;s worthwhile revisiting this topic more directly.<\/p>\n<p>Using ADV_FILE devices can play havoc with conventional tape rotation strategies; if you aren&#8217;t aware of these implications, it could cause operational challenges when it comes time to do recovery from tape. Let&#8217;s look at the lifecycle of a saveset in a disk backup environment where a conventional setup is used. It typically runs like this:<\/p>\n<ol>\n<li>Backup to disk<\/li>\n<li>Clone to tape<\/li>\n<li>(Later) Stage to tape<\/li>\n<li>(At rest) 2 copies on tape<\/li>\n<\/ol>\n<p>Looking at each stage of this, we have:<\/p>\n<p><a href=\"https:\/\/nsrd.info\/blog\/wp-content\/uploads\/2010\/03\/r1_adv_file.jpg\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-1933\" title=\"Saveset on ADV_FILE device\" src=\"https:\/\/nsrd.info\/blog\/wp-content\/uploads\/2010\/03\/r1_adv_file.jpg\" alt=\"Saveset on ADV_FILE device\" width=\"268\" height=\"179\" \/><\/a>The saveset, once written to an ADV_FILE volume, has two instances. The instance recorded as being on the read-only part of the volume will have an SSID\/CloneID of X\/Y. The instance recorded as being on the read-write part of the volume will have an SSID\/CloneID of X\/Y+1. 
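<\/p>\n<p>As a quick illustration (the saveset ID shown here is hypothetical), both recorded instances can be listed via <em>mminfo<\/em>:<\/p>\n<pre># mminfo -q \"ssid=4025783911\" -r \"ssid,cloneid,volume\"<\/pre>\n<p>The same SSID should be reported twice, with the lower CloneID being the one recorded against the read-only side of the volume.<\/p>\n<p>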
This higher CloneID is what causes NetWorker, upon a recovery request, to seek the &#8220;instance&#8221; on the read-only volume. Of course, there&#8217;s only one actual instance (which is why I <a title=\"Earth to EMC NetWorker Engineering: Wake Up\" href=\"https:\/\/nsrd.info\/blog\/2010\/01\/23\/earth-to-emc-networker-engineering-wake-up\/\" target=\"_blank\">object so strongly to the &#8216;validcopies&#8217; field<\/a> introduced in 7.6 reporting 2) \u2013 the two instances reported are &#8220;smoke and mirrors&#8221; to allow simultaneous backup to and recovery from an ADV_FILE volume.<\/p>\n<p>The next stage sees the saveset cloned:<\/p>\n<p><a href=\"https:\/\/nsrd.info\/blog\/wp-content\/uploads\/2010\/03\/r2_adv_file_clone.jpg\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-1934\" title=\"ADV_FILE + Tape Clone\" src=\"https:\/\/nsrd.info\/blog\/wp-content\/uploads\/2010\/03\/r2_adv_file_clone.jpg\" alt=\"ADV_FILE + Tape Clone\" width=\"453\" height=\"179\" srcset=\"https:\/\/nsrd.info\/blog\/wp-content\/uploads\/2010\/03\/r2_adv_file_clone.jpg 453w, https:\/\/nsrd.info\/blog\/wp-content\/uploads\/2010\/03\/r2_adv_file_clone-300x118.jpg 300w\" sizes=\"auto, (max-width: 453px) 100vw, 453px\" \/><\/a>This leaves us with 3 &#8216;instances&#8217; &#8211; 2 physical, one virtual. 
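<\/p>\n<p>Such a clone would typically be generated with <em>nsrclone<\/em> \u2013 the pool name and saveset ID below are purely illustrative:<\/p>\n<pre># nsrclone -b \"Offsite Clone\" -S 4025783911<\/pre>\n<p>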
Our SSID\/CloneIDs are:<\/p>\n<ul>\n<li>ADV_FILE read-only: <strong>X\/Y<\/strong><\/li>\n<li>ADV_FILE read-write: <strong>X\/Y+1<\/strong><\/li>\n<li>Tape: <strong>X\/Y+<em>n<\/em><\/strong>, where <em>n<\/em> &gt; 1.<\/li>\n<\/ul>\n<p>At this point, any recovery request will <em>still<\/em> call for the instance on the read-only part of the ADV_FILE volume, so as to help ensure the fastest recovery initiation.<\/p>\n<p>At some future point, as disk capacity starts to run out on the ADV_FILE device, the saveset will typically be staged out:<\/p>\n<p><a href=\"https:\/\/nsrd.info\/blog\/wp-content\/uploads\/2010\/03\/r3_adv_file_stage.jpg\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-1935\" title=\"ADV_FILE staging to tape\" src=\"https:\/\/nsrd.info\/blog\/wp-content\/uploads\/2010\/03\/r3_adv_file_stage.jpg\" alt=\"ADV_FILE staging to tape\" width=\"466\" height=\"179\" srcset=\"https:\/\/nsrd.info\/blog\/wp-content\/uploads\/2010\/03\/r3_adv_file_stage.jpg 466w, https:\/\/nsrd.info\/blog\/wp-content\/uploads\/2010\/03\/r3_adv_file_stage-300x115.jpg 300w\" sizes=\"auto, (max-width: 466px) 100vw, 466px\" \/><\/a>At the conclusion of the staging operation, the physical + virtual instances of the saveset on the ADV_FILE device are removed, leaving us with:<\/p>\n<p><a href=\"https:\/\/nsrd.info\/blog\/wp-content\/uploads\/2010\/03\/r4_tape_tape.jpg\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-1936\" title=\"Savesets on tape only\" src=\"https:\/\/nsrd.info\/blog\/wp-content\/uploads\/2010\/03\/r4_tape_tape.jpg\" alt=\"Savesets on tape only\" width=\"453\" height=\"132\" srcset=\"https:\/\/nsrd.info\/blog\/wp-content\/uploads\/2010\/03\/r4_tape_tape.jpg 453w, https:\/\/nsrd.info\/blog\/wp-content\/uploads\/2010\/03\/r4_tape_tape-300x87.jpg 300w\" sizes=\"auto, (max-width: 453px) 100vw, 453px\" \/><\/a><\/p>\n<p>So, at this point, we end up with:<\/p>\n<ul>\n<li>A saveset instance on a 
<em>clone<\/em> volume with SSID\/CloneID of: <strong>X\/Y+<em>n<\/em><\/strong>.<\/li>\n<li>A saveset instance on (typically) a non-clone volume with SSID\/CloneID of: <strong>X\/Y+<em>n<\/em>+<em>m<\/em><\/strong>, where <em>m<\/em> &gt; 0.<\/li>\n<\/ul>\n<p>So, where does this leave us? (Or if you&#8217;re not sure where I&#8217;ve been heading yet, you may be wondering what point I&#8217;m actually trying to make.)<\/p>\n<p>Note what I&#8217;ve been saying each time \u2013 NetWorker, when it needs to read from a saveset for recovery purposes, will want to pick the saveset instance with the <em>lowest<\/em> CloneID. At the point where we&#8217;ve got a clone copy and a staged copy, both on tape, the <em>clone<\/em> copy will have the lowest CloneID.<\/p>\n<p>The net result is that NetWorker will, in these circumstances, when both tapes aren&#8217;t online, request the <em>clone<\/em> volume for recovery \u2013 even though in a large number of cases, this will be the volume that&#8217;s offsite.<\/p>\n<p>For NetWorker versions 7.3.1 and lower, there was only one solution to this \u2013 you had to hunt down the actual clone saveset instances NetWorker was asking for, mark them as suspect, and reattempt the recovery. If you managed to mark them all as suspect, then you&#8217;d be able to &#8216;force&#8217; NetWorker into facilitating the recovery from the volume(s) that had been staged to. However, after the recovery you had to make sure you backed out of those changes, so that both the clones and the staged copies would be considered not-suspect.<\/p>\n<p>Some companies, in this situation, would instigate a tape rotation policy such that clone volumes would be brought back from off-site <em>before<\/em> savesets were likely to be staged out, with subsequently staged media sent offsite. 
This has a dangerous side-effect of temporarily leaving <em>all<\/em> copies of backups on-site, jeopardising disaster recovery situations, and hence it&#8217;s something that I couldn&#8217;t in any way recommend.<\/p>\n<p>The solution introduced around 7.3.2, however, is far simpler \u2013 an mminfo flag called <em>offsite<\/em>. This isn&#8217;t to be confused with the convention of setting a volume location field to &#8216;offsite&#8217; when the media is removed from site. Annoyingly, this remains unqueryable; you can set it, and NetWorker will use it, but you can&#8217;t, say, search for volumes with the &#8216;offsite&#8217; flag set.<\/p>\n<p>The offsite flag has to be manually set, using the command:<\/p>\n<pre># nsrmm -o offsite volumeName<\/pre>\n<p>(where <em>volumeName<\/em> typically equals the barcode).<\/p>\n<p>Once this is set, then NetWorker&#8217;s standard saveset (and therefore volume) selection criteria are subtly adjusted. Normally, if there are no online instances of a saveset, NetWorker will request the saveset with the lowest CloneID. However, saveset instances that are on volumes with the <em>offsite<\/em> flag set <em>will be deemed ineligible<\/em> and NetWorker will look for a saveset instance that isn&#8217;t flagged as being offsite.<\/p>\n<p>The net result is that when following a traditional backup model with ADV_FILE disk backup (backup to disk, clone to tape, stage to tape), it&#8217;s very important that tape offsiting procedures be adjusted to set the <em>offsite<\/em> flag on clone volumes as they&#8217;re removed from the system.<\/p>\n<p>The good news is that you don&#8217;t normally have to do anything when it&#8217;s time to pull the tape back onsite. The flag is automatically cleared* for a volume as soon as it&#8217;s put back into an autochanger and detected by NetWorker. 
So when the media is recycled, the flag will be cleared.<\/p>\n<p>If you come from a long-term NetWorker site and the convention is still to mark savesets as suspect in this sort of recovery scenario, I&#8217;d suggest that you update your tape rotation policies to instead use the <em>offsite<\/em> flag. If on the other hand, you&#8217;re about to implement an ADV_FILE based backup to disk policy, I&#8217;d strongly recommend you plan in advance to configure a tape rotation policy that uses the <em>offsite<\/em> flag as cloned media is sent away from the primary site.<\/p>\n<p>&#8212;<br \/>\n* If you did need to explicitly clear the flag, you can run:<\/p>\n<pre># nsrmm -o notoffsite volumeName<\/pre>\n<p>Which would turn the flag back off for the given <em>volumeName<\/em>.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>While I touched on this in the second blog posting I made (Instantiating Savesets), it&#8217;s worthwhile revisiting this topic more&hellip;<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":false,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2}},"categories":[3,16,17,20],"tags":[102,226,656,677,932,984],"class_list":["post-1924","post","type-post","status-publish","format-standard","hentry","category-architecture","category-networker","category-policies","category-scripting","tag-adv_file","tag-clone","tag-nsrclone","tag-nsrstage","tag-stage","tag-tape-rotation"],"aioseo_notices":
[],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_shortlink":"https:\/\/wp.me\/pKpIN-v2","jetpack_sharing_enabled":true,"jetpack_likes_enabled":true,"_links":{"self":[{"href":"https:\/\/nsrd.info\/blog\/wp-json\/wp\/v2\/posts\/1924","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/nsrd.info\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/nsrd.info\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/nsrd.info\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/nsrd.info\/blog\/wp-json\/wp\/v2\/comments?post=1924"}],"version-history":[{"count":0,"href":"https:\/\/nsrd.info\/blog\/wp-json\/wp\/v2\/posts\/1924\/revisions"}],"wp:attachment":[{"href":"https:\/\/nsrd.info\/blog\/wp-json\/wp\/v2\/media?parent=1924"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/nsrd.info\/blog\/wp-json\/wp\/v2\/categories?post=1924"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/nsrd.info\/blog\/wp-json\/wp\/v2\/tags?post=1924"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}