{"id":1192,"date":"2009-10-16T05:55:32","date_gmt":"2009-10-15T19:55:32","guid":{"rendered":"http:\/\/nsrd.wordpress.com\/?p=1192"},"modified":"2009-10-16T05:55:32","modified_gmt":"2009-10-15T19:55:32","slug":"staging-and-connectivity-loss","status":"publish","type":"post","link":"https:\/\/nsrd.info\/blog\/2009\/10\/16\/staging-and-connectivity-loss\/","title":{"rendered":"Staging and Connectivity Loss"},"content":{"rendered":"<p>For a while now I&#8217;ve been working with EMC support on an issue that&#8217;s only likely to strike sites that have intermittent connectivity between the server and storage nodes <em>and<\/em> that stage from ADV_FILE on the storage node to ADV_FILE on the server.<\/p>\n<p>The crux of the problem is that if you&#8217;re staging from storage node to server <em>and<\/em> comms between the sites are lost for long enough that NetWorker:<\/p>\n<ul>\n<li>Detects the storage node <em>nsrmmd<\/em> processes have failed, <em>and<\/em><\/li>\n<li>Attempts to restart the storage node <em>nsrmmd<\/em> processes, <em>and<\/em><\/li>\n<li>Fails to restart the storage node <em>nsrmmd<\/em> processes<\/li>\n<\/ul>\n<p>Then you can end up in a situation where the staging aborts in an &#8216;interesting&#8217; way. The first hint of the problem is that you&#8217;ll see a message such as the following in your daemon.raw:<\/p>\n<p>68975 10\/15\/2009 09:59:05 AM\u00a0 2 0 0 526402000 4495 0 tara.pmdg.lab nsrmmd <strong>filesys_nuke_ssid<\/strong>: unable to unlink \/backup\/84\/05\/notes\/c452f569-00000006-fed6525c-4ad6525c-00051c00-dfb3d342 on device `\/backup&#8217;: No such file or directory<\/p>\n<p>(The above was rendered for your convenience.)<\/p>\n<p>However, if you look for the cited file, you&#8217;ll find that it doesn&#8217;t exist. That&#8217;s not quite the end of the matter though. Unfortunately, while the saveset file that was being staged <em>didn&#8217;t<\/em> stay on disk, its media database details <em>did<\/em>. 
So to restart staging, you first need to locate the saveset in question and delete the media database entry for the (failed) copy on the server&#8217;s disk backup unit. Interestingly, that entry only ever appears on the RW device, not the RO device:<\/p>\n<pre>[root@tara ~]# mminfo -q \"ssid=c452f569-00000006-fed6525c-4ad6525c-00051c00-dfb3d342\"\n volume\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 client\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 date\u00a0\u00a0\u00a0\u00a0\u00a0 size\u00a0\u00a0 level\u00a0 name\nTara.001\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 fawn\u00a0\u00a0\u00a0\u00a0\u00a0 10\/15\/2009 1287 MB manual\u00a0 \/usr\/share\nFawn.001\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 fawn\u00a0\u00a0\u00a0\u00a0\u00a0 10\/15\/2009 1287 MB manual\u00a0 \/usr\/share\nFawn.001.RO\u00a0\u00a0\u00a0 fawn\u00a0\u00a0\u00a0\u00a0\u00a0 10\/15\/2009 1287 MB manual\u00a0 \/usr\/share<\/pre>\n<p>We had hoped that it was fixed in 7.5.1.5, but my tests aren&#8217;t showing that to be the case. Regardless, it&#8217;s certainly around in 7.4.x as well and (given the nature of it) has quite possibly been around for a while longer than that.<\/p>\n<p>As I said at the outset, this isn&#8217;t likely to affect many sites, but it is something to be aware of.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>For a while now I&#8217;ve been working with EMC support on an issue that&#8217;s only likely to strike sites 
that&hellip;<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":false,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2}},"categories":[11,16],"tags":[228,245,383,668,878,933,941],"class_list":["post-1192","post","type-post","status-publish","format-standard","hentry","category-features","category-networker","tag-cloning","tag-connectivity","tag-filesys_nuke_ssid","tag-nsrmmd","tag-server","tag-staging","tag-storage-node"],"aioseo_notices":[],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_shortlink":"https:\/\/wp.me\/pKpIN-je","jetpack_sharing_enabled":true,"jetpack_likes_enabled":true,"_links":{"self":[{"href":"https:\/\/nsrd.info\/blog\/wp-json\/wp\/v2\/posts\/1192","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/nsrd.info\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/nsrd.info\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/nsrd.info\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/nsrd.info\/blog\/wp-json\/wp\/v2\/comments?post=1192"}],"version-history":[{"count":0,"href":"https:\/\/nsrd.info\/blog\/wp-json\/wp\/v2\/posts\/1192\/revisions"}],"wp:attachment":[{"href":"https:\/\/nsrd.info\/blog\/wp-json\/wp\/v2\/media?parent=1192"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/nsrd.info\/blog\/wp-json\/wp\/v2\/categories?post=1192"},{"taxonomy":"po
st_tag","embeddable":true,"href":"https:\/\/nsrd.info\/blog\/wp-json\/wp\/v2\/tags?post=1192"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}