{"id":2471,"date":"2010-09-11T16:39:50","date_gmt":"2010-09-11T06:39:50","guid":{"rendered":"http:\/\/nsrd.info\/blog\/?p=2471"},"modified":"2018-12-11T18:32:44","modified_gmt":"2018-12-11T08:32:44","slug":"backup-to-disk-and-busy-state-staging","status":"publish","type":"post","link":"https:\/\/nsrd.info\/blog\/2010\/09\/11\/backup-to-disk-and-busy-state-staging\/","title":{"rendered":"Backup to Disk and Busy State Staging"},"content":{"rendered":"<p>Backup to disk has well and truly become entrenched as a core backup strategy in most companies. By &#8220;backup to disk&#8221; I&#8217;m referring to <em>either<\/em> ADV_FILE devices <em>or<\/em> VTLs \u2013 i.e., the general notion of backing up first to disk. For the rest of the article, since I&#8217;m feeling a little lazy today, I&#8217;ll follow the industry norm and call backup to disk by the generic &#8220;B2D&#8221;.<\/p>\n<p>Now, in most companies, there&#8217;ll still be physical tape involved. Holding long-term backups on sufficiently replicated storage \u2013 even with deduplication \u2013 is going to remain costly for some time to come; but once B2D appears within an organisation, one of two architecture decisions will typically be made:<\/p>\n<ol>\n<li>B2D region designed to hold a &#8220;significant&#8221; nearline capacity, where &#8220;significant&#8221; refers to a business-appropriate amount of recent backups.<\/li>\n<li>B2D region designed as a &#8220;staging&#8221; region to have <em>just enough<\/em> capacity, where &#8220;just enough&#8221; means that if data isn&#8217;t staged daily (or near-daily), staging areas will become full and backups will stop.<\/li>\n<\/ol>\n<p>Having observed B2D regions designed as staging-only on several occasions now, I&#8217;m even more firmly convinced that B2D as staging is a false economy that fails to take into consideration a few key metrics. 
Sure, buying, say, 5TB or 10TB of disk is cheaper than buying 40TB with deduplication, but the cost of storage doesn&#8217;t end with the purchase. In fact, since the actual dollar cost of storage is typically amortised out over its expected deployment time, that cost often ends up being pretty minimal.<\/p>\n<p>There are three distinct costs that I see as evident when using B2D purely as a staging region. These are:<\/p>\n<ul>\n<li>Staff time.<\/li>\n<li>Physical wear and tear.<\/li>\n<li>Increased risk of recovery failure.<\/li>\n<\/ul>\n<p>Before I go further, I want to cover a term I used in the title of this post: &#8220;busy state staging&#8221;. It refers to environments where a significant portion of each day is spent with the B2D region being used to stage out from disk to physical tape, so as to free up room. There are probably four key activities a backup system can be doing at any one time. These are:<\/p>\n<ul>\n<li>Backup<\/li>\n<li>Recovery<\/li>\n<li>Duplication\/Cloning<\/li>\n<li>Maintenance<\/li>\n<\/ul>\n<p>Backup, recovery and cloning are all givens; maintenance functions encompass media import\/export\/labelling and configuration activities, and most definitely include staging. That&#8217;s right \u2013 staging is not any of backup, recovery or cloning; it falls into the category of moving data around in order to keep the system running. It&#8217;s effectively an overhead function for the environment, and as we know, the aim in any environment is to keep overheads to a minimum.<\/p>\n<p>Over the expected deployment period of the B2D region in a backup system, I&#8217;d argue that those three costs previously cited add up to enough to demonstrate that the vast majority of businesses should <em>not<\/em> deploy B2D in a staging-only configuration. Let&#8217;s consider each of them individually.<\/p>\n<h3>Staff Time<\/h3>\n<p>This is the easiest to factor in. 
Let&#8217;s say your backup administrator has to spend roughly an hour a day monitoring and maintaining free capacity on a staging-only B2D region. Now add up those hours per day, per week, per year across the lifetime of a deployment, and see how much it represents based on the hourly rate of the backup administrator. Assume $40 per hour and 4 weeks of annual leave a year. That leaves 48 weeks at 5 hours per week and $40 an hour. That&#8217;s $9,600 per year of staff costs through managing a poorly provisioned B2D region.<\/p>\n<p>Usually that&#8217;s not the final cost in staff time, though \u2013 my personal experience is that environments that use B2D for staging are more likely to need to engage temporary contractors, etc., to fill in on other projects in the company that systems administration staff don&#8217;t have time for. So let&#8217;s assume that as a result of the backup administrator having to focus on B2D staging an hour a day the organisation has to engage a contractor one week a year to make up the short-fall. Assuming a contracting rate of $80 per hour and a 40-hour week, that&#8217;s $3,200 per year.<\/p>\n<p>Now, assuming B2D storage has been provisioned for a 3-year period, we&#8217;re adding $38,400 ($12,800 per year) to the maintenance impact of a staging-only region.<\/p>\n<p>My gut feel, by the way, is that in an appropriately provisioned B2D architecture, the backup administrator will spend at most one fifth of the time in B2D storage administration; and there won&#8217;t be a need to engage contractors <em>for that reason<\/em>. So that $38,400 cost would shrink to, say, $5,760 of time (one fifth of $9,600, over 3 years). In anyone&#8217;s books, a saving of around 85% is a good one.<\/p>\n<h3>Physical Wear and Tear<\/h3>\n<p>We&#8217;d count ourselves lucky if the only impact of using B2D in a staging configuration were staff costs. There&#8217;s more though. 
The wear and tear on both physical media and physical tape drives will be significantly increased, as these units will be running more frequently. Not only that, rather than having a reduced priority, the service time on physical tape is almost as critical as in a tape-only environment. The net consequence is that rather than being able to, say, work with a next-day service contract for the physical tape libraries, organisations are forced to stick with a 4-hour same-day response contract. As we know, there&#8217;s usually a pretty significant price difference between these types of contracts!<\/p>\n<h3>Increased Risk of Recovery Failure<\/h3>\n<p>We&#8217;d equally count ourselves lucky if the only impacts of using B2D in a staging-only configuration were just staff time and increased maintenance costs. The real insidious cost though is the risk of a recovery failure. In this, I&#8217;m not referring to any limitations that may exist around simultaneously recovering from B2D media while staging\/cloning from it. What I&#8217;m referring to is the risk that a backup may not actually run in the first place because a staging region becomes full, blocking new sessions from starting. When considered from a backup perspective, that may not sound like a lot. Turning it around to the purpose of a backup: imagine the consequence, though, if data that was never backed up is needed for a recovery. It may be logical to say &#8220;if it can&#8217;t be backed up, then we can&#8217;t factor it into recovery requirements&#8221;, but disasters, emergencies and auditors do not come when it&#8217;s convenient for us.<\/p>\n<p>With this in mind, any backup that fails to run because a staging area is full should be assessed in terms of the full impact of a recovery SLA being breached for that data. 
That may sound harsh, but I&#8217;d actually suggest it&#8217;s a more business-focused rather than IT-focused approach to backup.<\/p>\n<h2>How&#8217;s that busy-state staging sounding now?<\/h2>\n<p>Enterprise data protection is one of those areas where businesses are most tempted to do cost cutting. We see it with Icarus support contracts, with inappropriate coupling of services, and we see it with B2D staging areas. We can intuit with almost no effort that busy state staging isn&#8217;t the best backup model. If your system is busy 20 hours a day between backup, cloning and maintenance functions, then it&#8217;s obvious that there&#8217;s at least an increased risk of parts failure; but the cost of the architecture is also magnified by wasted staff time, increased maintenance contract costs, and the potential failure to facilitate business-required recoveries.<\/p>\n<p>When we take all those things into consideration, architecting B2D for significant or at least appropriate nearline recovery purposes rather than just staging becomes the <em>cheaper<\/em> option.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Backup to disk has well and truly become entrenched as a core backup strategy in most companies. 
By &#8220;backup to&hellip;<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":false,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2}},"categories":[3,5,16],"tags":[136,153,933],"class_list":["post-2471","post","type-post","status-publish","format-standard","hentry","category-architecture","category-backup-theory","category-networker","tag-b2d","tag-backup-to-disk","tag-staging"],"aioseo_notices":[],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_shortlink":"https:\/\/wp.me\/pKpIN-DR","jetpack_sharing_enabled":true,"jetpack_likes_enabled":true,"_links":{"self":[{"href":"https:\/\/nsrd.info\/blog\/wp-json\/wp\/v2\/posts\/2471","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/nsrd.info\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/nsrd.info\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/nsrd.info\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/nsrd.info\/blog\/wp-json\/wp\/v2\/comments?post=2471"}],"version-history":[{"count":1,"href":"https:\/\/nsrd.info\/blog\/wp-json\/wp\/v2\/posts\/2471\/revisions"}],"predecessor-version":[{"id":7543,"href":"https:\/\/nsrd.info\/blog\/wp-json\/wp\/v2\/posts\/2471\/revisions\/7543"}],"wp:attachment":[{"href":"https:\/\/nsrd.info\/blog\/wp-json\/wp\/v2\/media?parent=2471"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:
\/\/nsrd.info\/blog\/wp-json\/wp\/v2\/categories?post=2471"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/nsrd.info\/blog\/wp-json\/wp\/v2\/tags?post=2471"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}