{"id":5885,"date":"2016-05-24T05:16:10","date_gmt":"2016-05-23T19:16:10","guid":{"rendered":"http:\/\/nsrd.info\/blog\/?p=5885"},"modified":"2018-12-11T10:55:02","modified_gmt":"2018-12-11T00:55:02","slug":"how-many-copies-do-i-need","status":"publish","type":"post","link":"https:\/\/nsrd.info\/blog\/2016\/05\/24\/how-many-copies-do-i-need\/","title":{"rendered":"How many copies do I need?"},"content":{"rendered":"<p>So you&#8217;ve got your primary&nbsp;data stored on one array and it replicates to another array. How many backup copies do you need?<\/p>\n<p><a href=\"https:\/\/nsrd.info\/blog\/2016\/05\/24\/how-many-copies-do-i-need\/bigstock-duplicated-people\/\" rel=\"attachment wp-att-5886\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-5886\" src=\"https:\/\/nsrd.info\/blog\/wp-content\/uploads\/2016\/05\/bigStock-Duplicated-People.jpg\" alt=\"Copies\" width=\"900\" height=\"600\" srcset=\"https:\/\/nsrd.info\/blog\/wp-content\/uploads\/2016\/05\/bigStock-Duplicated-People.jpg 900w, https:\/\/nsrd.info\/blog\/wp-content\/uploads\/2016\/05\/bigStock-Duplicated-People-300x200.jpg 300w, https:\/\/nsrd.info\/blog\/wp-content\/uploads\/2016\/05\/bigStock-Duplicated-People-768x512.jpg 768w\" sizes=\"auto, (max-width: 900px) 100vw, 900px\" \/><\/a><\/p>\n<p>There&#8217;s no doubt we&#8217;re spawning more and more copies and pseudo-copies of our data. So much so that EMC&#8217;s new&nbsp;Enterprise Copy Data&nbsp;Management (eCDM) product was&nbsp;announced at EMC World. 
(For details on that, <a href=\"http:\/\/virtualgeek.typepad.com\/virtual_geek\/2016\/05\/emc-world-2016-emc-copy-data-management.html\" target=\"_blank\">check out Chad&#8217;s blog&nbsp;here<\/a>.)<\/p>\n<p>With many production data sets spawning anywhere between 4 and 10 copies, and sometimes a lot more,&nbsp;a question that gets asked&nbsp;from time to time is:&nbsp;<em>why&nbsp;would I need to duplicate my backups?<\/em><\/p>\n<p>It seems a fair question if you&#8217;re using&nbsp;array to array replication, but let&#8217;s stop for a moment and&nbsp;think about the different types of data protection being applied in this scenario:<\/p>\n<p><a href=\"https:\/\/nsrd.info\/blog\/2016\/05\/24\/how-many-copies-do-i-need\/replication-without-cloning\/\" rel=\"attachment wp-att-5888\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-large wp-image-5888\" src=\"https:\/\/nsrd.info\/blog\/wp-content\/uploads\/2016\/05\/Replication-without-Cloning-1024x738.jpg\" alt=\"Replication without Cloning\" width=\"695\" height=\"501\" srcset=\"https:\/\/nsrd.info\/blog\/wp-content\/uploads\/2016\/05\/Replication-without-Cloning-1024x738.jpg 1024w, https:\/\/nsrd.info\/blog\/wp-content\/uploads\/2016\/05\/Replication-without-Cloning-300x216.jpg 300w, https:\/\/nsrd.info\/blog\/wp-content\/uploads\/2016\/05\/Replication-without-Cloning-768x554.jpg 768w\" sizes=\"auto, (max-width: 695px) 100vw, 695px\" \/><\/a><\/p>\n<p>Let&#8217;s say we&#8217;ve got two&nbsp;sites, production and disaster recovery, and&nbsp;for the sake of simplicity, a single SAN at each site. The two SANs replicate between one another. Backups are taken&nbsp;at one of the sites \u2013 in this&nbsp;example, the production site. 
There&#8217;s no duplication of&nbsp;the backups.<\/p>\n<p>Replication is definitely a form of data protection, but&nbsp;its primary purpose is to provide a degree of fault tolerance \u2013 not true fault tolerance of course (that requires more effort), but the idea is that if the&nbsp;primary array is destroyed, there&#8217;s a copy of the data on&nbsp;the secondary array and it can take over&nbsp;production functions.&nbsp;Replication can also factor into maintenance activities \u2013 if you need&nbsp;to repair, update or even replace the primary array, you can fail over operations to&nbsp;the secondary array, work on&nbsp;the primary,&nbsp;then fail back when&nbsp;you&#8217;re ready.<\/p>\n<p>In the world of backups, however, there&#8217;s&nbsp;an old saying: nothing corrupts faster than a mirror. The same applies to replication&#8230;<\/p>\n<blockquote><p>&#8220;Ahah!&#8221;, some interject at this point, &#8220;What if&nbsp;the replication is asynchronous? That means&nbsp;if&nbsp;corruption happens in&nbsp;the source array we can turn off replication between the arrays! Problem solved!&#8221;<\/p>\n<p>Over a decade ago I met an IT manager who felt the response to a virus infecting his network would be to have an operator run into the computer room and use an axe to quickly chop all the network connections away from&nbsp;the core switches.&nbsp;That&nbsp;might actually be more successful than relying on&nbsp;<em>noticing<\/em> corruption ahead of asynchronous replication windows and disconnecting replication links.<\/p><\/blockquote>\n<p>So if there&#8217;s&nbsp;corruption in the primary&nbsp;array that infects the secondary array \u2013&nbsp;that&#8217;s no cause for concern, right? After all there&#8217;s a backup&nbsp;copy&nbsp;sitting there waiting and ready to be used. 
The answer is simple \u2013 replication&nbsp;isn&#8217;t just for minor types of fault tolerance or being able to switch production during maintenance operations, it&#8217;s also for those really bad disasters, such as something taking out your datacentre.<\/p>\n<p>At this point&nbsp;it&#8217;s common to&nbsp;&#8216;solve&#8217; the problem by moving the backups onto the secondary site (even if they run&nbsp;cross-site), creating a configuration like the following:<\/p>\n<p><a href=\"https:\/\/nsrd.info\/blog\/2016\/05\/24\/how-many-copies-do-i-need\/replication-cross-site-backup\/\" rel=\"attachment wp-att-5889\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-large wp-image-5889\" src=\"https:\/\/nsrd.info\/blog\/wp-content\/uploads\/2016\/05\/Replication-cross-site-backup-1024x738.jpg\" alt=\"Replication, cross site backup\" width=\"695\" height=\"501\" srcset=\"https:\/\/nsrd.info\/blog\/wp-content\/uploads\/2016\/05\/Replication-cross-site-backup-1024x738.jpg 1024w, https:\/\/nsrd.info\/blog\/wp-content\/uploads\/2016\/05\/Replication-cross-site-backup-300x216.jpg 300w, https:\/\/nsrd.info\/blog\/wp-content\/uploads\/2016\/05\/Replication-cross-site-backup-768x554.jpg 768w\" sizes=\"auto, (max-width: 695px) 100vw, 695px\" \/><\/a><\/p>\n<p>The thinking goes like this: if there&#8217;s a disaster at the primary site, the disaster recovery site not only takes over,&nbsp;but all our backups are there waiting to be used. If there&#8217;s a disaster at the disaster recovery site instead, then&nbsp;no data has been lost because all the&nbsp;data is still sitting on the production array.<\/p>\n<p>Well, in only one very special circumstance:&nbsp;if you only need to keep backups for one day.<\/p>\n<p>Backups&nbsp;typically offer reasonably poor RPO and RTO compared to things like replication, continuous data protection, continuous availability, snapshots, etc. 
But they&nbsp;<em>do<\/em> offer historical&nbsp;recoverability often essential to meet compliance requirements.&nbsp;Having to provide a modicum of recoverability for 7&nbsp;years is practically&nbsp;the default these days \u2013 medical organisations typically have to retain data for the life of the patient, engineering companies for the lifespan of the&nbsp;construction, and so on. That&#8217;s not&nbsp;<em>all<\/em> backups of course \u2013 depending on your industry you&#8217;ll likely&nbsp;generate your long term backups either from your&nbsp;<em>monthlies<\/em> or your&nbsp;<em>yearlies<\/em>.<\/p>\n<blockquote><p><strong>Aside:&nbsp;<\/strong>The use of backups to facilitate long term retention is&nbsp;a discussion that&#8217;s been running for the 20 years I&#8217;ve been working in data protection, and that will&nbsp;still be going in a decade or more. There are strong, valid arguments for&nbsp;using archive to achieve long term retention, but archive requires a data management policy, something many companies struggle with. Storage got cheap and the perceived cost of doing archive created a strong sense of apathy that we&#8217;re still dealing with today. Do I agree with&nbsp;that apathy? 
No, but I still have to deal with the reality of the situation.<\/p><\/blockquote>\n<p>So let&#8217;s revisit the failure&nbsp;scenarios that can happen with&nbsp;off-site backups but no backup duplication:<\/p>\n<ul>\n<li>If there&#8217;s a disaster&nbsp;at&nbsp;the primary site, the disaster recovery site takes over, and all backups are&nbsp;preserved<\/li>\n<li>If there&#8217;s a disaster at the secondary site, the primary site is unaffected but&nbsp;the production replica data&nbsp;<em>and<\/em>&nbsp;all backups are lost: short term operational recovery backups and longer term compliance\/legal retention backups<\/li>\n<\/ul>\n<p>Is that a risk worth taking?&nbsp;I had a friend&nbsp;move&nbsp;interstate recently.&nbsp;The day after he moved in, his&nbsp;neighbour&#8217;s house burnt&nbsp;down. The fire spread to his house and destroyed most of his possessions. He&#8217;d been planning on getting his contents insurance updated the day of the fire.<\/p>\n<p><em>Bad things happen<\/em>. 
Taking the&nbsp;<em>risk<\/em> that you won&#8217;t lose your secondary site isn&#8217;t really operational planning, it&#8217;s casting your fate to the winds and relying on luck.&nbsp;The solution below though&nbsp;doesn&#8217;t rely on luck at all:<\/p>\n<p><a href=\"https:\/\/nsrd.info\/blog\/2016\/05\/24\/how-many-copies-do-i-need\/replication-and-duplicated-backups\/\" rel=\"attachment wp-att-5890\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-large wp-image-5890\" src=\"https:\/\/nsrd.info\/blog\/wp-content\/uploads\/2016\/05\/Replication-and-Duplicated-Backups-1024x738.jpg\" alt=\"Replication and Duplicated Backups\" width=\"695\" height=\"501\" srcset=\"https:\/\/nsrd.info\/blog\/wp-content\/uploads\/2016\/05\/Replication-and-Duplicated-Backups-1024x738.jpg 1024w, https:\/\/nsrd.info\/blog\/wp-content\/uploads\/2016\/05\/Replication-and-Duplicated-Backups-300x216.jpg 300w, https:\/\/nsrd.info\/blog\/wp-content\/uploads\/2016\/05\/Replication-and-Duplicated-Backups-768x554.jpg 768w\" sizes=\"auto, (max-width: 695px) 100vw, 695px\" \/><\/a><\/p>\n<p>There&#8217;s undoubtedly a cost&nbsp;involved; each copy of your data has a tangible cost regardless of whether that&#8217;s a primary copy or a secondary copy. Are there some backups you&nbsp;<em>won&#8217;t<\/em> copy? That depends&nbsp;on your requirements: there may, for instance, be test systems&nbsp;you need to back up that don&#8217;t need a secondary copy. Such decisions&nbsp;still have to be made on a&nbsp;<em>risk vs cost<\/em> basis.<\/p>\n<p>Replication is all well and good,&nbsp;but it&#8217;s not a get-out-of-gaol card for&nbsp;avoiding cloned backups.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>So you&#8217;ve got your primary&nbsp;data stored on one array and it replicates to another array. 
How many backup copies do&hellip;<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":true,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2}},"categories":[3,5,8],"tags":[256,282,348,767],"class_list":["post-5885","post","type-post","status-publish","format-standard","hentry","category-architecture","category-backup-theory","category-data-loss","tag-copies","tag-data-protection","tag-duplication","tag-protection"],"aioseo_notices":[],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_shortlink":"https:\/\/wp.me\/pKpIN-1wV","jetpack_sharing_enabled":true,"jetpack_likes_enabled":true,"_links":{"self":[{"href":"https:\/\/nsrd.info\/blog\/wp-json\/wp\/v2\/posts\/5885","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/nsrd.info\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/nsrd.info\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/nsrd.info\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/nsrd.info\/blog\/wp-json\/wp\/v2\/comments?post=5885"}],"version-history":[{"count":9,"href":"https:\/\/nsrd.info\/blog\/wp-json\/wp\/v2\/posts\/5885\/revisions"}],"predecessor-version":[{"id":7412,"href":"https:\/\/nsrd.info\/blog\/wp-json\/wp\/v2\/posts\/5885\/revisions\/7412"}],"wp:attachment":[{"href":"https:\/\/nsrd.info\/blog\/wp-json\/wp\/v2\/media?parent=5885"}],"wp:term":[{"taxonomy":"category
","embeddable":true,"href":"https:\/\/nsrd.info\/blog\/wp-json\/wp\/v2\/categories?post=5885"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/nsrd.info\/blog\/wp-json\/wp\/v2\/tags?post=5885"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}