{"id":6186,"date":"2017-03-22T19:06:18","date_gmt":"2017-03-22T09:06:18","guid":{"rendered":"http:\/\/nsrd.info\/blog\/?p=6186"},"modified":"2018-12-11T09:59:44","modified_gmt":"2018-12-10T23:59:44","slug":"what-queen-can-teach-us-about-long-term-retention","status":"publish","type":"post","link":"https:\/\/nsrd.info\/blog\/2017\/03\/22\/what-queen-can-teach-us-about-long-term-retention\/","title":{"rendered":"What Queen can teach us about long term retention"},"content":{"rendered":"<p>It&#8217;s fair to say I&#8217;m a big fan of Queen.&nbsp;They shaped my life \u2013 the only band to have&nbsp;even a remotely similar effect on me was ELO. (Yes, I&#8217;m an&nbsp;Electric Light Orchestra fan. Seriously, if you haven&#8217;t listened to&nbsp;the Eldorado or Time operatic&nbsp;albums in the dark you haven&#8217;t lived.)<\/p>\n<p>Queen taught me a lot: <a href=\"https:\/\/www.youtube.com\/watch?v=kE8kGMfXaFU\" target=\"_blank\">the emotional perils of travelling at&nbsp;near-relativistic speeds and returning home<\/a>, that <a href=\"https:\/\/www.youtube.com\/watch?v=_Jtpf8N5IDE\" target=\"_blank\">maybe&nbsp;immortality isn&#8217;t&nbsp;what&nbsp;fantasy&nbsp;makes it seem like<\/a>, and, amongst a great many other things, that <a href=\"https:\/\/www.youtube.com\/watch?v=uyd6OLyhPJo&amp;list=RDuyd6OLyhPJo\" target=\"_blank\">you need to take a big leap from time to time to avoid getting stuck in a rut<\/a>.<\/p>\n<p>But you can find more prosaic meanings in Queen, too, if you want to. One of them deals with long term retention. 
We get <em>that<\/em> lesson from one of the choruses for&nbsp;<em>Too much love will kill you<\/em>:<\/p>\n<blockquote><p>Too much love will kill you,<\/p>\n<p>Just as sure as none at all<\/p><\/blockquote>\n<p>Hang on, you may be asking, what&#8217;s that got to do with long term retention?<\/p>\n<p>Replace&nbsp;&#8216;love&#8217; with &#8216;data&#8217; and you&#8217;ve got it.<\/p>\n<p><a href=\"https:\/\/nsrd.info\/blog\/2016\/11\/24\/my-cup-runneth-over\/green-drink-poured-into-a-glass\/\" rel=\"attachment wp-att-6037\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-6037\" src=\"https:\/\/nsrd.info\/blog\/wp-content\/uploads\/2016\/11\/bigStock-Cup.jpg\" alt=\"Glass\" width=\"601\" height=\"900\" srcset=\"https:\/\/nsrd.info\/blog\/wp-content\/uploads\/2016\/11\/bigStock-Cup.jpg 601w, https:\/\/nsrd.info\/blog\/wp-content\/uploads\/2016\/11\/bigStock-Cup-200x300.jpg 200w\" sizes=\"auto, (max-width: 601px) 100vw, 601px\" \/><\/a><\/p>\n<p>I&#8217;m a fan of&nbsp;the saying:<\/p>\n<blockquote><p>It&#8217;s always better to backup a bit too much than not&nbsp;quite enough.<\/p><\/blockquote>\n<p>In fact, it&#8217;s something I mention again in my book, <a href=\"https:\/\/www.crcpress.com\/Data-Protection-Ensuring-Data-Availability\/Guise\/p\/book\/9781482244151\" target=\"_blank\">Data Protection:&nbsp;Ensuring Data Availability<\/a>. Perhaps more than once. (I&#8217;ve&nbsp;mentioned my book before, right? If you like my blog or want to know more about data&nbsp;protection, you should buy the book. I highly recommend it&#8230;)<\/p>\n<p>That&#8217;s&nbsp;something that works quite&nbsp;succinctly for what I&#8217;d call&nbsp;<em>operational backups:&nbsp;<\/em>your short term retention policies. They&#8217;re going to be the backups where you&#8217;re keeping say, weekly fulls and daily incrementals for (typically) between 4-6 weeks for most businesses. 
For those sorts of backups, you&nbsp;definitely want to err on the side of&nbsp;caution when choosing what to backup.<\/p>\n<p>Now, that&#8217;s not to say you don&#8217;t err on the side of caution when you&#8217;re thinking about long term retention,&nbsp;<em>but<\/em>&nbsp;caution&nbsp;definitely becomes a double-edged sword: the&nbsp;caution&nbsp;of making sure you&#8217;re backing up what you are required to, but also the caution of making sure you&#8217;re not wasting money.<\/p>\n<p>Let&#8217;s start with a simpler example: do you backup your&nbsp;non-production systems? For a lot of environments, the answer is &#8216;yes&#8217; (and that&#8217;s good). So if the answer is &#8216;yes&#8217;, let me ask the follow-up: do you&nbsp;apply the same retention policies for your non-production backups as you do for your production backups? And if the answer to that&nbsp;is &#8216;yes&#8217;, then my final question is this:&nbsp;<em>why?<\/em> Specifically, are you doing it because it&#8217;s (a) habit, (b) what&nbsp;you inherited, or (c) because there&#8217;s a mandated and sensible reason for doing so? My guess is that in 90% of scenarios, the answer is (a) or (b), not (c). That&#8217;s OK, you&#8217;re in the same boat as&nbsp;the rest of the industry.<\/p>\n<p>Let&#8217;s say you have 10TB of production data, and 5TB of non-production data. Not worrying about deduplication for the moment, if you&#8217;re doing weekly fulls and&nbsp;daily incrementals, with a 3.5% daily change (because I want to hurt my brain with mathematics tonight &#8211; trust me, I still count on my fingers, and 3.5 on your fingers is hard) with a 5 week retention period then you&#8217;re generating:<\/p>\n<ul>\n<li>5 x (10+5) TB in full backups<\/li>\n<li>30 x ((10+5) x 0.035) TB in&nbsp;incremental backups<\/li>\n<\/ul>\n<p>That&#8217;s 75 TB (full) + 15.75 TB (incr) of backups generated for 15TB of data over a 5 week period. 
Yes, we&#8217;ll use&nbsp;deduplication because <a href=\"https:\/\/nsrd.info\/blog\/2017\/03\/13\/networker-usage-report-for-2016\/\" target=\"_blank\">it&#8217;s so popular with NetWorker<\/a> and shrink that number quite nicely&nbsp;thank-you, but 90.75 TB of&nbsp;logical backups over 5 weeks for 15TB of data is the number we end up with.<\/p>\n<p>But do you really need to generate that many backups? Do you&nbsp;<em>really<\/em> need to keep five weeks&#8217; worth of non-production backups? What if instead you&#8217;re generating:<\/p>\n<ul>\n<li>5 x 10 TB in full production backups<\/li>\n<li>2 x 5 TB in full non-prod&nbsp;backups<\/li>\n<li>30 x 10 x 0.035 TB in incremental&nbsp;production backups<\/li>\n<li>12 x 5 x 0.035 TB in&nbsp;incremental non-prod backups<\/li>\n<\/ul>\n<p>That becomes 50TB (full prod) + 10 TB (full non-prod) + 10.5 TB (incr prod) + 2.1 TB (incr non-prod)&nbsp;over any 5 week period, or 72.6 TB instead of 90.75 TB \u2013 a saving of 20%.<\/p>\n<p>(If you&#8217;re still&nbsp;pushing your short-term operational backups to tape,&nbsp;your skin is probably crawling at the above suggestion: &#8220;I&#8217;ll need more tape drives!&#8221; Well, yes you would, because tape is inflexible. So using backup to disk means you&nbsp;can start saving on media, because you don&#8217;t need to make sure you have enough tape drives for every potential pool that would be written to at any given time.)<\/p>\n<p>A 20% saving on operational backups for 15TB of data might not sound like a lot, but now let&#8217;s start thinking about long term&nbsp;retention (LTR).<\/p>\n<p>There are two particular ways we see long term retention data handled: monthlies&nbsp;kept for the entire LTR period, or keeping monthlies for 12-13 months and just keeping end-of-calendar-year (EoCY) + end-of-financial-year (EoFY) for the LTR period. 
I&#8217;d&nbsp;<em>suggest<\/em> that the&nbsp;<em>knee-jerk<\/em> reaction by many businesses is&nbsp;to keep monthlies for the entire time.&nbsp;That doesn&#8217;t necessarily&nbsp;have to be the case though \u2013 and this is the sort of thing that should also be investigated: do you&nbsp;<em>legally<\/em> need to keep all&nbsp;your monthly backups for your LTR, or do you&nbsp;just need to keep those EoCY and EoFY backups for that period? That alone might be a huge saving.<\/p>\n<p>Let&#8217;s assume though that you&#8217;re keeping those monthly backups for your entire&nbsp;LTR period. We&#8217;ll assume you&#8217;re also not in engineering, where you need to keep records for the lifetime of the product, or biosciences, where you need to keep records for the lifetime of the patient (and longer), and just stick with the tried-and-trusted 7 year&nbsp;retention period seen almost everywhere.<\/p>\n<p>For LTR, we also&nbsp;have to consider yearly growth. I&#8217;m going to cheat and assume 10% year on year growth, but the growth only kicks in once a year. (In reality for many businesses it&#8217;s more like a true compound annual growth, amortized monthly, which does change things around a bit.)<\/p>\n<p>So let&#8217;s go back to&nbsp;those numbers. We&#8217;ve already&nbsp;established what we need for&nbsp;operational backups, but what do we need for LTR?<\/p>\n<p>If we&#8217;re&nbsp;not differentiating between prod and non-prod (and believe me, that&#8217;s common for&nbsp;LTR), then our numbers look like this:<\/p>\n<ul>\n<li>Year 1: 12 x 15 TB<\/li>\n<li>Year 2: 12 x 16.5 TB<\/li>\n<li>Year 3: 12 x 18.15 TB<\/li>\n<li>Year 4: 12 x&nbsp;19.965 TB<\/li>\n<li>Year 5: 12 x&nbsp;21.9615 TB<\/li>\n<li>Year 6: 12 x&nbsp;24.15765 TB<\/li>\n<li>Year 7: 12 x&nbsp;26.573415 TB<\/li>\n<\/ul>\n<p>Total? 1,707.69 TB of LTR for a 7 year period. 
(And&nbsp;even as data ages out, that will still grow as the YoY growth continues.)<\/p>\n<p>But again,&nbsp;<em>do you need to keep non-prod backups for LTR<\/em>? What if we didn&#8217;t \u2013 what would those numbers look like?<\/p>\n<ul>\n<li>Year 1: 12 x 10 TB<\/li>\n<li>Year 2: 12 x 11 TB<\/li>\n<li>Year 3: 12 x 12.1&nbsp;TB<\/li>\n<li>Year 4: 12 x 13.31&nbsp;TB<\/li>\n<li>Year 5: 12 x&nbsp;14.641 TB<\/li>\n<li>Year 6: 12 x 16.1051 TB<\/li>\n<li>Year 7: 12 x&nbsp;17.71561 TB<\/li>\n<\/ul>\n<p>That comes down to just 1,138.46 TB over 7 years \u2013 a 33% saving in LTR storage.<\/p>\n<p>We got that saving just by splitting off non-production&nbsp;data from production data for our retention policies. What if&nbsp;we were to do&nbsp;more? Do you really need to keep&nbsp;<em>all<\/em> of your production data for&nbsp;an entire&nbsp;7-year LTR period? If&nbsp;we&#8217;re talking about a typical organisation looking at 7 year retention periods, we&#8217;re usually only talking about&nbsp;critical systems that face compliance requirements \u2013 maybe some&nbsp;financial databases, one section of a fileserver, and email. What if that was just 1 TB of the production data? (I&#8217;d suggest that for many companies,&nbsp;a&nbsp;guesstimate of 10% of production data being the data required \u2013 legally required \u2013 for compliance&nbsp;retention is pretty accurate.)<\/p>\n<p>Well then your LTR data requirements would be just 113.85 TB over 7 years, and that&#8217;s a saving of <span style=\"text-decoration: underline;\"><strong>93%<\/strong><\/span> of LTR&nbsp;storage requirements (pre-deduplication) over a 7 year period for an initial 15 TB of data.<\/p>\n<p>I&#8217;m all for backing up a little bit too much rather than not enough, but once we start looking at LTR, we have&nbsp;to take that adage with a grain of salt. 
(I&#8217;ll suggest that in my experience, it&#8217;s&nbsp;something that&nbsp;locks a lot of companies into&nbsp;using tape for LTR.)<\/p>\n<blockquote><p>Too much data will kill you,<\/p>\n<p>Just as sure as none at all<\/p><\/blockquote>\n<p><em>That&#8217;s<\/em> the lesson we get from Queen for LTR.<\/p>\n<p>&#8230;Now if you&#8217;ll&nbsp;excuse me,&nbsp;now I&#8217;ve talked a bit about Queen, I need to go and listen to their greatest song of all time, <a href=\"https:\/\/www.youtube.com\/watch?v=SoBMhx_ap_g\" target=\"_blank\">March of the Black Queen<\/a>.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>It&#8217;s fair to say I&#8217;m a big fan of Queen.&nbsp;They shaped my life \u2013 the only band to have&nbsp;even a&hellip;<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":true,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2}},"categories":[3,5],"tags":[240,1350,1348,1349],"class_list":["post-6186","post","type-post","status-publish","format-standard","hentry","category-architecture","category-backup-theory","tag-compliance","tag-elo","tag-long-term-retention","tag-queen"],"aioseo_notices":[],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_shortlink":"https:\/\/wp.me\/pKpIN-1BM","jetpack_sharing_enabled":true,"jetpack_likes_enabled":true,"_links":{"self":[{"href":"https:\/\/nsrd.info\/blog\/wp-json\/wp\/v2\/posts\/6186","targetHints":{"allow":["GET"]}}],"colle
ction":[{"href":"https:\/\/nsrd.info\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/nsrd.info\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/nsrd.info\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/nsrd.info\/blog\/wp-json\/wp\/v2\/comments?post=6186"}],"version-history":[{"count":9,"href":"https:\/\/nsrd.info\/blog\/wp-json\/wp\/v2\/posts\/6186\/revisions"}],"predecessor-version":[{"id":7394,"href":"https:\/\/nsrd.info\/blog\/wp-json\/wp\/v2\/posts\/6186\/revisions\/7394"}],"wp:attachment":[{"href":"https:\/\/nsrd.info\/blog\/wp-json\/wp\/v2\/media?parent=6186"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/nsrd.info\/blog\/wp-json\/wp\/v2\/categories?post=6186"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/nsrd.info\/blog\/wp-json\/wp\/v2\/tags?post=6186"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}