{"id":2778,"date":"2011-02-11T16:31:23","date_gmt":"2011-02-11T06:31:23","guid":{"rendered":"http:\/\/nsrd.info\/blog\/?p=2778"},"modified":"2011-02-11T16:31:23","modified_gmt":"2011-02-11T06:31:23","slug":"pumping-data","status":"publish","type":"post","link":"https:\/\/nsrd.info\/blog\/2011\/02\/11\/pumping-data\/","title":{"rendered":"Pumping data"},"content":{"rendered":"<p><a href=\"https:\/\/nsrd.info\/blog\/wp-content\/uploads\/2011\/02\/pumping-data1.jpg\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-2822\" title=\"Pumping data\" src=\"https:\/\/nsrd.info\/blog\/wp-content\/uploads\/2011\/02\/pumping-data1.jpg\" alt=\"Pumping data\" width=\"600\" height=\"378\" srcset=\"https:\/\/nsrd.info\/blog\/wp-content\/uploads\/2011\/02\/pumping-data1.jpg 600w, https:\/\/nsrd.info\/blog\/wp-content\/uploads\/2011\/02\/pumping-data1-300x189.jpg 300w, https:\/\/nsrd.info\/blog\/wp-content\/uploads\/2011\/02\/pumping-data1-476x300.jpg 476w\" sizes=\"auto, (max-width: 600px) 100vw, 600px\" \/><\/a><\/p>\n<p>The age-old consideration in backup is the most simple one: how to pump the required data through in the required time frame in such a way that it can be readily recovered. This challenges us to constantly find the best way to achieve the data throughput required. What worked 10 years ago was not always applicable 5 years ago; what worked 5 years ago is not always applicable now. Consider for instance the adage:<\/p>\n<blockquote><p>Never underestimate the bandwidth of a station wagon full of tapes hurtling down the highway.<\/p><\/blockquote>\n<p>(Andrew Tanenbaum, 1996.)<\/p>\n<p>What surprises me, to a degree, is that still, in 2011, we&#8217;re having discussions about data throughput where people focus <em>on the wrong thing<\/em>. 
I would humbly suggest that you shouldn&#8217;t give a flying fracas about how fast you can back your data up when compared to <em>how fast you can recover it<\/em>.<\/p>\n<p>That&#8217;s right: when talking feeds and speeds, the only one to give a damn about in backup is how quickly you can recover the data once it&#8217;s been captured.<\/p>\n<p>This is, in fact, why the terms RPO and RTO were invented. For the topic of &#8220;pumping data&#8221; in particular, RTO \u2013 Recovery Time Objective \u2013 is the most important. How quickly do you <em>need<\/em> to get the data back?<\/p>\n<p>In this scenario, Andrew Tanenbaum&#8217;s caution about a station wagon full of tapes hurtling down the highway is entirely appropriate. In fact, so much so that when companies start talking about how fast they need to <em>back up<\/em> (or how fast they <em>can<\/em> back up) without reference to recovery, I unfortunately go into this loop:<\/p>\n<p><iframe loading=\"lazy\" title=\"YouTube video player\" width=\"480\" height=\"390\" src=\"http:\/\/www.youtube.com\/embed\/MA5Pjw_cZn0\" frameborder=\"0\" allowfullscreen><\/iframe><\/p>\n<p>Why? Because it&#8217;s like when my grandmother wants to tell me a story about how she bumped into someone she hadn&#8217;t seen for 57 years in the supermarket, but gets stuck on an irrelevant detail. &#8220;Peaches or pears!&#8221; I used to say to her as a kid, perhaps a little disrespectfully \u2013 it didn&#8217;t matter whether she was out shopping for peaches or pears before the important thing happened! 
Same here \u2013 it doesn&#8217;t matter how fast you can pump data <em>into<\/em> the backup system \u2013 it&#8217;s how fast you can pump data <em>out<\/em> of it that is the only number worth focusing on.<\/p>\n<p>We have to, as storage industry insiders, experts, advisors, consultants \u2013\u00a0whatever we want to call ourselves \u2013\u00a0keep vendors and customers focused on the metric that really matters: how fast they can recover. We have a duty of care to stand between the FUD and the hype and steer companies on a safe trajectory. The safe trajectory in this case is talking about recovery speeds rather than backup speeds.<\/p>\n<p>This is why I rarely get excited about remote office backup strategies. For instance, a current meme in remote office backup strategy is the use of deduplication \u2013 most likely source based. The goal? Reduce the amount of data you have to transfer from the remote office to the head office to a small trickle, and all your problems are solved &#8230; until, of course, you need to recover that data.<\/p>\n<p>Don&#8217;t get me wrong, I&#8217;m not <em>against<\/em> remote office backups \u2013 I&#8217;m also not against <em>centralised<\/em> remote office backups, regardless of whether they&#8217;re achieved by deduplication, compression, magic pixies or faerie dust. In this example, though, there&#8217;s a simple fact: to talk about remote office backup without discussing remote office recovery is reprehensible.<\/p>\n<p>Yes, reprehensible. I&#8217;ll use that term. It&#8217;s not a nice term, I know, but nor is the practice of ignoring the elephant in the room \u2013 recovery.<\/p>\n<p>Look folks, do you really want me to prance around a stage doing the monkey dance shouting &#8220;Recovery! Recovery! Recovery!&#8221;? Is that what it has to take? Because, if it is, I&#8217;ll do it. (I might, if you don&#8217;t mind, try to avoid the flop sweat though.)<\/p>\n<p>What am I asking for? 
Maybe it&#8217;s this simple thought:<\/p>\n<blockquote><p>Starting this year, let no company (vendor or otherwise) talk about a product&#8217;s backup performance without citing real world recovery scenarios and performance in those scenarios.<\/p><\/blockquote>\n<p>There is not a guaranteed 1:1 mapping between backup and recovery performance, and to imply there is, either by obfuscation or omission, is disrespectful to the data protection industry.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>The age-old consideration in backup is the most simple one: how to pump the required data through in the required&hellip;<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":false,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2}},"categories":[13,16,19],"tags":[138,732,734,1252],"class_list":["post-2778","post","type-post","status-publish","format-standard","hentry","category-general-thoughts","category-networker","category-recovery","tag-backup","tag-performance","tag-performance-tuning","tag-recovery"],"aioseo_notices":[],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_shortlink":"https:\/\/wp.me\/pKpIN-IO","jetpack_sharing_enabled":true,"jetpack_likes_enabled":true,"_links":{"self":[{"href":"https:\/\/nsrd.info\/blog\/wp-json\/wp\/v2\/posts\/2778","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/nsrd.info\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:
\/\/nsrd.info\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/nsrd.info\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/nsrd.info\/blog\/wp-json\/wp\/v2\/comments?post=2778"}],"version-history":[{"count":0,"href":"https:\/\/nsrd.info\/blog\/wp-json\/wp\/v2\/posts\/2778\/revisions"}],"wp:attachment":[{"href":"https:\/\/nsrd.info\/blog\/wp-json\/wp\/v2\/media?parent=2778"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/nsrd.info\/blog\/wp-json\/wp\/v2\/categories?post=2778"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/nsrd.info\/blog\/wp-json\/wp\/v2\/tags?post=2778"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}