{"id":4882,"date":"2013-07-25T15:50:09","date_gmt":"2013-07-25T05:50:09","guid":{"rendered":"http:\/\/nsrd.info\/blog\/?p=4882"},"modified":"2018-12-11T14:10:49","modified_gmt":"2018-12-11T04:10:49","slug":"data-protection-lessons-from-a-tomato","status":"publish","type":"post","link":"https:\/\/nsrd.info\/blog\/2013\/07\/25\/data-protection-lessons-from-a-tomato\/","title":{"rendered":"Data protection lessons from a tomato"},"content":{"rendered":"<p><a href=\"https:\/\/nsrd.info\/blog\/wp-content\/uploads\/2013\/07\/tomato.jpg\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-4902\" alt=\"Data protection lessons from a tomato\" src=\"https:\/\/nsrd.info\/blog\/wp-content\/uploads\/2013\/07\/tomato.jpg\" width=\"350\" height=\"357\" srcset=\"https:\/\/nsrd.info\/blog\/wp-content\/uploads\/2013\/07\/tomato.jpg 350w, https:\/\/nsrd.info\/blog\/wp-content\/uploads\/2013\/07\/tomato-294x300.jpg 294w, https:\/\/nsrd.info\/blog\/wp-content\/uploads\/2013\/07\/tomato-24x24.jpg 24w, https:\/\/nsrd.info\/blog\/wp-content\/uploads\/2013\/07\/tomato-36x36.jpg 36w, https:\/\/nsrd.info\/blog\/wp-content\/uploads\/2013\/07\/tomato-48x48.jpg 48w, https:\/\/nsrd.info\/blog\/wp-content\/uploads\/2013\/07\/tomato-64x64.jpg 64w\" sizes=\"auto, (max-width: 350px) 100vw, 350px\" \/><\/a><\/p>\n<p>Data protection lessons from a tomato? 
Have I gone mad?<\/p>\n<p>Bear with me.<\/p>\n<p><a href=\"https:\/\/nsrd.info\/blog\/wp-content\/uploads\/2013\/07\/DIKW.jpg\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-4883\" alt=\"DIKW Model\" src=\"https:\/\/nsrd.info\/blog\/wp-content\/uploads\/2013\/07\/DIKW.jpg\" width=\"526\" height=\"424\" srcset=\"https:\/\/nsrd.info\/blog\/wp-content\/uploads\/2013\/07\/DIKW.jpg 526w, https:\/\/nsrd.info\/blog\/wp-content\/uploads\/2013\/07\/DIKW-300x241.jpg 300w\" sizes=\"auto, (max-width: 526px) 100vw, 526px\" \/><\/a>If you&#8217;ve done any ITIL training, the above diagram will look familiar to you. Rather unimaginatively, it&#8217;s called the DIKW model:<\/p>\n<blockquote><p>Data &gt; Information &gt; Knowledge &gt; Wisdom<\/p><\/blockquote>\n<p>A simple, practical example of what this diagram\/model means is the following:<\/p>\n<ul>\n<li><strong>Data<\/strong> \u2013 Something is red, and round.<\/li>\n<li><strong>Information<\/strong> \u2013 It\u2019s a tomato.<\/li>\n<li><strong>Knowledge<\/strong> \u2013 Tomato is a fruit.<\/li>\n<li><strong>Wisdom<\/strong> \u2013 You don\u2019t put tomato in a fruit salad.<\/li>\n<\/ul>\n<p>That&#8217;s about as complex as DIKW gets. However, because it&#8217;s such a simple concept, it can be applied in quite a few areas.<\/p>\n<p>When it comes to data protection, its purpose is obvious:&nbsp;the criticality of the data to business&nbsp;<em>wisdom<\/em> will have a direct impact on the level of protection you need to apply to it.<\/p>\n<p>In this case, I&#8217;m expanding the definition of&nbsp;<em>wisdom<\/em> a little. 
According to my Apple dashboard dictionary, wisdom is:<\/p>\n<blockquote><p>the quality of having experience, knowledge, and good judgement; the quality of being wise<\/p><\/blockquote>\n<p>Further, we can talk about wisdom in terms of accumulated experience:<\/p>\n<blockquote><p>the body of knowledge and experience that develops within a specified society or period.<\/p><\/blockquote>\n<p>So&nbsp;<em>corporate wisdom<\/em> is about having the experience and knowledge required to act with good judgement, and represents the sum of the knowledge and experience a corporation has built up over time.<\/p>\n<p>If you think about wisdom in terms of&nbsp;<em>corporate wisdom<\/em>, then you&#8217;ll understand my point. For instance, a key database for a company \u2013 or the email system \u2013 represents a tangible chunk of corporate wisdom. Core fileservers will also be pretty far up the scale. It&#8217;s unlikely, on the other hand (in a business with appropriate storage policies) that the files on a regular end-user&#8217;s desktop or laptop will go much beyond&nbsp;<em>information<\/em> in the DIKW scale.<\/p>\n<p>Of course, there are always exceptions. I&#8217;ll get to that in a moment.<\/p>\n<p>What this comes back to pretty quickly is the need for&nbsp;<em>Information Lifecycle Protection<\/em>. End users and the business overall are typically not interested in data \u2013 they&#8217;re interested in information. They don&#8217;t care, as such, about the backup of \/u01\/app\/oracle\/data\/CORPTAX\/data01.dbf \u2013 they care about the corporate tax database. That, of course, means that the IT group and the business need to build service level agreements around&nbsp;<em>business functions<\/em>, not servers and storage. 
As ITIL teaches, the agreements about networks, storage, servers, etc., come in the form of&nbsp;<em>operational level agreements<\/em> between the segments of IT.<\/p>\n<p>Ironically, years before studying ITIL, it&#8217;s something I covered in my <strong><a title=\"Enterprise Systems Backup and Recovery: A corporate insurance policy\" href=\"http:\/\/www.enterprisesystemsbackup.com\" target=\"_blank\">book<\/a><\/strong> in the notion of establishing system dependency maps:<\/p>\n<p><a href=\"https:\/\/nsrd.info\/blog\/wp-content\/uploads\/2013\/07\/System-Maps.jpg\"><img loading=\"lazy\" decoding=\"async\" alt=\"System Maps\" src=\"https:\/\/nsrd.info\/blog\/wp-content\/uploads\/2013\/07\/System-Maps.jpg\" width=\"690\" height=\"762\"><\/a><\/p>\n<p>(In the diagram, the number in parentheses beside a server or function is its reference number; D:X means that it&nbsp;<em>depends<\/em>&nbsp;on the nominated referenced server\/function X.)<\/p>\n<p>What all this boils down to is the criticality of one particular activity when preparing an Information Lifecycle Protection system within an organisation:&nbsp;Data classification. (That of course is where you should catch any of those exceptions I was talking about before.)<\/p>\n<p>In order to properly back something up with the appropriate level of protection and urgency, you need to know <em>what<\/em> it is.<\/p>\n<p>Or, as Stephen Manley <strong><a title=\"Pants tweet\" href=\"https:\/\/twitter.com\/makitadremel\/status\/358672899382591488 \" target=\"_blank\">said the other day<\/a><\/strong>:<\/p>\n<blockquote><p>OH at Starbucks &#8211; 3 page essay ending with &#8216;I now have 5 pairs of pants, not 2. That&#8217;s 3 more.&#8217; Some data may not need to be protected.<\/p><\/blockquote>\n<p><em>Some data may not need to be protected<\/em>. Couldn&#8217;t have said it better myself. 
Of course, I do&nbsp;<em>also<\/em> say that it&#8217;s better to back up a little bit too much data than not enough, but that&#8217;s&nbsp;<em>not<\/em> something you should see as carte blanche to just back up everything in your environment at all times, regardless of what it is.<\/p>\n<p>The thing about data classification is that most companies do it without first&nbsp;<em>finding<\/em> all their data. The first step, and possibly the hardest, is becoming aware of the <strong><a title=\"Data Awareness Distribution in the Enterprise\" href=\"https:\/\/nsrd.info\/blog\/2012\/01\/27\/data-awareness-distribution-in-the-enterprise\/\" target=\"_blank\">data distribution<\/a><\/strong> within the enterprise. If you want to skip reading the post linked to in the previous sentence, here&#8217;s the key information from it:<\/p>\n<ul>\n<li>Data \u2013 This is both the core data managed and protected by IT, and all other data throughout the enterprise which is:\n<ul>\n<li><em>Known about<\/em> \u2013 The business is aware of it;<\/li>\n<li><em>Managed<\/em> \u2013 This data falls under the purview of a team in terms of storage administration (ILM);<\/li>\n<li><em>Protected<\/em> \u2013 This data falls under the purview of a team in terms of backup and recovery (ILP).<\/li>\n<\/ul>\n<\/li>\n<li>Dark Data \u2013 To quote [a] previous article, \u201call those bits and pieces of data you\u2019ve got floating around in your environment that <em>aren\u2019t<\/em> fully accounted for\u201d.<\/li>\n<li>Grey Data \u2013 Grey data is previously discovered <em>dark data<\/em> for which no decision has been made as yet in relation to its management or protection. 
That is, it\u2019s now <em>known<\/em> about, but has not been assigned any policy or tier in either ILM or ILP.<\/li>\n<li>Utility Data \u2013 This is data which is subsequently classified out of <em>grey data<\/em> state into a state where the data is known to have value, but is neither managed nor protected, because it can be <em>recreated<\/em>. It could be that the decision is made that the cost (in time) of recreating the data is lower than the cost (both in literal dollars and in staff-activity time) of managing and protecting it.<\/li>\n<li>Noise \u2013 This isn\u2019t really data at all, but rather all the \u201cbits\u201d (no pun intended) that are left which are neither <em>grey data<\/em>, <em>data<\/em> nor <em>utility data<\/em>. In essence, this is irrelevant data, which someone or some group may be keeping for unnecessary reasons, and in actual fact should be considered eligible for either <em>deletion<\/em> or <em>archival and deletion<\/em>.<\/li>\n<\/ul>\n<p>Once you&#8217;ve <em>found<\/em> your data, you can classify it. What&#8217;s structured and unstructured? What&#8217;s the criticality of the data? 
(I.e., what level of business&nbsp;<em>wisdom<\/em>&nbsp;does it relate to?)<\/p>\n<p>But even then, you&#8217;re not quite ready to determine what your information lifecycle protection policy will be for the data \u2013 well, not until you have a&nbsp;<em>data lifecycle policy<\/em>, which at its simplest, looks something like this:<\/p>\n<p><a href=\"https:\/\/nsrd.info\/blog\/wp-content\/uploads\/2011\/01\/Data-Lifecycle.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-2661\" alt=\"Data Lifecycle\" src=\"https:\/\/nsrd.info\/blog\/wp-content\/uploads\/2011\/01\/Data-Lifecycle.png\" width=\"439\" height=\"348\" srcset=\"https:\/\/nsrd.info\/blog\/wp-content\/uploads\/2011\/01\/Data-Lifecycle.png 439w, https:\/\/nsrd.info\/blog\/wp-content\/uploads\/2011\/01\/Data-Lifecycle-300x237.png 300w, https:\/\/nsrd.info\/blog\/wp-content\/uploads\/2011\/01\/Data-Lifecycle-378x300.png 378w\" sizes=\"auto, (max-width: 439px) 100vw, 439px\" \/><\/a><\/p>\n<p>&nbsp;<\/p>\n<p>Of course, there&#8217;s a lot of time and a lot of decisions bunched up in that diagram, but the lifecycle of data within an organisation is actually that simple at the conceptual level. Or rather, it <em>should<\/em> be. 
If you want to read more about data lifecycle, <strong><a title=\"A basic data lifecycle\" href=\"https:\/\/nsrd.info\/blog\/2011\/01\/04\/a-basic-data-lifecycle\/\" target=\"_blank\">click here for the intro piece<\/a><\/strong> \u2013 there are several accompanying pieces listed at the bottom of the article.<\/p>\n<p>When considered from a&nbsp;<em>backup<\/em> perspective, though, the end goal of a data lifecycle policy is simple:<\/p>\n<blockquote><p>Back up only that which needs to be backed up.<\/p><\/blockquote>\n<p>If data can be deleted, delete it.<\/p>\n<p>If data can be archived, archive it.<\/p>\n<p>The logical implication of course is \u2013 if you <em>can&#8217;t<\/em> classify it, if you <em>can&#8217;t<\/em> determine its criticality, then the core backup mantra, <em>&#8220;always better to back up a little bit more than not enough&#8221;<\/em> takes precedence, and you should be working out how to back it up.&nbsp;Obviously, as a fallback rule, it works, but it&#8217;s best to design your overall environment and data policies to avoid it.<\/p>\n<p>So to summarise:<\/p>\n<ol>\n<li><span style=\"line-height: 13px;\">Following the DIKW model, the closer data is to representing corporate wisdom, the more critical its information lifecycle protection requirements will be.<\/span><\/li>\n<li>In order to determine that criticality you first have to&nbsp;<em>find<\/em> the data within your environment.<\/li>\n<li>Once you&#8217;ve found the data in your environment, you have to&nbsp;<em>classify<\/em> it.<\/li>\n<li>Once you&#8217;ve classified it, you can build a data lifecycle policy for it.<\/li>\n<li>And&nbsp;<em>then<\/em> you can configure the appropriate information lifecycle protection for it.<\/li>\n<\/ol>\n<p>If you think back to EMC&#8217;s work towards mitigating the effects of accidental architectures, you&#8217;ll see where I was coming from in talking about the <strong><a title=\"Of accidental architectures\" 
href=\"https:\/\/nsrd.info\/blog\/2013\/07\/20\/of-accidental-architectures\/\" target=\"_blank\">importance of procedural change<\/a><\/strong> to arrest further accidental architectures. It&#8217;s a classic ER technique \u2013 identify, triage and heal.<\/p>\n<p>And we can learn all this from a tomato, sliced and salted with the DIKW model.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Data protection lessons from a tomato? Have I gone mad? Bear with me. If you&#8217;ve done any ITIL training, the&hellip;<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":true,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2}},"categories":[3,5],"tags":[1151,278,455,1152,474,475],"class_list":["post-4882","post","type-post","status-publish","format-standard","hentry","category-architecture","category-backup-theory","tag-data-classification","tag-data-lifecycle","tag-ilp","tag-information-lifecycle","tag-information-lifecycle-management","tag-information-lifecycle-protection"],"aioseo_notices":[],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_shortlink":"https:\/\/wp.me\/pKpIN-1gK","jetpack_sharing_enabled":true,"jetpack_likes_enabled":true,"_links":{"self":[{"href":"https:\/\/nsrd.info\/blog\/wp-json\/wp\/v2\/posts\/4882","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/nsrd.info\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/nsrd.info\/blog\/wp-json\/wp
\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/nsrd.info\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/nsrd.info\/blog\/wp-json\/wp\/v2\/comments?post=4882"}],"version-history":[{"count":21,"href":"https:\/\/nsrd.info\/blog\/wp-json\/wp\/v2\/posts\/4882\/revisions"}],"predecessor-version":[{"id":7463,"href":"https:\/\/nsrd.info\/blog\/wp-json\/wp\/v2\/posts\/4882\/revisions\/7463"}],"wp:attachment":[{"href":"https:\/\/nsrd.info\/blog\/wp-json\/wp\/v2\/media?parent=4882"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/nsrd.info\/blog\/wp-json\/wp\/v2\/categories?post=4882"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/nsrd.info\/blog\/wp-json\/wp\/v2\/tags?post=4882"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}