{"id":3534,"date":"2012-01-27T16:40:00","date_gmt":"2012-01-27T06:40:00","guid":{"rendered":"http:\/\/nsrd.info\/blog\/?p=3534"},"modified":"2018-12-11T14:40:30","modified_gmt":"2018-12-11T04:40:30","slug":"data-awareness-distribution-in-the-enterprise","status":"publish","type":"post","link":"https:\/\/nsrd.info\/blog\/2012\/01\/27\/data-awareness-distribution-in-the-enterprise\/","title":{"rendered":"Data Awareness Distribution in the Enterprise"},"content":{"rendered":"<p>Continuing on from my post relating to <em><a title=\"Dark Data\" href=\"https:\/\/nsrd.info\/blog\/2012\/01\/22\/dark-data\/\" target=\"_blank\">dark data<\/a><\/em>&nbsp;last week, I want to spend a little more time on data awareness classification and distribution within an enterprise environment.<\/p>\n<p>Dark data isn&#8217;t the end of the story, and it&#8217;s time to introduce the entire family of data-awareness concepts. These are:<\/p>\n<ul>\n<li><strong>Data<\/strong> \u2013 This is both the core data managed and protected by IT, and all other data throughout the enterprise which is:<\/li>\n<ul>\n<li><em>Known about<\/em> \u2013 The business is aware of it;<\/li>\n<li><em>Managed<\/em> \u2013 This data falls under the purview of a team in terms of storage administration (ILM);<\/li>\n<li><em>Protected<\/em> \u2013 This data falls under the purview of a team in terms of backup and recovery (ILP).<\/li>\n<\/ul>\n<li><strong>Dark Data<\/strong> \u2013 To quote the previous article, &#8220;all those bits and pieces of data you\u2019ve got floating around in your environment that&nbsp;<em>aren\u2019t<\/em>&nbsp;fully accounted for&#8221;.<\/li>\n<li><strong>Grey Data<\/strong> \u2013 Grey data is previously discovered <em>dark data<\/em>&nbsp;for which no decision has yet been made about its management or protection. 
That is, it&#8217;s now <em>known<\/em>&nbsp;about, but has not been assigned any policy or tier in either ILM or ILP.<\/li>\n<li><strong>Utility Data<\/strong> \u2013 This is data which has been reclassified out of the <em>grey data<\/em>&nbsp;state: it is known to have value, but is neither managed nor protected, because it can be <em>recreated<\/em>. The decision may be that the cost (in time) of recreating the data is lower than the cost (both in literal dollars and in staff-activity time) of managing and protecting it.<\/li>\n<li><strong>Noise<\/strong> \u2013 This isn&#8217;t really data at all, but is all the &#8220;bits&#8221; (no pun intended) that are left which are not <em>grey data<\/em>, <em>data<\/em>&nbsp;or <em>utility data<\/em>. In essence, this is irrelevant data, which someone or some group may be keeping for unnecessary reasons, and which should in fact be considered eligible for either <em>deletion<\/em> or <em>archival and deletion<\/em>.<\/li>\n<\/ul>\n<p>The distribution of data by awareness within the enterprise may resemble something along the following lines:<\/p>\n<p><a href=\"https:\/\/nsrd.info\/blog\/wp-content\/uploads\/2012\/01\/Data-Awareness-Percentage-Scenario.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-3536\" title=\"Data Awareness Percentage Distribution\" src=\"https:\/\/nsrd.info\/blog\/wp-content\/uploads\/2012\/01\/Data-Awareness-Percentage-Scenario.png\" alt=\"Data Awareness Percentage Distribution\" width=\"259\" height=\"299\"><\/a><\/p>\n<p>That is, ideally the largest percentage of data <em>should<\/em>&nbsp;be regular data which is <em>known<\/em>, <em>managed<\/em> and <em>protected<\/em>. In all likelihood, for most organisations the next biggest percentage of data is going to be <em>dark data<\/em>&nbsp;\u2013 the data that hasn&#8217;t been discovered yet. 
Ideally, however, after <em>regular<\/em> and <em>dark<\/em>&nbsp;data have been removed from the distribution, there should be at most 20% of data left, and this should be broken up such that at least half of that remaining data is <em>utility<\/em>&nbsp;data, with the remainder (at most 10% of all data) split evenly between grey data and noise.<\/p>\n<p>The logical implications of this layout should be reasonably straightforward:<\/p>\n<ol>\n<li>At all times the majority of data within an organisation should be <em>known<\/em>, <em>managed<\/em>&nbsp;and <em>protected<\/em>.<\/li>\n<li>It should be expected that at least 20% of the data within an organisation is undiscovered, or decentralised.<\/li>\n<li>Once data is discovered, it should exist in a &#8216;grey&#8217; state for a very short period of time; ideally it should be reclassified as soon as possible into <em>data<\/em>, <em>utility data<\/em>&nbsp;or <em>noise<\/em>. In particular, data left in a grey state for an extended period of time represents just as dangerous a potential data loss situation as dark data.<\/li>\n<\/ol>\n<p>It should be noted that regular data, even in this awareness classification scheme, will still be subject to regular data lifecycle decisions (archive, tiering, deletion, etc.). In that sense, primary data eligible for deletion isn&#8217;t really noise, because it&#8217;s <em>previously<\/em>&nbsp;been managed and protected; <em>noise<\/em>&nbsp;really is former dark data that will end up being deleted, either as an explicit decision, or due to a failure at some future point after the decision to classify it as &#8216;noise&#8217;, having <em>never<\/em>&nbsp;been managed or protected in a centralised, coordinated manner.<\/p>\n<p>Equally, <em>utility<\/em>&nbsp;data won&#8217;t refer to, say, Q\/A or test databases that replicate the content of production databases. 
These types of databases will again have fallen under the standard <em>data<\/em>&nbsp;umbrella in that there will have been information lifecycle management and protection policies established for them, regardless of what those policies actually were.<\/p>\n<p>If we bring this back to roles, then it&#8217;s clear that a pivotal role of both the DPAs (Data Protection Advocates) and the IPAC (Information Protection Advisory Council) within an organisation should be the rapid coordination of classification of dark data as it is discovered into one of the <em>data<\/em>, <em>utility data<\/em>&nbsp;or <em>noise<\/em>&nbsp;states.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Continuing on from my post relating to dark data&nbsp;last week, I want to spend a little more time on data awareness classification&hellip;<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":false,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2}},"categories":[3,5,12,13],"tags":[270,271,342,412,496,639,939,1065],"class_list":["post-3534","post","type-post","status-publish","format-standard","hentry","category-architecture","category-backup-theory","category-general-technology","category-general-thoughts","tag-dark-data","tag-data","tag-dpa","tag-grey-data","tag-ipac","tag-noise","tag-storage","tag-utility-data"],"aioseo_notices":[],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_shortlink":"https:\/\/wp.me\/pK
pIN-V0","jetpack_sharing_enabled":true,"jetpack_likes_enabled":true,"_links":{"self":[{"href":"https:\/\/nsrd.info\/blog\/wp-json\/wp\/v2\/posts\/3534","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/nsrd.info\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/nsrd.info\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/nsrd.info\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/nsrd.info\/blog\/wp-json\/wp\/v2\/comments?post=3534"}],"version-history":[{"count":1,"href":"https:\/\/nsrd.info\/blog\/wp-json\/wp\/v2\/posts\/3534\/revisions"}],"predecessor-version":[{"id":7488,"href":"https:\/\/nsrd.info\/blog\/wp-json\/wp\/v2\/posts\/3534\/revisions\/7488"}],"wp:attachment":[{"href":"https:\/\/nsrd.info\/blog\/wp-json\/wp\/v2\/media?parent=3534"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/nsrd.info\/blog\/wp-json\/wp\/v2\/categories?post=3534"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/nsrd.info\/blog\/wp-json\/wp\/v2\/tags?post=3534"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}